Statistical Analysis of Protein Microarray Data

Microarray data analysis is one of the clearest examples of a productive interaction between bioinformatics and statistics. Protein microarrays are powerful tools for high-throughput studies of the human proteome; however, both systematic and non-systematic sources of bias can limit the interpretation and ultimate utility of the data. In a typical protein microarray experiment, the number of samples is often limited, while the number of features in the raw data can exceed 60,000. The case under consideration comprises 10 protein microarray datasets, one per patient: 5 patients diagnosed with Desmoplastic Small Round Cell Tumor (DSRCT) and 5 healthy controls, with each dataset containing 52,736 rows. To extract meaningful information from this high-dimensional data, several data pre-processing and statistical inference techniques will be employed. In particular, different methods for background correction and normalization will be evaluated for their ability to reduce technical noise and support a robust analysis. Non-parametric tests will then be applied for statistical inference, followed by p-value correction to account for multiple comparisons. The goal is to identify biomarkers that differ between diseased and healthy patients.
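The workflow outlined in the abstract (background correction, normalization, non-parametric testing, p-value correction) can be illustrated with a minimal sketch. The abstract does not say which specific methods the thesis uses, so every choice below is an assumption made for illustration only: background subtraction with a floor, quantile normalization, an exact two-sided Wilcoxon rank-sum (Mann-Whitney) test, and the Benjamini-Hochberg adjustment. All function names are hypothetical.

```python
from itertools import combinations
from math import log2


def background_correct(foreground, background, floor=1.0):
    """Subtract local background; clamp at `floor` (an assumed value) so log2 stays defined."""
    return [max(f - b, floor) for f, b in zip(foreground, background)]


def log2_transform(values):
    """Log2-transform strictly positive intensities."""
    return [log2(v) for v in values]


def quantile_normalize(arrays):
    """Map every array (one per sample) onto a shared reference distribution.

    Ties are broken by position rather than averaged, which is a simplification.
    """
    n_feat = len(arrays[0])
    sorted_arrays = [sorted(a) for a in arrays]
    # Reference: mean of the k-th smallest value across all arrays.
    ref = [sum(s[k] for s in sorted_arrays) / len(arrays) for k in range(n_feat)]
    out = []
    for a in arrays:
        order = sorted(range(n_feat), key=lambda i: a[i])
        normed = [0.0] * n_feat
        for rank, i in enumerate(order):
            normed[i] = ref[rank]
        out.append(normed)
    return out


def midranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def exact_rank_sum_p(group_a, group_b):
    """Two-sided exact rank-sum p-value, enumerating all group assignments."""
    ranks = midranks(list(group_a) + list(group_b))
    n_a, n = len(group_a), len(ranks)
    expected = n_a * (n + 1) / 2  # null expectation of group A's rank sum
    observed = abs(sum(ranks[:n_a]) - expected)
    sums = [sum(ranks[i] for i in idx) for idx in combinations(range(n), n_a)]
    extreme = sum(1 for s in sums if abs(s - expected) >= observed - 1e-9)
    return extreme / len(sums)


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value down
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

With 5 patients per group, the exact two-sided rank-sum test cannot produce a p-value below 2/252 (about 0.008), which is worth keeping in mind when a correction is applied across tens of thousands of features. Enumeration is feasible here only because there are just 252 possible group assignments.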

MISINO, CARLO

2023/2024

Record

Faculty/Department	Dipartimento di Scienze Statistiche
Degree programme	STATISTICA PER L'ECONOMIA E L'IMPRESA, first-cycle (bachelor's) degree (D.M. 270/2004)
Academic year	2023
English title	Statistical Analysis of Protein Microarray Data
			
Keywords	Microarray; Normalization; Background; DSRCT; Data analysis
Supervisor	CATTELAN, MANUELA
Appears in collections	Lauree triennali (bachelor's theses)

Files in this item:

File	Size	Format
Misino_Carlo.pdf (restricted access)	853.38 kB	Adobe PDF

The text of this website © Università degli studi di Padova. Full texts are published under a non-exclusive license. Metadata are released under a CC0 license.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.12608/77691