Co-expressed protein modules in single-cell proteomics: an exploratory method comparison

Single-cell proteomics (SCP) is the study of protein expression in individual cells. Over the past decade, SCP has gained importance relative to genome and transcriptome studies because of the functional role of proteins, whose expression levels can provide insight when cells are compared across conditions, cell types and other variables. However, SCP faces major challenges related both to the nature of the data and to the difficulty of acquiring them. One of the most critical is data sparsity, or missingness, aggravated by the limitations of currently available imputation techniques, which often introduce considerable bias. Traditional omics analysis usually employs data modelling to identify differentially expressed proteins across conditions, followed by a gene set analysis to give the results a biological interpretation. The main objective of this thesis is to explore and discuss the use of clustering, biclustering and community detection methods to identify modules of co-expressed proteins in cells of the same type (using two example datasets), as an alternative to univariate protein modelling. The main algorithms used are k-means, QUBIC and Leiden community detection. In addition, a statistical test based on a non-parametric null distribution of cluster silhouettes is implemented to identify the most relevant clusters. The relevant clusters reported by the different methods are then tested for enrichment with Fisher's over-representation test and compared to evaluate their biological significance.
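As an illustration of the over-representation step mentioned in the abstract, the following is a minimal sketch of a one-sided Fisher (hypergeometric) test for a single protein module against a single annotated gene set. It is not taken from the thesis: the function name `overrepresentation_pvalue`, its parameters and the toy counts are assumptions chosen for the example, and only the Python standard library is used.

```python
from math import comb


def overrepresentation_pvalue(universe_size, set_size, module_size, hits):
    """One-sided Fisher exact p-value for over-representation.

    Returns P(X >= hits), where X counts the members of an annotated set
    (set_size proteins) found in a module of module_size proteins drawn
    without replacement from a universe of universe_size proteins.
    All names here are hypothetical, chosen for illustration only.
    """
    max_hits = min(set_size, module_size)
    # Number of ways to pick a module of this size from the universe.
    total = comb(universe_size, module_size)
    # Sum the hypergeometric probabilities of the upper tail.
    tail = sum(
        comb(set_size, k) * comb(universe_size - set_size, module_size - k)
        for k in range(hits, max_hits + 1)
    )
    return tail / total


# Toy example: 20 quantified proteins, 5 annotated to a pathway,
# a module of 4 proteins containing 3 of the annotated ones.
p_value = overrepresentation_pvalue(universe_size=20, set_size=5,
                                    module_size=4, hits=3)
print(f"p = {p_value:.4f}")
```

In practice the test would be repeated for every module and gene set, with the resulting p-values corrected for multiple testing; that step is omitted here.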

MENNA, EMMA

2024/2025


Faculty/Department: Dipartimento di Scienze Statistiche
Degree programme: SCIENZE STATISTICHE Laurea Magistrale (D.M. 270/2004)
Academic year: 2024
English title: Co-expressed protein modules in single-cell proteomics: an exploratory method comparison
Keywords: Proteomics; Clustering methods; Community detection
Supervisor: RISSO, DAVIDE
Appears in collections: Master's theses

Files in this item:

File	Size	Format
Menna_Emma.pdf (open access)	3.82 MB	Adobe PDF

The text of this website is © Università degli studi di Padova. Full texts are published under a non-exclusive license. Metadata are released under a CC0 license.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.12608/84087