Transcriptomic Neural Networks Architecture and Applications to Functional and Aging Research
PINAROLI, ANDREA
2024/2025
Abstract
Foundation models have become key to Large Language Model (LLM) architectures, leveraging the vast corpus of text available on the internet. Advances in transcriptomic foundation models (TFMs) and exponentially increasing data availability are driving the same trend in biology. Here the authors describe scFoundation, the largest TFM in the literature, pretrained on 50 million single-cell transcriptomic profiles and totalling 100 million parameters. A transformer-like asymmetric encoder-decoder architecture was trained on a read-depth-aware (RDA) de-masking task. The model has been applied to several downstream tasks, showing that its improved generalization yields better performance across gene, cell, and cell-line domains. State-of-the-art performance was shown for read-depth enhancement, drug response prediction, cell type annotation, gene perturbation response prediction, and gene module and gene regulatory network (GRN) inference.
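The read-depth-aware (RDA) de-masking objective can be made concrete with a short sketch: a cell's raw count vector is optionally downsampled to simulate a lower read depth, two total-count indicators (the input depth S and the target depth T) are attached, and a random subset of genes is masked for the model to recover. The function below is a minimal illustration of this setup rather than the published implementation; the name `rda_training_pair`, the binomial thinning, and the `-1.0` mask sentinel are assumptions made for this sketch.

```python
import numpy as np

def rda_training_pair(raw_counts, mask_frac=0.3, downsample_p=None, rng=None):
    """Build one read-depth-aware (RDA) de-masking training example.

    Hedged sketch of the RDA task described in the abstract; the exact
    masking scheme and downsampling recipe of scFoundation may differ.
    """
    rng = rng or np.random.default_rng()
    raw_counts = np.asarray(raw_counts)

    # Optionally simulate a lower read depth by binomial thinning of the
    # raw counts (an assumed, common way to model reduced sequencing depth).
    if downsample_p is not None and downsample_p < 1.0:
        input_counts = rng.binomial(raw_counts.astype(int), downsample_p)
    else:
        input_counts = raw_counts.copy()

    # Total-count indicators: S is the depth of the (possibly downsampled)
    # input, T the depth of the raw target profile.
    S = input_counts.sum()
    T = raw_counts.sum()

    # Randomly mask a fraction of genes; the training objective is to
    # recover the raw values at these masked positions.
    mask = rng.random(raw_counts.shape[0]) < mask_frac
    masked_input = input_counts.astype(float)
    masked_input[mask] = -1.0  # sentinel marking "masked" positions

    return masked_input, S, T, raw_counts, mask
```

In this framing, the read-depth-enhancement application follows directly: at inference time a low-depth profile is fed in with T set larger than S, asking the model to predict the expression profile it would expect at the higher depth.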
| File | Size | Format | Access |
|---|---|---|---|
| Pinaroli_Andrea.pdf | 6.22 MB | Adobe PDF | Open access |
https://hdl.handle.net/20.500.12608/91971