Interpreting Disordered Proteins using Sparse Auto-Encoders and Protein Language Models

BACAKSIZ, ONUR
2025/2026

Abstract

Protein language models have become powerful tools for representing protein sequences and supporting a wide range of computational biology tasks. Despite their strong performance, the internal representations learned by these models remain difficult to interpret, especially for intrinsically disordered proteins. This thesis investigates how protein language models encode information related to intrinsically disordered regions and explores whether these representations can be interpreted in a biologically meaningful way. To this end, sparse autoencoders are applied to intermediate representations extracted from a protein language model. By enforcing sparsity, the autoencoder decomposes high-dimensional model activations into a set of latent features that can be analyzed individually. The study focuses on residue-level representations and examines the relationship between sparse latent activations and biological annotations associated with protein disorder. Statistical analyses and linear probing methods are employed to evaluate whether specific latent features capture signals relevant to intrinsically disordered regions. This work follows an exploratory, interpretability-driven approach: different model layers and sparse representations are compared to better understand how disorder-related information emerges within protein language models. The results provide insights into the structure of learned representations and highlight the potential of sparse autoencoders as a tool for improving interpretability in computational protein analysis. Overall, this thesis contributes to the interpretation of representations related to intrinsically disordered proteins by using different machine learning models and latent features.
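The abstract does not specify which sparse autoencoder variant or probe the thesis uses, so the following is only a minimal sketch of the general pipeline it describes: a top-k-style sparse autoencoder decomposes residue-level activations into sparse latent features, and a linear probe is fit on those latents against (here, synthetic) disorder labels. All dimensions, weights, and labels below are hypothetical placeholders, not values from the thesis.

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, d_latent, k = 32, 128, 8   # hypothetical pLM hidden size, SAE width, sparsity level

# Randomly initialized encoder/decoder weights (an actual SAE would train these
# to minimize reconstruction error under the sparsity constraint).
W_enc = rng.standard_normal((d_model, d_latent)) / np.sqrt(d_model)
W_dec = rng.standard_normal((d_latent, d_model)) / np.sqrt(d_latent)

def encode_topk(x):
    """ReLU encoding followed by a top-k mask: only the k strongest
    latent features per residue stay active (the sparsity constraint)."""
    z = np.maximum(x @ W_enc, 0.0)
    thresh = np.sort(z, axis=-1)[:, -k][:, None]  # k-th largest value per row
    return np.where(z >= thresh, z, 0.0)

x = rng.standard_normal((10, d_model))      # stand-in for 10 residue-level activations
z = encode_topk(x)                          # sparse latent features, at most k active each
recon = z @ W_dec                           # reconstruction back into model space

# Linear probing sketch: fit a least-squares linear map from sparse latents
# to binary disorder annotations (synthetic labels here).
labels = rng.integers(0, 2, size=10).astype(float)
w, *_ = np.linalg.lstsq(z, labels, rcond=None)
preds = z @ w
```

Because each residue activates at most k of the 128 latents, individual latent dimensions can then be inspected one at a time, e.g. by correlating a single column of `z` with disorder annotations, which is the kind of per-feature analysis the abstract refers to.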
Sparse Autoencoders
Protein Language Models
Protein Disorder
Interpretability
Computational Biology


Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.12608/108224