Hidden Markov Models for detection of intrinsically disordered regions of proteins

SKRYLNIK, ALINA
2023/2024

Abstract

The aim of this research is to build a collection of Hidden Markov Models (HMMs) that can accurately identify intrinsically disordered protein (IDP) domains as defined in manually curated databases. IDP domains are characterized by the lack of a fixed three-dimensional structure, so their evolution is not constrained by the need to preserve a well-defined topology. This is reflected in lower sequence conservation than in globular domains. IDP domains are essential to cellular processes such as alternative splicing, multivalent and transient binding, cellular signalling and regulation. The project focuses on modelling the subset of IDP domains that retain some sequence conservation, such as those that bind a partner molecule. The work starts by building multiple sequence alignments (MSAs) of the IDP domains provided by the MobiDB database, followed by the creation of the corresponding HMMs. The accuracy of the resulting HMMs is validated on benchmark datasets and compared with existing HMM databases. The development of HMMs for disordered domains will improve the accuracy of protein sequence annotation and enable a more comprehensive analysis of these regions. This work will contribute to the advancement of techniques for predicting protein disorder and facilitate the study of disordered regions in biological systems. Finally, the HMMs will be integrated into widely used protein sequence analysis resources such as Pfam and InterPro.

The text of this website is © Università degli Studi di Padova. Full texts are published under a non-exclusive license. Metadata are released under a CC0 license.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.12608/62028