Linear Interacting Peptides (LIPs) detection in protein sequences based on embedding models
CARANGELO, RICCARDO
2023/2024
Abstract
This work explores the ability of embedding models to represent essential structural and biological features of proteins, focusing specifically on Linear Interacting Peptides (LIPs), a class of binding disordered regions recently introduced in the MobiDB database. The aim was to test a method for detecting and classifying LIPs with learning models, starting from protein sequences. The first part of the work centers on creating a comprehensive target dataset for training the model, built on the outputs of FLIPPER. These data were then further filtered by combining disorder and binding information. The second part is dedicated to training a CNN model for LIP discrimination.

File | Size | Format |
---|---|---|
Carangelo_Riccardo.pdf (open access) | 15.86 MB | Adobe PDF |
The text of this website © Università degli studi di Padova. Full texts are published under a non-exclusive license. Metadata are released under a CC0 license.
https://hdl.handle.net/20.500.12608/64785