Assessment of Missense Intolerant Regions (MIRs) in Intrinsically Disordered Proteins (IDPs)
RASOULI, SINA
2023/2024
Abstract
Genes that are crucial for the function of an organism are depleted of disruptive variants in natural populations, whereas non-essential genes tolerate their accumulation. Next-generation sequencing (NGS) of the general population has enabled comprehensive coverage of the human genome, identifying single nucleotide variants (SNVs) at a density of two SNVs per three base pairs. This has shown that regions intolerant to variation are important for gene function and usually map to structured domains. However, the presence and role of variation-intolerant regions outside globular domains, such as in intrinsically disordered regions (IDRs), remain to be investigated. The aim of this study is to determine the distribution of the Missense Intolerance Ratio, a measure of regional intolerance to missense variation, in intrinsically disordered proteins (IDPs) and to explore how intolerant regions relate to protein function. We analyzed the content of missense intolerant regions (MIRs) in a set of human proteins retrieved from DisProt, the major manually curated repository of IDPs. The matched MIRs were then correlated with the presence of IDP features retrieved from MobiDB, a resource that integrates predictions and functional annotations of protein disorder and mobility. Additionally, the matched MIRs were analyzed for the presence of disease variants reported in ClinVar, a database that collects variants associated with human diseases. Our results indicate that, while MIRs are enriched in protein domains, a substantial proportion is also found within IDRs. Although no significant correlation was found between MIRs and the other protein features examined, MIRs were frequently associated with disease-related variants. These findings highlight the functional importance of MIRs in both ordered and disordered protein regions.
However, limitations in dataset coverage and methodological assumptions necessitate further investigation to fully elucidate the role of MIRs in IDPs.

File | Size | Format |
---|---|---|
Rasouli_Sina_pdfA.pdf (open access) | 1.97 MB | Adobe PDF |
The text of this website © Università degli studi di Padova. Full texts are published under a non-exclusive license. Metadata are released under a CC0 license.
https://hdl.handle.net/20.500.12608/71033