Computational Prediction of Clinical Phenotypes and Causal Variants in Neurodevelopmental Disorders: An Analysis of Genetic Variants Data

Advances in genomic sequencing technologies and bioinformatics tools have improved the understanding of genetic variation in organisms and its potential effects on phenotypes (traits). This work presents an approach that employs machine learning models to automatically predict patient phenotypes and identify genetic variants, focusing on causative, likely causative, and contributing variants in neurodevelopmental disorders. It also examines the connection between patient phenotypes and variants, categorizing individuals by neurodevelopmental manifestations such as intellectual disability, autism, epilepsy, microcephaly, macrocephaly, hypotonia, and ataxia. The study used genetic variant data from 565 patients, building on and combining previous work in the field. In contrast to the manual variant filtering and classification commonly used in the field for similar purposes, the aim was to contribute to the development of an automated tool. This tool streamlines the variant classification process and enhances disease classification accuracy through a systematic, data-driven approach to variant interpretation. To validate the approach, the results were shared and compared with those of groups that took part in the 2018 and 2021 CAGI (Critical Assessment of Genome Interpretation) challenges, which addressed the same task. This comparison provides insight into how the tool performs relative to manual approaches.

CAMUZ, CAN ABDULLAH

2023/2024

Faculty/Department: Dipartimento di Matematica "Tullio Levi-Civita" - DM
Degree programme: DATA SCIENCE Laurea Magistrale (D.M. 270/2004)
Academic year: 2023
Keywords: Genetic variants; Phenotype prediction; Machine learning; CAGI; Gene panel sequence
Supervisor: LEONARDI, EMANUELA
Appears in collections: Lauree magistrali (master's theses)

Files in this item:

File	Size	Format
Thesis_Can_Abdullah_Camuz.pdf (open access)	4.38 MB	Adobe PDF

The text of this website is © Università degli studi di Padova. Full texts are published under a non-exclusive license. Metadata are released under a CC0 license.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.12608/62024