An evolutionary analysis of Autism risk genes

SOLTANTOUYEH, ANAHITA
2025/2026

Abstract

This thesis examines whether common genetic variation associated with autism spectrum condition (ASC) is structured with respect to two evolutionary genomic features: introgression deserts and human accelerated regions (HARs). A genome-wide segmentation framework classified the autosomal genome into five categories based on introgression signal and Neanderthal-specific SNP density, providing a map against which ASC variants were analysed. ASC variants are strongly enriched for derived alleles ($z = +10.19$), with risk alleles more frequently derived than protective alleles (OR $= 1.85$, $p = 4 \times 10^{-4}$). At the gene level, ASC loci show consistent enrichment in strict introgression deserts across all window definitions, alongside significant proximity to HARs. HAR hotspot analysis identifies regulatory clusters near neurodevelopmental transcription factors. A feature-based classification (AUC $= 0.949$) shows that hotspot genes are a distinct subset defined by HAR proximity, derived allele composition, and elevated cerebellar expression across development. These findings indicate that ASC-associated variation is non-randomly distributed across evolutionary genomic features and is concentrated in regulatory regions of constrained neurodevelopmental genes.
2025
Autism
Evolution
Neanderthal desert
Files in this item:
Data_Science_MsC_Thesis____UniPD-4.pdf

Restricted access

Size 1.57 MB
Format Adobe PDF

The text of this website © Università degli studi di Padova. Full texts are published under a non-exclusive license. Metadata are under a CC0 license.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.12608/108241