An automated framework for converting Reactome signaling pathways into bioinformatics analysis-ready gene–gene networks

SIGNOR, VALENTINA
2025/2026

Abstract

Topological analysis of biological pathways is central to the study of genetic and molecular mechanisms. Representing pathways as graphs enables automated analyses that range from general network-science approaches to bioinformatics-specific methods. Converting signaling pathways into networks is nevertheless non-trivial, owing both to the complexity of the biological reactions, which are represented as network edges, and to the strong heterogeneity of the entities involved, which correspond to network nodes. Chemical molecules, genes, gene families, protein complexes, and interactions of different types coexist in biological pathways, so representing them correctly is crucial for obtaining reliable analyses that are not affected by redundancy or information loss. Several computational tools have been developed in recent years to perform this conversion, but they have potential limitations, including information loss during conversion and the inclusion of entities that may not be directly usable in bioinformatics analyses based on sequencing data. This work pays particular attention to the challenges of using pathways from Reactome, one of the most comprehensive and up-to-date databases currently available. The objective of this thesis is to develop an automated framework that extracts and models the information contained in Reactome pathways, starting from the standard BioPAX Level 3 format, to obtain gene–gene networks. The proposed approach builds gene-based graph representations of pathways, centered on entities measurable by sequencing experiments (i.e., genes), while preserving pathway topology, biologically meaningful relationships, and relevant biological annotations of both nodes and edges.
bioinformatics
network modelling
gene network
signaling pathway
Files in this item:
Signor_Valentina.pdf (open access, 22.05 MB, Adobe PDF)

The text of this website © Università degli studi di Padova. Full texts are published under a non-exclusive license. Metadata are released under a CC0 license.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.12608/108239