Interpretation of Chromosome Configuration Capture data in terms of graphs

Abbate, Matteo
2016/2017

Abstract

Chromosomes are organized structures containing all the genetic information of a cell. Despite their importance, higher-order structural features of chromosomes remain difficult to determine because of technical limitations, and they cannot be measured directly. This thesis addresses the question of how to interpret geometrically the data from Chromosome Configuration Capture (3C) and its high-resolution variant (Hi-C). These data yield essentially only topological information, in the sense that they provide contact frequencies between different parts, or monomers, of the chromosome strand. The goal of this thesis is to develop algorithms that produce a three-dimensional reconstruction of the chromosome configuration from these data. We propose to interpret the Hi-C data matrix as the adjacency matrix of a complete undirected weighted graph. We then define several possible distance measures between the vertices of the graph, construct the corresponding distance matrix, and apply multidimensional scaling (MDS) to compute the coordinates of each monomer. We investigate the effects of these different distances on the MDS and on the possible dimensions of the embedding space. By simulating a linear, a circular and a rosette polymer, we demonstrate that the so-called resistance distance yields the most consistent three-dimensional reconstruction. Finally, we incorporate further knowledge into the embedding: fluorescence measurements obtained by light microscopy provide data on specific loci and the distances between them. These data are incorporated by means of the Laplacian matrix of the graph and the definition of a partition function. In this way we can modify the input data matrix and re-apply the MDS. The resulting reconstructions change the structure of the polymers smoothly and are partially consistent with the expected results.
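
The pipeline described in the abstract (contact matrix as weighted graph, resistance distance, MDS) can be illustrated with a minimal sketch. It is not the thesis implementation. It assumes a connected graph, classical (Torgerson) MDS applied directly to the resistance distances, and a hypothetical toy contact matrix for a linear chain with nearest-neighbour contacts only. The names `resistance_distance`, `classical_mds`, `contacts` and `coords` are illustrative.

```python
import numpy as np


def resistance_distance(weights):
    """Resistance distance between all vertex pairs of a connected weighted graph."""
    n = weights.shape[0]
    laplacian = np.diag(weights.sum(axis=1)) - weights
    # For a connected graph, pinv(L) = inv(L + J/n) - J/n, with J the all-ones matrix.
    shift = np.full((n, n), 1.0 / n)
    pinv = np.linalg.inv(laplacian + shift) - shift
    diag = np.diag(pinv)
    return diag[:, None] + diag[None, :] - 2.0 * pinv


def classical_mds(dist, dim=3):
    """Classical MDS: coordinates from the top eigenpairs of the centred Gram matrix."""
    n = dist.shape[0]
    centering = np.eye(n) - np.full((n, n), 1.0 / n)
    gram = -0.5 * centering @ (dist ** 2) @ centering
    eigvals, eigvecs = np.linalg.eigh(gram)
    order = np.argsort(eigvals)[::-1][:dim]
    # Negative eigenvalues (non-Euclidean input or round-off) are clipped to zero.
    scale = np.sqrt(np.clip(eigvals[order], 0.0, None))
    return eigvecs[:, order] * scale


# Hypothetical toy input: a linear chain of 6 monomers, nearest-neighbour contacts only.
n = 6
contacts = np.zeros((n, n))
idx = np.arange(n - 1)
contacts[idx, idx + 1] = 1.0
contacts[idx + 1, idx] = 1.0

coords = classical_mds(resistance_distance(contacts), dim=3)  # array of shape (6, 3)
```

On this toy chain the resistance distance between monomers i and j equals |i - j|, so the embedding is a straight line. Real Hi-C matrices are dense and noisy, and whether to embed the resistance distance or its square root is a modelling choice that this sketch does not settle.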
2016-09
92
Hi-C - embedding - graph - structure - distances - multidimensional - scaling - reconstruction - laplacian
Files in this item:
File: Tesi_LM_Abbate.pdf (open access)
Size: 3.32 MB
Format: Adobe PDF

The text of this website © Università degli studi di Padova. Full texts are published under a non-exclusive license. Metadata are released under a CC0 license.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.12608/28436