Interpretation of Chromosome Configuration Capture data in terms of graphs
Abbate, Matteo
2016/2017
Abstract
Chromosomes are organized structures that contain all the genetic information of a cell. Despite their importance, higher-order structural features of chromosomes remain difficult to discover because of technical limitations, and they cannot be measured directly. This thesis addresses the question of how to interpret Chromosome Configuration Capture (3C) data and its high-resolution variant (HiC) geometrically. Essentially, these data yield only topological information, in the sense that they provide contact frequencies between different parts, or monomers, of the chromosome strand. The goal of this thesis is to develop algorithms that obtain a three-dimensional reconstruction of the chromosome configuration from these data. We propose to interpret the HiC data matrix as the adjacency matrix of a complete, undirected, weighted graph. We then define several possible distance measures between the vertices of the graph, construct the corresponding distance matrix, and apply multidimensional scaling (MDS) to compute the coordinates of each monomer. We investigate the effects of these different distances on the MDS and on the possible dimensions of the embedding space. By simulating a linear, a circular and a rosetted polymer, we demonstrate that the so-called resistance distance yields the most consistent three-dimensional reconstruction. Finally, we incorporate further knowledge into the embedding: fluorescence measurements obtained by light microscopy provide data on some specific loci and the distances between them. These data are incorporated by means of the Laplacian matrix of the graph and the definition of a partition function. In this way we can modify the input data matrix and reapply the MDS. The resulting reconstructions change the structure of the polymers smoothly and are partially consistent with the expected results.
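The pipeline described in the abstract (contact matrix as weighted adjacency matrix, resistance distance, then MDS) can be illustrated with a minimal sketch. It is not the thesis implementation. The toy contact model `ring_contacts`, the function names, and the choice of classical (Torgerson) MDS applied directly to the resistance distance are all assumptions made for illustration; the thesis may use a different MDS variant or preprocessing. The sketch also assumes the graph is connected, which holds for a complete graph with positive weights.

```python
import numpy as np


def ring_contacts(n, decay=1.0):
    """Hypothetical toy contact matrix for a circular polymer.

    The weight between two monomers decays as a power law of their
    separation along the ring. This is an illustrative assumption,
    not data or a model taken from the thesis.
    """
    idx = np.arange(n)
    sep = np.abs(idx[:, None] - idx[None, :])
    sep = np.minimum(sep, n - sep).astype(float)
    weights = np.zeros((n, n))
    off_diag = sep > 0
    weights[off_diag] = sep[off_diag] ** (-decay)
    return weights


def resistance_distance(weights):
    """Resistance distance R_ij = L+_ii + L+_jj - 2 L+_ij.

    L is the graph Laplacian and L+ its Moore-Penrose pseudoinverse,
    computed for a connected graph as (L + J/n)^(-1) - J/n, where J is
    the all-ones matrix.
    """
    n = weights.shape[0]
    laplacian = np.diag(weights.sum(axis=1)) - weights
    lap_pinv = np.linalg.inv(laplacian + 1.0 / n) - 1.0 / n
    diag = np.diag(lap_pinv)
    return diag[:, None] + diag[None, :] - 2.0 * lap_pinv


def classical_mds(dist, dim=3):
    """Classical MDS: double-center the squared distances and keep the
    eigenvectors of the `dim` largest eigenvalues as coordinates."""
    n = dist.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (dist ** 2) @ centering
    vals, vecs = np.linalg.eigh(gram)
    top = np.argsort(vals)[::-1][:dim]
    # Negative eigenvalues signal a non-Euclidean input; clip them to zero.
    return vecs[:, top] * np.sqrt(np.clip(vals[top], 0.0, None))


contacts = ring_contacts(12)
distances = resistance_distance(contacts)
coords = classical_mds(distances, dim=3)  # one 3D point per monomer
```

Varying `dim` and inspecting the eigenvalues of the centered matrix is one simple way to probe which embedding dimensions a given distance measure supports.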

File: Tesi_LM_Abbate.pdf (open access, 3.32 MB, Adobe PDF)
The text of this website is © Università degli studi di Padova. Full texts are published under a non-exclusive license. Metadata are released under a CC0 license.
https://hdl.handle.net/20.500.12608/28436