Methods for compressing k-mers set with counters
ROSSIGNOLO, ENRICO
2022/2023
Abstract
This work seeks an efficient way to compress sets of k-mers with counters: such sets take up a lot of disk space, yet they offer several advantages over genomes or sets of genomes. Several strategies are proposed for exploring compacted de Bruijn graphs (cdBGs) so as to produce smaller files than UST, and the encoding of the counts has been revised. A new application is presented that implements these strategies and fixes a bug in UST that caused the counts to be ordered incorrectly. The results show that compression can be improved with respect to UST, depending on the density of the graph. Finally, a small value of k leads to denser graphs and therefore to better results.

File | Access | Size | Format
---|---|---|---
Rossignolo_Enrico.pdf | open access | 4.08 MB | Adobe PDF
The text of this website is © Università degli Studi di Padova. Full texts are published under a non-exclusive license. Metadata are released under a CC0 license.
https://hdl.handle.net/20.500.12608/45148