DMcompress: Dynamic Markov Models for Bacterial Genome Compression

Rongjie Wang, Mingxiang Teng, Yang Bai, Tianyi Zang, Yadong Wang
DOI: https://doi.org/10.1109/bibm.2016.7822621
2016-01-01
Abstract: Genome data have grown exponentially over the last decade, and compressing genomes with Markov models has been proposed as an effective statistical method. However, existing methods apply a static order-k Markov model to every genome, which can result in a sub-optimal order for some genomes. In this paper, we propose a compression method that relies on a pre-analysis of the data before compression, with the aim of estimating the Markov model order k, yielding improvements over static Markov models. Experimental results on the latest complete bacterial genome data show that our method can compress genomes effectively, with better performance than the state-of-the-art method. The code of DMcompress is available at https://rongjiewang.github.io/DMcompress.
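The abstract does not spell out how the pre-analysis estimates k, so the following Python sketch is only one plausible illustration, not the DMcompress algorithm. It assumes the sequence contains only A, C, G and T, and it scores each candidate order by the number of bits an adaptive order-k model with additive smoothing would spend on the sequence, then keeps the cheapest order. The names `adaptive_code_length` and `choose_order`, the smoothing constant `alpha`, and the 2-bit cost for the first k symbols are all assumptions made for this example.

```python
import math
from collections import defaultdict

ALPHABET = "ACGT"  # assumption: input contains only these four symbols


def adaptive_code_length(seq, k, alpha=1.0):
    """Estimate the bits needed to code seq with an adaptive order-k model.

    Each symbol is predicted from the k preceding symbols using counts
    seen so far, with additive (Laplace-style) smoothing alpha.
    """
    counts = defaultdict(lambda: dict.fromkeys(ALPHABET, 0))
    totals = defaultdict(int)
    # Assumed header cost: the first k symbols are stored raw at 2 bits each.
    bits = 2.0 * min(k, len(seq))
    for i in range(k, len(seq)):
        ctx, sym = seq[i - k:i], seq[i]
        p = (counts[ctx][sym] + alpha) / (totals[ctx] + alpha * len(ALPHABET))
        bits -= math.log2(p)
        # Update the model only after coding, as a decoder would.
        counts[ctx][sym] += 1
        totals[ctx] += 1
    return bits


def choose_order(seq, max_k=12):
    """Return the order in 0..max_k with the smallest estimated code length."""
    costs = {k: adaptive_code_length(seq, k) for k in range(max_k + 1)}
    best = min(costs, key=costs.get)
    return best, costs


if __name__ == "__main__":
    toy = "ACGT" * 200  # hypothetical toy input, not real genome data
    best, costs = choose_order(toy, max_k=4)
    print("selected order:", best)
```

On a strictly periodic toy input such as this one, a single preceding symbol already determines the next, so higher orders only add header cost. Real bacterial genomes would typically favor larger k, and a practical tool would likely estimate the order from a sample rather than from the full sequence.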