LCTD: A Lossless Compression Tool of FASTQ File Based on Transformation of Original File Distribution

Jiabing Fu, Yacong Ma, Bixin Ke, Shoubin Dong
DOI: https://doi.org/10.1109/bibm.2016.7822639
2016-01-01
Abstract: In this paper, we propose LCTD, a reference-free, lossless compression tool for FASTQ, the format commonly used to store next-generation sequencing (NGS) data. Instead of designing elaborate data structures and compression techniques around the original FASTQ file, we transform the distribution of the original file so that existing compression tools can compress it more effectively. Experimental results indicate that our method outperforms all six state-of-the-art compression tools it was compared against, improving the average compression ratio by 10% to 43%. In addition, LCTD outperforms Fastqz, the winner of the SequenceSqueeze compression competition, in both compression ratio and speed. The source code is available from the authors on request by email.
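The general idea of transforming a FASTQ file before handing it to an existing compressor can be illustrated with a minimal Python sketch. This is not the LCTD transformation, whose details the abstract does not give. As a stand-in it assumes a simple regrouping of the file into three per-field streams (identifiers, bases, quality scores), each compressed separately with zlib. All function names are hypothetical, and the sketch assumes four-line records whose separator line is a bare `+`.

```python
import zlib


def split_streams(fastq_text):
    """Regroup FASTQ records into [identifiers, bases, qualities] streams.

    Assumption: four lines per record and a bare "+" separator line.
    """
    lines = fastq_text.splitlines()
    if len(lines) % 4 != 0:
        raise ValueError("expected four lines per FASTQ record")
    ids, seqs, quals = [], [], []
    for i in range(0, len(lines), 4):
        header, seq, plus, qual = lines[i:i + 4]
        if not header.startswith("@") or plus != "+":
            raise ValueError(f"malformed record starting at line {i + 1}")
        ids.append(header[1:])
        seqs.append(seq)
        quals.append(qual)
    return ["\n".join(ids), "\n".join(seqs), "\n".join(quals)]


def merge_streams(streams):
    """Inverse of split_streams: rebuild the original FASTQ text."""
    if not any(streams):
        return ""  # no records were stored
    ids, seqs, quals = (s.split("\n") for s in streams)
    return "".join(f"@{h}\n{s}\n+\n{q}\n" for h, s, q in zip(ids, seqs, quals))


def compress_fastq(fastq_text):
    """Transform the file, then apply an existing compressor (zlib) per stream."""
    return [zlib.compress(s.encode("ascii"), 9) for s in split_streams(fastq_text)]


def decompress_fastq(blobs):
    """Decompress each stream and undo the transformation."""
    return merge_streams([zlib.decompress(b).decode("ascii") for b in blobs])
```

Because each stream holds only one kind of symbol, a general-purpose compressor sees a more uniform distribution than in the interleaved file. Whether this improves the ratio depends on the data, and the sketch makes no claim about matching the results reported above. It only demonstrates that such a transformation can be inverted exactly, which is what keeps the scheme lossless.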