Accurate Decoding of Pooled Sequenced Data Using Compressed Sensing

Denisa Duma, Mary Wootters, Anna C. Gilbert, Hung Q. Ngo, Atri Rudra, Matthew Alpert, Timothy J. Close, Gianfranco Ciardo, Stefano Lonardi
DOI: https://doi.org/10.48550/arXiv.1307.7810
2013-07-30
Abstract: In order to overcome the limitations imposed by DNA barcoding when multiplexing a large number of samples in the current generation of high-throughput sequencing instruments, we have recently proposed a new protocol that leverages advances in combinatorial pooling design (group testing) (doi: https://doi.org/10.1371/journal.pcbi.1003010). We have also demonstrated how this new protocol would enable de novo selective sequencing and assembly of large, highly repetitive genomes. Here we address the problem of decoding pooled sequenced data obtained from such a protocol. Our algorithm employs a synergistic combination of ideas from compressed sensing and the decoding of error-correcting codes. Experimental results on synthetic data for the rice genome and real data for the barley genome show that our novel decoding algorithm enables significantly higher quality assemblies than the previous approach.
Quantitative Methods; Computational Engineering, Finance, and Science; Information Theory; Genomics