Ultra-low input single tube linked-read library method enables short-read NGS systems to generate highly accurate and economical long-range sequencing information for de novo genome assembly and haplotype phasing

Zhoutao Chen, Long Pham, Tsai-Chin Wu, Guoya Mo, Yu Xia, Peter Chang, Devin Porter, Tan Phan, Huu Che, Hao Tran, Vikas Bansal, Justin Shaffer, Pedro Belda-Ferre, Greg Humphrey, Rob Knight, Pavel Pevzner, Son Pham, Yong Wang, Ming Lei
DOI: https://doi.org/10.1101/852947
2019-11-29
Abstract: Long-range sequencing information is required for haplotype phasing, de novo assembly and structural variation detection. Current long-read sequencing technologies can provide valuable long-range information, but at a high cost, with low accuracy and a high DNA input requirement. We have developed a single-tube Transposase Enzyme Linked Long-read Sequencing (TELL-Seq™) technology, which enables a low-cost, high-accuracy and high-throughput short-read next-generation sequencer to routinely generate over 100 kb of long-range sequencing information with as little as 0.1 ng of input material. In a PCR tube, millions of clonally barcoded beads are used to uniquely barcode long DNA molecules in an open bulk reaction without dilution and compartmentation. The barcode-linked reads are used to successfully assemble genomes ranging from microbes to human. These linked reads also generate megabase-long phased blocks and provide a cost-effective tool for detecting structural variants in a genome, which are important for identifying compound heterozygosity in recessive Mendelian diseases and for discovering genetic drivers and diagnostic biomarkers in cancers.
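To illustrate how barcode-linked short reads can carry long-range information, the following is a minimal Python sketch, not the TELL-Seq analysis pipeline. It assumes reads have already been aligned and reduced to hypothetical `(barcode, chrom, pos)` tuples, and it uses a generic gap-based heuristic: reads sharing a barcode on the same chromosome are treated as coming from one original DNA molecule unless consecutive positions are separated by more than `max_gap`. The function name, the tuple layout and the 50 kb default are illustrative assumptions.

```python
from collections import defaultdict


def infer_molecules(reads, max_gap=50_000):
    """Group aligned reads by barcode and split them into inferred molecules.

    reads: iterable of (barcode, chrom, pos) tuples (hypothetical input format).
    Returns a list of (barcode, chrom, start, end, n_reads) tuples.
    """
    # Collect read positions per (barcode, chromosome) pair.
    by_key = defaultdict(list)
    for barcode, chrom, pos in reads:
        by_key[(barcode, chrom)].append(pos)

    molecules = []
    for (barcode, chrom), positions in sorted(by_key.items()):
        positions.sort()
        start = prev = positions[0]
        count = 1
        for pos in positions[1:]:
            if pos - prev > max_gap:
                # Gap too large: close the current molecule and start a new one.
                molecules.append((barcode, chrom, start, prev, count))
                start = pos
                count = 0
            prev = pos
            count += 1
        molecules.append((barcode, chrom, start, prev, count))
    return molecules


example_reads = [
    ("BC1", "chr1", 1_000),
    ("BC1", "chr1", 30_000),
    ("BC1", "chr1", 120_000),
    ("BC2", "chr1", 5_000),
    ("BC1", "chr1", 60_000),
]
example_molecules = infer_molecules(example_reads)
```

In this toy input, the three `BC1` reads between positions 1,000 and 60,000 are grouped into one inferred molecule, while the `BC1` read at 120,000 falls beyond the gap threshold and is reported separately. Real linked-read tools use more careful models, since unrelated molecules can share a barcode.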