Phasing Diploid Genome Assembly Graphs with Single-Cell Strand Sequencing

Mir Henglin, Maryam Ghareghani, William Harvey, David Porubsky, Sergey Koren, Evan E Eichler, Peter Ebert, Tobias Marschall
DOI: https://doi.org/10.1101/2024.02.15.580432
2024-06-20
Abstract: Haplotype information is crucial for biomedical and population genetics research. However, current strategies to produce de-novo haplotype-resolved assemblies often require either difficult-to-acquire parental data or an intermediate haplotype-collapsed assembly. Here, we present Graphasing, a workflow which synthesizes the global phase signal of Strand-seq with assembly graph topology to produce chromosome-scale de-novo haplotypes for diploid genomes. Graphasing readily integrates with any assembly workflow that both outputs an assembly graph and has a haplotype assembly mode. Graphasing performs comparably to trio-phasing in contiguity, phasing accuracy, and assembly quality, outperforms Hi-C in phasing accuracy, and generates human assemblies with over 18 chromosome-spanning haplotypes.
Bioinformatics