High-fidelity (repeat) consensus sequences from short reads using combined read clustering and assembly

Ludwig Mann, Kristin Balasch, Nicola Schmidt, Tony Heitkam
DOI: https://doi.org/10.1186/s12864-023-09948-4
IF: 4.547
2024-01-26
BMC Genomics
Abstract: Despite the many cheap and fast ways to generate genomic data, accurate genome assembly remains a problem, with repeats in particular being vastly underrepresented and often misassembled. Because low-coverage short reads are already sufficient to represent the repeat landscape of any given genome, many read clustering algorithms have been put forward that provide repeat identification and classification. But how can trustworthy, reliable and representative repeat consensus sequences be derived from unassembled genomes?
genetics & heredity, biotechnology & applied microbiology