Long-range somatic structural variation calling from matched tumor-normal co-assembly graphs

Megan K. Le, Qian Qin, Heng Li
DOI: https://doi.org/10.1101/2024.07.29.605160
2024-07-30
Abstract: The accurate identification of somatic structural variants (SVs) is a problem with significant applications to clinical cancer research. Though technologies such as long-read sequencing have facilitated the development of more accurate SV calling methods, existing somatic SV callers still struggle with achieving high precision. In this work, we present colorSV, a long-read-based method for calling long-range SVs by examining the local topology of joint assembly graphs from matched tumor-normal samples. colorSV is the first somatic SV calling method that uses a co-assembly approach, as well as the first SV caller that identifies variants by examining characteristics of the assembly graph itself. We demonstrate near-perfect precision and sensitivity for calling translocations on the COLO829 cell line, outperforming four existing somatic SV callers (Severus, Sniffles2, nanomonsv, and SAVANA) in both metrics. We also evaluated colorSV for calling translocations on the HCC1395 cell line, finding that our method achieved a good balance between sensitivity and precision (where the sensitivity was only outperformed by Severus, and the precision was only outperformed by nanomonsv). Our work establishes a novel joint assembly-based strategy for characterizing long-range somatic variation, which could be further expanded or modified for the identification of SVs of different types and sizes.
Biology