HiSV: A control-free method for structural variation detection from Hi-C data

Junping Li, Lin Gao, Yusen Ye
DOI: https://doi.org/10.1371/journal.pcbi.1010760
2023-01-07
PLoS Computational Biology
Abstract: Structural variations (SVs) play an essential role in the evolution of human genomes and are associated with cancer genetics and rare disease. High-throughput chromosome conformation capture (Hi-C) technology probes crosslinked chromatin genome-wide to study the spatial architecture of chromosomes. Hi-C read pairs can span megabases, making the technology useful for detecting large-scale SVs. So far, the identification of SVs from Hi-C data is still in its early stages, with only a few methods available. In particular, no algorithm has been developed that can detect SVs without control samples. Therefore, we developed HiSV (Hi-C for Structural Variation), a control-free method for identifying large-scale SVs from a Hi-C sample. Inspired by the single-image saliency detection model, HiSV constructs a saliency map of interaction frequencies and extracts salient segments as large-scale SVs. In evaluations on both simulated and real data, HiSV not only detected all variant types but also achieved higher accuracy and sensitivity than existing methods. Moreover, our results on cancer cell lines showed that HiSV effectively detected eight complex SV events and identified two novel SVs of key factors associated with cancer development. Finally, we found that integrating the results of HiSV helped the WGS method to identify a total of 94 novel SVs in two cancer cell lines.
Author summary: Cancer and rare diseases are often driven by structural variations (SVs). Despite their importance, detecting SV events remains challenging. High-throughput chromosome conformation capture (Hi-C) technology has proven valuable for large-scale SV detection. However, algorithms that can use Hi-C data without control samples for SV detection have been severely lacking. Therefore, we present HiSV (Hi-C for Structural Variation), a control-free method for identifying large-scale SVs from a Hi-C sample. We evaluated HiSV's performance on simulated datasets and cancer cell lines, where it achieved superior accuracy and sensitivity. Moreover, HiSV effectively captured complex SVs in cancer cell lines. Finally, we demonstrated that HiSV can be applied to supplement the results of WGS methods.
biochemical research methods, mathematical & computational biology