Identification of medium-sized genomic deletions with low coverage, mate-paired restricted tags

Qiang Gong, Yong Tao, Jian-Rong Yang, Jun Cai, Yunfei Yuan, Jue Ruan, Jin Yang, Hailiang Liu, Wanghua Li, Xuemei Lu, Shi-Mei Zhuang, San Ming Wang, Chung-I Wu
DOI: https://doi.org/10.1186/1471-2164-14-51
IF: 4.547
2013-01-01
BMC Genomics
Abstract: Background: Genomic deletions are known to be widespread in many species. Various sequencing-based approaches for identifying deletions have been developed, but their power to detect deletions that affect medium-sized regions is limited when the sequencing coverage is low. Results: We present a cost-effective method for identifying medium-sized deletions in genomic regions with low sequencing coverage. Two mate-paired libraries were constructed separately from human cancerous tissue to generate paired short reads (ditags) from restriction fragments digested with a 4-base restriction enzyme. A total of 3 Gb of paired reads (1.0× genome size) was collected, and 175 deletions were inferred by identifying ditags with discordant alignments to the reference genome sequence. Sanger sequencing confirmed an overall detection accuracy of 95%. Good reproducibility was verified by the deletions that were detected in both libraries. Conclusions: We provide an approach to accurately identify medium-sized deletions in large genomes with low sequence coverage. It can be applied in studies of comparative genomics and in the identification of germline and somatic variants.
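The idea of inferring deletions from ditags that align abnormally to the reference can be illustrated with a minimal sketch. This is not the authors' pipeline: it is the generic paired-end span test, and the record layout (`Ditag`), the thresholds (`MAX_FRAGMENT_BP`, `MAX_DELETION_BP`), and the `min_support` rule are all hypothetical choices made for illustration. It assumes each tag of a pair has already been aligned to a chromosome and position, and that tags sit at fixed restriction sites, so pairs supporting the same event share identical coordinates.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Ditag:
    """One aligned tag pair (hypothetical record layout, not from the paper)."""
    chrom_a: str
    pos_a: int
    chrom_b: str
    pos_b: int


# Hypothetical thresholds chosen only for this illustration.
MAX_FRAGMENT_BP = 2_000     # longest span still treated as a normal fragment
MAX_DELETION_BP = 100_000   # upper bound for a "medium-sized" candidate


def candidate_deletion(tag):
    """Return (chrom, start, end) if the pair spans too far on the reference."""
    if tag.chrom_a != tag.chrom_b:
        return None  # inter-chromosomal pairs are not deletion evidence here
    start, end = sorted((tag.pos_a, tag.pos_b))
    span = end - start
    if MAX_FRAGMENT_BP < span <= MAX_DELETION_BP:
        return (tag.chrom_a, start, end)
    return None


def call_deletions(ditags, min_support=2):
    """Keep candidate intervals supported by at least `min_support` ditags."""
    support = Counter()
    for tag in ditags:
        interval = candidate_deletion(tag)
        if interval is not None:
            support[interval] += 1
    return sorted(iv for iv, n in support.items() if n >= min_support)
```

Requiring more than one supporting pair is a common guard against chimeric ligation products; at roughly 1× coverage a real analysis would have to balance that guard against lost sensitivity.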