A Novel Approach to Detect Large Indels from Targeted Sequencing Data in Clinical Cancer Setting

Lijuan Chen, Caiping Chen, Dongling Chen, Qiuyi Zou, Gungwei Qin, Hui Kang, Ming Yao, Kai Wang
DOI: https://doi.org/10.1200/jco.2017.35.15_suppl.e13002
IF: 45.3
2017-01-01
Journal of Clinical Oncology
Abstract: e13002
Background: Next-generation sequencing (NGS) technologies have advanced cancer genomic profiling and improved cancer treatment. Many NGS data analysis tools are available for identifying different genomic alterations, including short insertions and deletions (short indels, generally < 25 bp). However, detecting large indels (L-indels, > 100 bp) from short reads (generally < 200 bp) remains a major challenge. L-indels identified in genes such as MET and FLT3 have proven to have critical implications for cancer treatment. Moreover, there is an urgent need for an algorithm to validate L-indels generated by genome modification systems such as ZFNs, TALENs, and CRISPR/Cas9.
Methods: A novel algorithm was developed for calling L-indels in targeted sequencing data, as follows: raw reads were first filtered by selecting high-quality reads and correcting erroneous bases; contigs (unitigs) were then assembled and aligned to the reference genome; finally, breakpoint information was collected and L-indels were calculated. The algorithm was applied to Illumina reads generated from the NA12878 cell line and from FFPE samples collected in our lab. Validation was performed by PCR and Sanger sequencing.
Results: Twenty-two novel exonic L-indels (17 deletions and 5 insertions), with a median size of 1,616 bp (range: 25-6,684 bp), were identified from the NA12878 sequencing data, and all (100%) were confirmed by Sanger sequencing. In addition, 6 of 9 previously reported L-indels were also found; the remaining 3 await further exploration. Strikingly, a 2,446 bp deletion in MSH6, which encodes an important component of the mismatch repair (MMR) system, was detected in an FFPE sample of a lung adenocarcinoma, prompting consideration of MMR deficiency that might otherwise have been missed.
Conclusions: We have developed and validated a novel, accurate method for detecting large indels in NGS targeted sequencing data in the clinical cancer setting. Adopting this method will greatly increase the ability to comprehensively characterize genomic alterations from a single NGS-based assay and will provide more information for potential clinical use.
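The abstract does not give implementation details, so the following is only a minimal sketch of the final step of the Methods (deriving an L-indel from breakpoint information), not the authors' algorithm. It assumes that an assembled contig has already been aligned to the reference as two consecutive same-strand segments. The names `Segment` and `call_indel` and the coordinates in the usage lines are hypothetical; the `min_size=100` default simply mirrors the > 100 bp L-indel definition used in the Background.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Segment:
    """One aligned piece of a contig (hypothetical structure).

    All intervals are half-open: [start, end).
    """
    contig_start: int
    contig_end: int
    ref_start: int
    ref_end: int


def call_indel(left: Segment, right: Segment,
               min_size: int = 100) -> Optional[Tuple[str, int, int, int]]:
    """Infer one indel from two consecutive segments of the same contig.

    Returns (kind, ref_breakpoint_left, ref_breakpoint_right, size),
    or None when no indel of at least min_size is supported.
    """
    # Contig bases lying between the two aligned segments.
    contig_gap = right.contig_start - left.contig_end
    # Reference bases skipped between the two aligned segments.
    ref_gap = right.ref_start - left.ref_end

    # Overlapping segments (e.g. microhomology at the breakpoint) are
    # outside the scope of this sketch.
    if contig_gap < 0 or ref_gap < 0:
        return None

    # Positive: the reference has extra sequence (deletion in the sample).
    # Negative: the contig has extra sequence (insertion in the sample).
    size = ref_gap - contig_gap
    if abs(size) < min_size:
        return None

    kind = "DEL" if size > 0 else "INS"
    return (kind, left.ref_end, right.ref_start, abs(size))


# Toy coordinates, chosen only for illustration.
deletion = call_indel(Segment(0, 150, 1000, 1150),
                      Segment(150, 300, 3596, 3746))
insertion = call_indel(Segment(0, 100, 500, 600),
                       Segment(300, 400, 600, 700))
```

A real pipeline would also have to handle reverse-strand segments, overlapping breakpoints, and contigs that split into more than two segments; those cases are deliberately left out here.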