Abstract: The characterization of the landscape of genetic lesions that underlie cancer has been significantly advanced by the recent application of next-generation sequencing (NGS) technology. This methodology can be used to sequence selected subsets of genes, the whole exome, the whole genome, or the expressed transcriptome within a cancer cell. By comparing the sequences acquired from a cancer sample and a matched normal tissue sample from the same patient, one should be able to identify almost all somatic lesions within the cancer. As part of the St Jude Children's Research Hospital - Washington University Pediatric Cancer Genome Project (PCGP), we have undertaken whole genome sequencing (WGS) of 600 pediatric cancers and matched control tissue (1200 total genomes). Although the acquisition of the primary sequence is a formidable challenge, the analysis of these data is where the real work begins. Unfortunately, the majority of published NGS analysis methods were developed to identify germ line variation and therefore perform sub-optimally when applied to the task of identifying somatic mutations in cancer genomes. This is partly because distinct logic is required to accurately identify all somatic lesions within a cancer. A cancer genome typically exists within a heterogeneous DNA sample that is composed of normal cells admixed with an oligoclonal tumor sample. Moreover, the range of somatic lesions seen in cancer is broader than what is seen as part of germ line genetic variation, with some cancers having exceedingly complex genomes containing focal insertions, deletions, inversions, intra-chromosomal and inter-chromosomal rearrangements and large copy number abnormalities. The accurate identification of these lesions requires not only the presence of the lesions within the cancer DNA, but also their absence from the matched germ line sample.
To approach these problems, we, as well as others, have recently developed new analytical approaches to enhance our ability to identify the somatic mutations in cancer. The starting point for these analyses is ≥75 bp paired-end sequencing reads from patient-matched tumor and normal DNA samples. Our goal is to identify all somatic single-nucleotide variations (SNVs), small insertions/deletions (indels), copy number alterations (CNAs) and structural variations (SVs) that occur within the cancer DNA sample. Paired tumor-normal NGS data were analyzed together to ensure sensitivity for detecting DNA alterations in the tumor and for confirming their absence in the matched normal sample. Somatic lesions initially identified by mapped NGS reads were further analyzed using more accurate algorithms to correct errors caused by suboptimal NGS mapping. The sensitivity of the methods we have developed depends on the read depth, but with WGS at 30X haploid coverage we are able to detect mono-allelic mutations present in as little as ∼25% of the analyzed cellular population. This sensitivity can be significantly enhanced with greater read depth. Key among the methods we have developed are two new algorithms focused on identifying gross DNA alterations: CREST (Clipping REveals STructure) for SV analysis and CONSERTING (COpy Number SEgmentation by Regression Tree) for CNA analysis. CREST uses sequencing reads with partial alignments to the reference human genome (so-called soft-clipped reads) to directly map the breakpoints of somatic SVs. CONSERTING integrates read depth analysis with SV detection and adjusts for sequencing artifacts, coverage bias and germ line CNVs. Together, these methods identify somatic lesions with a high validation rate (92-98% for SNVs and indels, 80% for SVs). In this talk, I will highlight the NGS analytic pipeline we have developed and the recent discoveries that have emerged through its application to pediatric cancer genomes.
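As a rough sanity check on the sensitivity figure quoted above, the following Python sketch works out how many variant-supporting reads one would expect for a mono-allelic mutation present in 25% of cells at 30X depth. It is a back-of-the-envelope illustration, not the authors' method: it assumes a copy-neutral diploid locus, a fixed depth of exactly 30 reads, binomial sampling of alleles, and no sequencing or mapping error. The function names and the minimum-read thresholds are hypothetical.

```python
from math import comb


def expected_vaf(cell_fraction: float, mutant_copies: int = 1, total_copies: int = 2) -> float:
    """Expected variant allele fraction at a copy-neutral diploid locus (assumption)."""
    return cell_fraction * mutant_copies / total_copies


def prob_at_least(k: int, depth: int, vaf: float) -> float:
    """P(X >= k) for X ~ Binomial(depth, vaf): chance of sampling at least k variant reads."""
    below = sum(comb(depth, i) * vaf**i * (1 - vaf) ** (depth - i) for i in range(k))
    return 1.0 - below


depth = 30                      # assumed fixed per-site depth for 30X coverage
vaf = expected_vaf(0.25)        # mono-allelic mutation in 25% of cells -> 0.125
expected_reads = depth * vaf    # 3.75 variant reads on average

print(f"expected VAF = {vaf}, expected variant reads = {expected_reads}")
for min_reads in (2, 3, 4):     # hypothetical caller thresholds
    p = prob_at_least(min_reads, depth, vaf)
    print(f"P(at least {min_reads} variant reads) = {p:.3f}")
```

With fewer than four variant reads expected on average, the detection probability is sensitive to whatever minimum-read threshold a caller applies, which is consistent with the statement that sensitivity improves with greater read depth.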
In addition, I will point out some of the significant challenges that remain to be tackled in order for us to identify the full landscape and functional consequences of the somatic mutations in cancer.

Citation Format: {Authors}. {Abstract title} [abstract]. In: Proceedings of the 103rd Annual Meeting of the American Association for Cancer Research; 2012 Mar 31-Apr 4; Chicago, IL. Philadelphia (PA): AACR; Cancer Res 2012;72(8 Suppl):Abstract nr SY25-01. doi:1538-7445.AM2012-SY25-01
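The soft-clipped reads mentioned in the abstract can be illustrated with a small Python sketch. It parses SAM-style CIGAR strings, finds reads whose ends are soft-clipped, and tallies the reference positions where the aligned portion stops. This only shows the kind of signal that soft clips provide; it is not the CREST algorithm, and the function name, the `min_clip` and `min_support` thresholds, and the toy reads are all assumptions made for illustration.

```python
import re
from collections import Counter

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
REF_CONSUMING = set("MDN=X")  # CIGAR operations that advance along the reference


def clip_breakpoints(pos: int, cigar: str, min_clip: int = 10):
    """Return candidate breakpoints as (1-based reference position, clipped side).

    pos is the SAM POS field: the leftmost reference position of the aligned part.
    """
    ops = [(int(n), op) for n, op in CIGAR_OP.findall(cigar)]
    ops = [(n, op) for n, op in ops if op != "H"]  # ignore hard clips
    found = []
    if not ops:
        return found
    if ops[0][1] == "S" and ops[0][0] >= min_clip:
        # Leading soft clip: the alignment starts at pos, so the junction is just left of it.
        found.append((pos, "left"))
    if len(ops) > 1 and ops[-1][1] == "S" and ops[-1][0] >= min_clip:
        # Trailing soft clip: the junction follows the last aligned reference base.
        ref_len = sum(n for n, op in ops if op in REF_CONSUMING)
        found.append((pos + ref_len - 1, "right"))
    return found


# Toy reads as (POS, CIGAR); three of them stop aligning at the same base.
reads = [
    (1000, "70M30S"),
    (1010, "60M40S"),
    (1040, "30M45S"),
    (5000, "25S75M"),
    (1020, "100M"),    # fully aligned, no clip
    (1030, "5S95M"),   # clip shorter than min_clip, ignored
]

support = Counter(bp for pos, cigar in reads for bp in clip_breakpoints(pos, cigar))
min_support = 2  # hypothetical minimum number of clipped reads per site
candidates = sorted(bp for bp, n in support.items() if n >= min_support)
print(candidates)
```

In a tumor-normal comparison, the same tally would be computed for the matched normal sample, and sites with clipped-read support in the normal would be set aside as germ line events or mapping artifacts.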

Germline And Somatic Variant Identification Using Bgiseq-500 And Hiseq X Ten Whole Genome Sequencing

Whole Genome and Exome Sequencing Reference Datasets from a Multi-Center and Cross-Platform Benchmark Study

Cross-platform Comparisons for Targeted Bisulfite Sequencing of MGISEQ-2000 and NovaSeq6000

Whole Genome Paired End Sequencing Identifies Genomic Evolution In Myeloma.

Comprehensive detection of germline variants by MSK-IMPACT, a clinical diagnostic platform for solid tumor molecular oncology and concurrent cancer predisposition testing

Abstract 2628: Molecular Diagnosis for Pediatric Cancer Through Integrative Analysis of Whole-Genome, Whole-Exome and Transcriptome Sequencing

An Analysis of the Sensitivity of Proteogenomic Mapping of Somatic Mutations and Novel Splicing Events in Cancer

Abstract SY25-01: Analysis of Next-Generation Sequencing Data for Cancer Genomes: Challenges and Pitfalls

A New Massively Parallel Nanoball Sequencing Platform for Whole Exome Research

Establishing community reference samples, data and call sets for benchmarking cancer mutation detection using whole-genome sequencing

Large-scale analysis of whole genome sequencing data from formalin-fixed paraffin-embedded cancer specimens demonstrates preservation of clinical utility

Abstract 2946: Characterizing the genomic landscapes of breast and lung tumors using cost-effective whole genome sequencing

Toward best practice in cancer mutation detection with whole-genome and whole-exome sequencing

Somatic Point Mutation Calling in Low Cellularity Tumors

Prospective Clinical Sequencing of Adult Glioma.

MGA-seq: robust identification of extrachromosomal DNA and genetic variants using multiple genetic abnormality sequencing

Severus: accurate detection and characterization of somatic structural variation in tumor genomes using long reads

Improving somatic exome sequencing performance by biological replicates

Abstract 2934: Somatic variant workflow with HiFi sequencing provides new insights in highly challenging cancer cases

A Reference Human Genome Dataset of the BGISEQ-500 Sequencer

[Comparison of Different Massive Parallel Sequencing Platforms for Mutation Profiling in Formalin-Fixed and Paraffin-Embedded Samples].