Abstract: Gene fusions are important biomarkers for cancer diagnosis, subtype classification and therapeutic decision-making. While fusion detection using RNA-seq data has become standard practice, existing computational methods primarily focus on identifying canonical exon-to-exon fusions. However, more complex events such as multi-partner fusions, truncations, enhancer hijacking and internal tandem duplications (ITDs) can also lead to abnormal function or aberrant transcription of cancer driver genes. To aid discovery of complex and diverse driver fusions, we developed CICERO (CICERO Is Clipping Extended for RNA Optimization), a local assembly-based algorithm that integrates RNA-seq reads bearing aberrant mapping signatures with extensive annotation for ranking candidate fusions. Our benchmark data set, designed to support the main application of RNA-seq fusion analysis, consists of 184 driver fusions from 170 pediatric leukemias, solid tumors and brain tumors, detected by paired tumor-normal WGS and orthogonally validated by capture sequencing, RT-PCR and/or FISH. CICERO detected 95% of these fusions with an average ranking of 1.9, whereas ChimeraScan, deFuse, FusionCatcher and STAR-Fusion detected only 63%, 66%, 77% and 63% with an average ranking of 37.0, 9.0, 18.1 and 4.4, respectively. Notably, events such as ITDs and rearrangements involving the highly repetitive IGH locus were detected almost exclusively by CICERO. Our re-analysis of 167 RNA-seq data sets from the TCGA Glioblastoma Multiforme (GBM) cohort unveiled 158 fusions of cancer genes that were not reported previously. These include kinase fusions (KLHL7-BRAF), ITD of the EGFR kinase domain and a 13% prevalence of EGFR C-terminal truncation compared to the 6% reported by the TCGA Network.
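The two benchmark statistics quoted above (percentage of driver fusions detected and average ranking) can be illustrated with a minimal Python sketch. The abstract does not specify how the average ranking was computed; this sketch assumes it is the mean 1-based position of each detected truth fusion in a caller's ranked output for that sample, taken over detected fusions only. The sample IDs, fusion labels and function names are hypothetical placeholders, not CICERO output or benchmark data.

```python
from typing import Dict, List, Optional, Tuple

# (sample_id, fusion_label); the key format is an assumption for illustration
TruthFusion = Tuple[str, str]


def rank_of_truth(ranked_calls: List[str], truth: str) -> Optional[int]:
    """Return the 1-based rank of `truth` in a ranked call list, or None if absent."""
    for rank, call in enumerate(ranked_calls, start=1):
        if call == truth:
            return rank
    return None


def summarize(truth_set: List[TruthFusion],
              calls_by_sample: Dict[str, List[str]]) -> Tuple[float, float]:
    """Return (detection rate, mean rank over detected truth fusions)."""
    ranks = []
    for sample_id, fusion in truth_set:
        rank = rank_of_truth(calls_by_sample.get(sample_id, []), fusion)
        if rank is not None:
            ranks.append(rank)
    detection_rate = len(ranks) / len(truth_set)
    # Assumption: undetected fusions do not contribute to the average ranking.
    mean_rank = sum(ranks) / len(ranks) if ranks else float("nan")
    return detection_rate, mean_rank


# Toy inputs with placeholder labels (not real benchmark data).
toy_truth = [("S1", "GENE_A-GENE_B"), ("S2", "GENE_C-GENE_D"),
             ("S3", "GENE_E-GENE_F"), ("S4", "GENE_G-GENE_H")]
toy_calls = {
    "S1": ["GENE_A-GENE_B", "GENE_X-GENE_Y"],
    "S2": ["GENE_X-GENE_Y", "GENE_Z-GENE_W", "GENE_C-GENE_D"],
    "S3": ["GENE_E-GENE_F"],
    "S4": ["GENE_X-GENE_Y"],  # truth fusion missed in this sample
}

rate, mean_rank = summarize(toy_truth, toy_calls)
print(f"detected {rate:.0%} of truth fusions, mean rank {mean_rank:.2f}")
```

Under this reading, a caller is rewarded both for finding a validated fusion and for placing it near the top of its candidate list; a different convention (for example, penalizing missed fusions with a maximal rank) would give different averages.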
CICERO has greatly improved our ability to discover non-canonical fusions that are overlooked by existing fusion detection methods, and has been used to analyze >2,000 RNA-seq samples generated by the two largest pediatric cancer genomics initiatives: the St. Jude/Washington University Pediatric Cancer Genome Project (PCGP) and the Therapeutically Applicable Research to Generate Effective Treatments (TARGET) project. We anticipate that CICERO will also improve fusion analysis for adult cancer RNA-seq data, as demonstrated through our re-analysis of TCGA-GBM and our recent discovery of a MAP3K8 C-terminal truncation fusion in 2% of TCGA melanoma samples. CICERO is accessible via standard (https://github.com/stjude/Cicero) or cloud-based (https://platform.stjude.cloud/tools/rapid_rna-seq) implementations. To further improve accuracy, fusions predicted by CICERO can be curated by FusionEditor (https://proteinpaint.stjude.org/FusionEditor/), an interactive viewer allowing inspection of protein domains involved in the fusion and the gene expression status of fusion-positive samples.

Citation Format: Liqing Tian, Yongjin Li, Michael N. Edmonson, Xin Zhou, Scott Newman, Clay McLeod, Yu Liu, Bo Tang, Michael C. Rusch, John Easton, Jing Ma, Austyn Trull, J. Robert Michael, Andrew Thrasher, Charles Mullighan, Suzanne J. Baker, James R. Downing, David W. Ellison, Jinghui Zhang. CICERO: An accurate method for detecting complex and diverse driver fusions using cancer transcriptome sequencing (RNA-seq) data [abstract]. In: Proceedings of the Annual Meeting of the American Association for Cancer Research 2020; 2020 Apr 27-28 and Jun 22-24. Philadelphia (PA): AACR; Cancer Res 2020;80(16 Suppl):Abstract nr 5478.

Comprehensive Evaluation of Fusion Transcript Detection Algorithms and a Meta-Caller to Combine Top Performing Methods in Paired-End RNA-seq Data.

A Novel Analytical Strategy To Identify Fusion Transcripts Between Repetitive Elements And Protein Coding-Exons Using RNA-Seq

FusionQ: a Novel Approach for Gene Fusion Detection and Quantification from Paired-End RNA-Seq.

An optimized workflow of full-length transcriptome sequencing for accurate fusion transcript identification

SOAPfuse: an Algorithm for Identifying Fusion Transcripts from Paired-End RNA-Seq Data

SeekFusion - A Clinically Validated Fusion Transcript Detection Pipeline for PCR-Based Next-Generation Sequencing of RNA

Comprehensive Assessment of Isoform Detection Methods for Third-Generation Sequencing Data

Abstract 7418: LongFuse: Detecting gene fusion transcripts from high throughput long-read single cell RNA sequencing data

CTAT-LR-fusion: accurate fusion transcript identification from long and short read isoform sequencing at bulk or single cell resolution

LongGF: Computational Algorithm and Software Tool for Fast and Accurate Detection of Gene Fusions by Long-Read Transcriptome Sequencing

Abstract LB-212: FCRF: an Efficient Algorithm for Detecting Circular Fusion Transcript from RNA-Seq Data

GeneScissors: a comprehensive approach to detecting and correcting spurious transcriptome inference owing to RNA-seq reads misalignment.

1148P Identification and Validation of RET Fusions in Lung Adenocarcinoma Through DNA and RNA Sequencing

Fusion Transcripts And Transcribed Retrotransposed Loci Discovered Through Comprehensive Transcriptome Analysis Using Paired-End Ditags (PETs)

Fcirc: A Comprehensive Pipeline for the Exploration of Fusion Linear and Circular RNAs

Abstract 465: Development and validation of a targeted RNA-Seq assay for gene fusion detection and expression quantification in FFPE samples

IFDlong: an isoform and fusion detector for accurate annotation and quantification of long-read RNA-seq data

A comprehensive benchmarking of differential splicing tools for RNA-seq analysis at the event level

CICERO: An Accurate Method For Detecting Complex And Diverse Driver Fusions Using Cancer Transcriptome Sequencing (RNA-Seq) Data

Single-cell gene fusion detection by scFusion

Comparison of four next generation sequencing platforms for fusion detection: Oncomine by ThermoFisher, AmpliSeq by Illumina, FusionPlex by ArcherDX, and QIAseq by QIAGEN