Pan-cancer study of heterogeneous RNA aberrations
Nuno A. Fonseca, André Kahles, Kjong-Van Lehmann, Claudia Calabrese, A. Chateigner, Natalie R Davidson, Deniz Demircioğlu, Yao He, Fabien C. Lamaze, Siliang Li, Dongbing Liu, Fenglin Liu, M. Perry, Hong Su, Linda Xiang, Junjun Zhang, Samirkumar Amin, Peter Bailey, Brian Craft, Milana Frenkel-Morgenstern, Mary Goldman, Liliana Greger, Katherine A. Hoadley, Yong Hou, Ekta Khurana, Jan O. Korbel, Chang Li, Xiaobo Li, Xinyue Li, Xingmin Liu, Youyong Lu, Maximillian G Marin, Matthew Meyerson, Tannistha Nandi, Morten Muhlig Nielsen, Akinyemi I. Ojesina, B. F. Francis Ouellette, Qiang Pan-Hammarström, Chandra Sekhar Pedamallu, Jakob Skou Pedersen, Yuichi Shiraishi, Reiner Siebert, Cameron M Soulette, Stefan Stark, Patrick Tan, Bin Tean Teh, Alfonso Valencia, Jian Wang, Rui Xing, Heng Xiong, Sergei Yakneen, Chen Ye, Christina K. Yung, Fan Zhang, Xiuqing Zhang, Liangtao Zheng, Jingchun Zhu, Shida Zhu, Chad J. Creighton, Jonathan Göke, Roland F. Schwarz, Oliver Stegle, Kui Wu, Huangming Yang, Zemin Zhang, Alvis Brazma, Gunnar Rätsch, Angela N. Brooks
DOI: https://doi.org/10.1101/183889
2018-01-01
bioRxiv
Abstract: Pan-cancer studies have transformed our understanding of recurrent somatic mutations that contribute to cancer pathogenesis; however, the multiple mechanisms by which genes can be somatically altered, particularly at the transcriptome level, have yet to be fully investigated. We present the most comprehensive catalog of cancer-associated gene alterations through extensive characterization of tumor transcriptomes from 1,188 donors of the Pan-Cancer Analysis of Whole Genomes (PCAWG) project with matched whole-genome sequence data. We processed the RNA-Seq data with a unified analysis pipeline that included both sequence alignment and extensive quality control. We then identified gene alterations through analysis of gene expression, alternative splicing, alternative transcription starts, allele-specific expression, RNA-edited sites, and gene fusions, and through comparison with RNA-Seq data from a panel of normal tissue samples from the Genotype-Tissue Expression (GTEx) project. Our data represent an extensive pan-cancer catalog of RNA-level aberrations for each gene and will be the basis for further analyses within PCAWG. NOTE TO READERS: This is a draft of a marker paper from the PCAWG Transcriptome Working Group and is intended to describe technical aspects of RNA-Seq analysis associated with the PCAWG project. The full marker paper is currently in preparation.