Comprehensive analysis of full-length transcripts reveals aberrations of splicing variants in liver cancer

Hiroki Kiyose, Hidewaki Nakagawa, Atsushi Ono, Hiroshi Aikata, Masaki Ueno, Shinya Hayami, Hiroki Yamaue, Kazuaki Chayama, Mihoko Shimada, Jing Hao Wong, Akihiro Fujimoto
DOI: https://doi.org/10.1101/2021.06.28.450266
2021-06-30
Abstract: Genes generate various transcripts through alternative splicing, and these transcripts can have diverse functions. However, most transcriptome studies have used short-read sequencing technologies (next-generation sequencers), so full-length transcripts have not been observed directly. Although long-read sequencing technologies make it possible to sequence full-length transcripts, analyzing the resulting data is difficult. In the present study, we developed an analysis pipeline named SPLICE to analyze full-length cDNA sequences. Using this method, we analyzed cDNA sequences from 42 pairs of hepatocellular carcinoma (HCC) and matched non-cancerous liver samples sequenced with Oxford Nanopore technology. Our analysis detected 46,663 transcripts from protein-coding genes in the HCCs and the matched non-cancerous livers, of which 5,366 (11.5%) were novel. Comparison of expression levels identified 9,933 differentially expressed transcripts (DETs) in 4,744 genes. Importantly, 746 genes with DETs were not found by the gene-level analysis. We also identified novel exons derived from transposable elements (TEs). In the analysis of transcripts from hepatitis B virus (HBV), HBx-human TE fusions were found to be overexpressed in the HCCs. Furthermore, fusion gene detection revealed novel recurrent fusion events. These results suggest that long-read sequencing technologies allow full-length transcripts to be analyzed directly, and they show the importance of splicing variants in carcinogenesis.