Transposable Element (TE) insertion predictions from RNAseq inputs and TE impact on RNA splicing and gene expression in Drosophila brain transcriptomes

Md Fakhrul Azad, Tong Tong, Nelson C. Lau
DOI: https://doi.org/10.1186/s13100-024-00330-z
2024-10-10
Mobile DNA
Abstract: Recent studies have suggested that Transposable Elements (TEs) residing in introns frequently splice into and alter primary gene-coding transcripts. To re-examine the exonization frequency of TEs into protein-coding gene transcripts, we re-analyzed a Drosophila neuron circadian rhythm RNAseq dataset and a deep long RNA fly midbrain RNAseq dataset using our Transposon Insertion and Depletion Analyzer (TIDAL) program. Our TIDAL results predicted several TE insertions from RNAseq data that were consistent with previously published studies. However, we also uncovered many discrepancies in TE-exonization calls, such as reads that mainly support intron retention of the TE and give little support for chimeric mRNA spliced to the TE. We then deployed rigorous genomic DNA-PCR (gDNA-PCR) and RT-PCR procedures on TE-mRNA fusion candidates to see how many of the bioinformatics predictions could be validated. By testing a w1118 strain from which the deeper long RNAseq data was derived and comparing it to an OreR strain, only 9 of 23 TIDAL candidates (< 40%) could be validated as a novel TE insertion by gDNA-PCR, indicating that deeper study is needed when using RNAseq data as inputs into current TE-insertion prediction programs. Of these validated calls, our RT-PCR results only supported TE-intron retention. Lastly, in the Dscam2 and Bx genes of the w1118 strain that contained intronic TEs, gene expression was 23 times higher than the OreR genes lacking the TEs. This study's validation approach indicates that chimeric TE-mRNAs are infrequent and cautions that more optimization is required in bioinformatics programs to call TE insertions using RNAseq datasets.
genetics & heredity