Application of annotation-agnostic RNA sequencing data analysis tools for biomarker discovery in liquid biopsy

Gabriel Wajnberg, Eric P. Allain, Jeremy W. Roy, Shruti Srivastava, Daniel Saucier, Pier Morin, Alier Marrero, Colleen O’Connell, Anirban Ghosh, Stephen M. Lewis, Rodney J. Ouellette, Nicolas Crapoulet
DOI: https://doi.org/10.3389/fbinf.2023.1127661
2023-04-29
Frontiers in Bioinformatics
Abstract: RNA sequencing analysis is an important field in the study of extracellular vesicles (EVs), as these particles contain a variety of RNA species that may have diagnostic, prognostic and predictive value. Many of the bioinformatics tools currently used to analyze EV cargo rely on third-party annotations. Recently, analysis of unannotated expressed RNAs has become of interest, since these may provide complementary information to traditional annotated biomarkers or may help refine biological signatures used in machine learning by including unknown regions. Here we perform a comparative analysis of annotation-free and classical read-summarization tools for the analysis of RNA sequencing data generated for EVs isolated from persons with amyotrophic lateral sclerosis (ALS) and healthy donors. Differential expression analysis and digital-droplet PCR validation of unannotated RNAs also confirmed their existence and demonstrated the usefulness of including such potential biomarkers in transcriptome analysis. We show that find-then-annotate methods perform similarly to standard tools for the analysis of known features, and can also identify unannotated expressed RNAs, two of which were validated as overexpressed in ALS samples. We demonstrate that these tools can therefore be used for a stand-alone analysis or easily integrated into current workflows and may be useful for re-analysis as annotations can be integrated post hoc.