Abstract:

Background: MicroRNAs (miRNAs) are short regulatory RNAs derived from longer precursor RNAs. miRNA biogenesis has been studied in animals and plants, and recent work has elucidated more complex aspects, such as non-conserved, species-specific, and heterogeneous miRNA precursor populations. Small RNA sequencing data can help in computationally identifying genomic loci of miRNA precursors. The challenge is to predict a valid miRNA precursor from inhomogeneous read coverage in a complex RNA library: while the mature miRNA typically produces many sequence reads, the remaining part of the precursor is covered very sparsely. Recent results suggest that alternative miRNA biogenesis pathways may lead to a more diverse miRNA precursor population than previously assumed. In plants, this diversity manifests itself, for example, in complex secondary structures and in expression from multiple loci within precursors. Current miRNA identification algorithms often depend on existing gene annotation and/or make use of specific miRNA precursor features such as precursor length and secondary structure. Consequently, and in view of the emerging understanding of a more complex miRNA biogenesis in plants, current tools may fail to characterise organism-specific and heterogeneous miRNA populations.

Results: miRA is a new tool to identify miRNA precursors in plants, allowing for heterogeneous and complex precursor populations. miRA requires small RNA sequencing data and a corresponding reference genome, and evaluates precursor secondary structures and precursor processing accuracy; key parameters can be adapted to the specific organism under investigation. We show that miRA outperforms the currently best plant miRNA prediction tools in both sensitivity and specificity, for data from Arabidopsis thaliana and the volvocine alga Chlamydomonas reinhardtii; the latter organism has been shown to exhibit a heterogeneous and complex precursor population with little cross-species miRNA sequence conservation, and therefore constitutes an ideal model organism. Furthermore, we identify novel miRNAs in the Chlamydomonas-related organism Volvox carteri.

Conclusions: We propose miRA, a new plant miRNA identification tool that is well adapted to complex precursor populations. miRA is particularly suited for organisms with no existing miRNA annotation, or without a known related organism with well-characterized miRNAs. Moreover, miRA has proven its ability to identify species-specific miRNAs. miRA is flexible in its parameter settings and produces user-friendly output files in various formats (pdf, csv, genome-browser-suitable annotation files, etc.). It is freely available at https://github.com/mhuttner/miRA.
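
The inhomogeneous read coverage described in the Background can be illustrated with a minimal Python sketch. It is not miRA's implementation; the function name `coverage`, the 80-nt precursor length, the read positions, and the read counts are all hypothetical values chosen only to show how a mature miRNA can dominate the reads mapped to a precursor.

```python
def coverage(length, reads):
    """Per-base read depth for reads given as half-open (start, end) intervals."""
    depth = [0] * length
    for start, end in reads:
        # Clip each read to the precursor boundaries before counting.
        for pos in range(max(start, 0), min(end, length)):
            depth[pos] += 1
    return depth


# Hypothetical 80-nt precursor (assumed values, not real data):
# 50 reads on a 21-nt mature miRNA, 2 reads elsewhere on the precursor.
reads = [(10, 31)] * 50 + [(50, 71)] * 2
depth = coverage(80, reads)

# Fraction of all covered bases that fall inside the mature miRNA region.
mature_fraction = sum(depth[10:31]) / sum(depth)
print(f"max depth: {max(depth)}, mature fraction: {mature_fraction:.3f}")
```

In this toy case nearly all coverage sits on the mature miRNA, while the rest of the precursor is covered by only a couple of reads, which is the situation a precursor predictor has to cope with.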

miWords: transformer-based composite deep learning for highly accurate discovery of pre-miRNA regions across plant genomes

miRNA Digger: a comprehensive pipeline for genome-wide novel miRNA mining

A Reversed Framework for the Identification of MicroRNA-Target Pairs in Plants.

Identification of novel microRNA-like-coding sites on the long-stem microRNA precursors in Arabidopsis.

High-throughput Degradome Sequencing Can Be Used to Gain Insights into MicroRNA Precursor Metabolism

Toward microRNA-mediated gene regulatory networks in plants.

MTide: An Integrated Tool for the Identification of miRNA-Target Interaction in Plants

Methodological framework for functional characterization of plant microRNAs.

PmliPred: a method based on hybrid model and fuzzy decision for plant miRNA-lncRNA interaction prediction.

A Bioinformatics Pipeline to Accurately and Efficiently Analyze the MicroRNA Transcriptomes in Plants.

miRDeep-P: A Computational Tool for Analyzing the MicroRNA Transcriptome in Plants

miRDeep-P2: Accurate and Fast Analysis of the MicroRNA Transcriptome in Plants

Expression-Based Functional Investigation of the Organ-Specific MicroRNAs in Arabidopsis

PmiRKB: A Plant MicroRNA Knowledge Base

miRA: adaptable novel miRNA identification in plants using small RNA sequencing data

Analyzing the microRNA Transcriptome in Plants Using Deep Sequencing Data.

miRPlant: an integrated tool for identification of plant miRNA from RNA sequencing data

Characterization of statistical features for plant microRNA prediction

Mirinho: An efficient and general plant and animal pre-miRNA predictor for genomic and deep sequencing data

miRe2e: a full end-to-end deep model based on transformers for prediction of pre-miRNAs

Advancing microRNA Target Site Prediction with Transformer and Base-Pairing Patterns