Abstract: MicroRNAs play a crucial role in post-transcriptional regulation, influencing over 60% of human protein-coding genes by targeting specific mRNA sites to suppress protein translation. Various predictive algorithms aim to identify potential microRNA-mRNA pairs. Current approaches primarily employ sequence alignment, machine learning, and deep learning, yet they face challenges such as complex data preprocessing, time-consuming model generation, and limited binding-site precision. Additionally, some methods rely on inefficient RNN-based models, which makes prediction slow. To address these issues, we propose a CNN-based algorithm combined with transfer learning that predicts microRNA-mRNA binding sites directly and precisely, without extensive preprocessing. We introduce two models in this study: the per-based model and the miRNA-target binding decision model. The former screens potential target sites on 3'-UTR sequences, while the latter guides the binding decision for miRNA-target pairs. The per-based model was built on 786,447 human microRNA-mRNA pairs verified by CLIP-seq and extracted from a public database. Sequence alignment was used to determine putative binding sites and per-base binding states. MicroRNA sequences and seed regions served as the initial convolutional kernels in the deep learning model, and encoded full-length 3'-UTR sequences of mRNAs served as inputs to the fine-tuned U-Net architecture. For the miRNA-target binding decision model, we removed the per-based model's decoder and initialized a new classification layer for target-site prediction. From experimentally validated datasets in public databases, we extracted 2,846 binding and 1,058 non-binding human microRNA-mRNA pairs, and the pre-trained model was fine-tuned on these 3,904 pairs. Both models were trained with a cyclical learning rate, focal loss, gradient clipping, and weight decay to address dataset imbalances.
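Focal loss, one of the training techniques named above, can be sketched as follows for the binary per-base case (bound vs. unbound nucleotide). This is a minimal illustration, not the authors' implementation: the function name `focal_loss` is hypothetical, and the values `gamma=2.0` and `alpha=0.25` are the commonly used defaults from the original focal loss formulation, not settings reported in the abstract.

```python
import math


def focal_loss(probs, labels, gamma=2.0, alpha=0.25, eps=1e-7):
    """Mean binary focal loss: -alpha_t * (1 - p_t)**gamma * log(p_t).

    probs  : predicted probabilities of the positive (binding) class
    labels : ground-truth labels, 1 = binding, 0 = non-binding
    gamma, alpha : assumed hyperparameters (not given in the abstract)
    """
    if len(probs) != len(labels) or not probs:
        raise ValueError("probs and labels must be non-empty and equal length")

    total = 0.0
    for p, y in zip(probs, labels):
        # Clamp to avoid log(0) on saturated predictions.
        p = min(max(p, eps), 1.0 - eps)
        # p_t is the probability assigned to the true class.
        p_t = p if y == 1 else 1.0 - p
        alpha_t = alpha if y == 1 else 1.0 - alpha
        # The (1 - p_t)**gamma factor shrinks the loss of easy examples.
        total += -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)
    return total / len(probs)


# A confidently correct prediction contributes far less than a wrong one.
easy = focal_loss([0.9], [1])
hard = focal_loss([0.1], [1])
print(easy < hard)
```

With `gamma=0` the modulating factor disappears and the expression reduces to class-weighted cross-entropy, which is why larger `gamma` values shift training effort toward hard, typically minority-class, positions.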
The dataset was split into 80% training and 20% test data, with balanced accuracy as the evaluation metric. The per-based model achieved 82.14% balanced accuracy on the test data, handling the class imbalance inherent in per-base tasks, and identified nucleotide binding states quickly and accurately. The miRNA-target binding decision model outperformed existing methods with a balanced accuracy of 80.39% on the test data. Unlike many deep learning methods that require additional preprocessing, our algorithm predicts per-base binding states directly from full-length sequences and transfers knowledge from the per-based model to the miRNA-target model. Importantly, our approach relies solely on seed regions, eliminating the need for prior knowledge and enhancing the reliability of microRNA target prediction. This advance holds promise for biomedical and clinical researchers.

Citation Format: Chen-Hao Peng, Hui-Yu Chen, Da-Chuan Cheng, Eric Y. Chuang, Chien-Yueh Lee. A CNN-based approach with efficient transfer learning improves microRNA-mRNA prediction [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2024; Part 1 (Regular Abstracts); 2024 Apr 5-10; San Diego, CA. Philadelphia (PA): AACR; Cancer Res 2024;84(6_Suppl):Abstract nr 3513.
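Balanced accuracy, the evaluation metric reported in the abstract, is the mean of sensitivity and specificity in the binary case. The sketch below shows why it is preferred over plain accuracy on imbalanced data. The function name `balanced_accuracy` is hypothetical, and the example applies the abstract's full pair counts (2,846 binding, 1,058 non-binding) purely for illustration; it does not reproduce the actual test split.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of recall on the positive class and recall on the negative class."""
    tp = fn = tn = fp = 0
    for t, p in zip(y_true, y_pred):
        if t == 1 and p == 1:
            tp += 1
        elif t == 1:
            fn += 1
        elif p == 0:
            tn += 1
        else:
            fp += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    sensitivity = tp / (tp + fn)  # recall on binding pairs
    specificity = tn / (tn + fp)  # recall on non-binding pairs
    return 0.5 * (sensitivity + specificity)


# Illustration: a trivial classifier that labels every pair as "binding".
y_true = [1] * 2846 + [0] * 1058
y_pred = [1] * 3904
plain_accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(plain_accuracy, balanced_accuracy(y_true, y_pred))
```

On these counts the trivial classifier reaches a plain accuracy of roughly 72.9% while its balanced accuracy is exactly 50%, so the metric does not reward simply predicting the majority class.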
