Abstract

Background: As a common and abundant RNA methylation modification, N6-methyladenosine (m6A) is widespread in the transcriptomes of various species and is closely related to the occurrence and development of various life processes and diseases. Accurate identification of m6A methylation sites has therefore become a hot topic. Most biological methods rely on high-throughput sequencing technology, which places great demands on sequencing library preparation and data analysis. Various machine learning methods have thus been proposed that extract several types of sequence-based features and then employ conventional classifiers, such as SVM and RF, for m6A methylation site identification. However, identification performance relies heavily on the extracted features, which still need to be improved.

Results: This paper studies feature extraction and classification of m6A methylation sites from a natural language processing perspective, integrating feature extraction and classification in a single model while taking into account the upstream and downstream information of m6A sites. One-hot encoding, RNA word embedding, and Word2vec are adopted to depict sites from the perspectives of the base as well as its upstream and downstream sequence. A BiLSTM model, a well-known sequence model, was then constructed to discriminate sequences with potential m6A sites. Since the three feature extraction methods focus on different perspectives of m6A sites, an ensemble deep learning predictor (EDLm6APred) was finally constructed for m6A site prediction. Experimental results on human and mouse data sets show that EDLm6APred outperforms each of the single-feature models, indicating that base, upstream, and downstream information are all essential for m6A site detection. Compared with existing m6A methylation site prediction models without genomic features, EDLm6APred obtains an area under the receiver operating characteristic curve of 86.6% on the human data sets, indicating the effectiveness of sequential modeling on RNA. To maximize user convenience, a webserver was developed as an implementation of EDLm6APred and made publicly available at www.xjtlu.edu.cn/biologicalsciences/EDLm6APred.

Conclusions: Our proposed EDLm6APred method is a reliable predictor for m6A methylation sites.
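As a rough illustration of the sequence encodings named in the abstract, the following Python sketch shows one-hot encoding of an RNA fragment and its split into overlapping k-mer "words" of the kind a word-embedding or Word2vec model could consume. The function names, the choice of k = 3, and the averaging rule used to combine model outputs are assumptions made for illustration only; the abstract does not specify these details, and the BiLSTM classifier itself is omitted.

```python
from typing import List

BASES = "ACGU"  # RNA alphabet; DNA-style 'T' is mapped to 'U' below


def normalize(seq: str) -> str:
    """Upper-case the sequence and treat T as U."""
    return seq.upper().replace("T", "U")


def one_hot(seq: str) -> List[List[int]]:
    """Encode each base as a 4-dimensional indicator vector.

    Bases outside A/C/G/U (for example N) become all-zero vectors.
    """
    return [[1 if b == base else 0 for base in BASES] for b in normalize(seq)]


def kmer_words(seq: str, k: int = 3) -> List[str]:
    """Slide a window of width k (stride 1) to get overlapping k-mer 'words'.

    k = 3 is an illustrative assumption, not a value taken from the paper.
    """
    s = normalize(seq)
    return [s[i:i + k] for i in range(len(s) - k + 1)]


def average_ensemble(probabilities: List[float]) -> float:
    """Combine per-model site probabilities by simple averaging.

    Averaging is a hypothetical combination rule used here for illustration.
    """
    if not probabilities:
        raise ValueError("need at least one probability")
    return sum(probabilities) / len(probabilities)


if __name__ == "__main__":
    fragment = "GGACU"  # toy fragment with the candidate adenosine in the centre
    print(one_hot(fragment))
    print(kmer_words(fragment))
    print(average_ensemble([0.5, 0.25, 0.75]))
```

In this sketch, each encoding would feed its own sequence model, and the three resulting probabilities would be merged into a single score for the candidate site.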

m6Aminer: Predicting the m6Am Sites on mRNA by Fusing Multiple Sequence-Derived Features into a CatBoost-Based Classifier

TargetM6A: Identifying N6-Methyladenosine Sites from RNA Sequences Via Position-Specific Nucleotide Propensities and a Support Vector Machine

M6AMRFS: Robust Prediction of N6-Methyladenosine Sites with Sequence-Based Features in Multiple Species.

A Combined Deep Learning Framework for Mammalian m6A Site Prediction

EDLm6APred: ensemble deep learning approach for mRNA m6A site prediction

DLm6Am: A Deep-Learning-Based Tool for Identifying N6,2'-O-Dimethyladenosine Sites in RNA Sequences

Identifying N6-methyladenosine Sites Using Extreme Gradient Boosting System Optimized by Particle Swarm Optimizer.

EMDL_m6Am: identifying N6,2'-O-dimethyladenosine sites based on stacking ensemble deep learning

M6APred-EL: A Sequence-Based Predictor for Identifying N6-methyladenosine Sites Using Ensemble Learning

m6AGE: A Predictor for N6-Methyladenosine Sites Identification Utilizing Sequence Characteristics and Graph Embedding-Based Geometrical Information

SRAMP: Prediction of Mammalian N6-Methyladenosine (m6A) Sites Based on Sequence-Derived Features

Integration of deep feature representations and handcrafted features to improve the prediction of N6-methyladenosine sites.

Comprehensive Review and Assessment of Computational Methods for Prediction of N6-Methyladenosine Sites

MST-m6A: A Novel Multi-Scale Transformer-based Framework for Accurate Prediction of m6A Modification Sites Across Diverse Cellular Contexts

Deepm6A-MT: A deep learning-based method for identifying RNA N6-methyladenosine sites in multiple tissues

Pm6A: an Integrated Classification Algorithm for Identifying m6A Sites (2021 IEEE International Conference on Bioinformatics and Biomedicine, BIBM)

TS-m6A-DL: Tissue-specific Identification of N6-methyladenosine Sites Using a Universal Deep Learning Model.

BLAM6A-Merge: Leveraging Attention Mechanisms and Feature Fusion Strategies to Improve the Identification of RNA N6-methyladenosine Sites

AI-m6ARS: Machine learning-driven m6A RNA methylation site discovery with integrated sequence, conservation, and geographical descriptors

LITHOPHONE: Improving lncRNA Methylation Site Prediction Using an Ensemble Predictor.

HSM6AP: a high-precision predictor for the Homo sapiens N6-methyladenosine (m6A) based on multiple weights and feature stitching.