Abstract

Motivation: Long non-coding RNAs (lncRNAs) are generally expressed in a tissue-specific manner, and their subcellular localizations depend on the tissues or cell lines in which they are expressed. Previous computational methods for predicting lncRNA subcellular localization do not take this characteristic into account; they train a single machine learning model on lncRNAs pooled from all available cell lines. It is therefore important to develop a cell-line-specific computational method for predicting lncRNA locations in different cell lines.

Results: In this study, we present an updated cell-line-specific predictor, lncLocator 2.0, which trains an end-to-end deep model per cell line to predict lncRNA subcellular localization from sequence. We first construct benchmark datasets of lncRNA subcellular localizations for 15 cell lines. We then learn word embeddings using natural language models, and feed these learned embeddings into a convolutional neural network, a long short-term memory network and a multilayer perceptron to classify subcellular localizations. lncLocator 2.0 achieves varying effectiveness across cell lines, which demonstrates the necessity of training cell-line-specific models. Furthermore, we adopt Integrated Gradients to explain the proposed model in lncLocator 2.0 and find some potential patterns that determine the subcellular localizations of lncRNAs, suggesting that the subcellular localization of lncRNAs is linked to some specific nucleotides.

Availability and implementation: lncLocator 2.0 is available at www.csbio.sjtu.edu.cn/bioinf/lncLocator2 and the source code can be found at https://github.com/Yang-J-LIN/lncLocator2.
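The abstract describes learning word embeddings from sequences before classification. As a rough illustration of the preprocessing such a pipeline typically needs, the sketch below splits an lncRNA sequence into overlapping k-mer "words" and maps them to integer indices that an embedding layer could consume. The abstract does not specify the tokenization, so the k-mer scheme, the value k = 3, and the function names `kmer_tokens`, `build_vocab` and `encode` are all assumptions made for illustration; this is not the lncLocator 2.0 implementation.

```python
from collections import Counter
from typing import Dict, Iterable, List


def kmer_tokens(seq: str, k: int = 3) -> List[str]:
    """Split a sequence into overlapping k-mers (k = 3 is an assumed value)."""
    # Normalise case and map RNA 'U' to 'T' so RNA and cDNA inputs agree.
    seq = seq.upper().replace("U", "T")
    # Sequences shorter than k yield an empty list.
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def build_vocab(seqs: Iterable[str], k: int = 3) -> Dict[str, int]:
    """Assign an integer index to every k-mer observed in the given sequences."""
    counts = Counter(tok for s in seqs for tok in kmer_tokens(s, k))
    # Index 0 is reserved for padding and unseen k-mers.
    return {tok: i + 1 for i, tok in enumerate(sorted(counts))}


def encode(seq: str, vocab: Dict[str, int], k: int = 3) -> List[int]:
    """Convert a sequence into k-mer indices; unseen k-mers map to 0."""
    return [vocab.get(tok, 0) for tok in kmer_tokens(seq, k)]


if __name__ == "__main__":
    training = ["ACGUA"]  # toy sequence, not real data
    vocab = build_vocab(training)
    print(kmer_tokens("ACGUA"))
    print(encode("ACGTA", vocab))
```

In a cell-line-specific setting, one vocabulary could be shared across cell lines while a separate classifier is trained on each cell line's labelled sequences; that split is likewise an assumption here rather than a detail stated in the abstract.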

Evaluation of machine learning models that predict lncRNA subcellular localization

LncLSTA: A Versatile Predictor Unveiling Subcellular Localization of Lncrnas Through Long-Short Term Attention

Data from Dissecting LncRNA Roles in Renal Cell Carcinoma Metastasis and Characterizing Genomic Heterogeneity by Single-Cell RNA-seq

CytoLNCpred - A computational method for predicting cytoplasm associated long non-coding RNAs in 15 cell-lines

A Deep Learning Approach to LncRNA Subcellular Localization Using Inexact q-mers

A review on predicting subcellular localization of lncRNA

Prediction of LncRNA Subcellular Localization with Deep Learning from Sequence Features

A BERT-based model for the prediction of lncRNA subcellular localization in Homo sapiens

Predictive models of subcellular localization of long RNAs

lncLocator 2.0: a cell-line-specific subcellular localization predictor for long non-coding RNAs with interpretable deep learning

Towards a better prediction of subcellular location of long non-coding RNA

lncLocPred: Predicting LncRNA Subcellular Localization Using Multiple Sequence Feature Information

LncSTPred: a predictive model of lncRNA subcellular localization and decipherment of the biological determinants influencing localization

Predicting LncRNA Subcellular Localization Using Unbalanced Pseudo-k Nucleotide Compositions

LncLocation: Efficient Subcellular Location Prediction of Long Non-Coding RNA-Based Multi-Source Heterogeneous Feature Fusion

The lncLocator: a subcellular localization predictor for long non-coding RNAs based on a stacked ensemble classifier

KD-KLNMF: Identification of lncRNAs subcellular localization with multiple features and nonnegative matrix factorization

Expression Levels of Lncrnas Are Prognostic for Hepatocellular Carcinoma Overall Survival.