LncRNA-Protein Interaction Prediction Based on Regularized Nonnegative Matrix Factorization and Sequence Information

Da Xu, Hanxiao Xu, Yusen Zhang, Wei Chen, Rui Gao
2021-01-01
Abstract: lncRNAs affect the expression of nearby protein-coding genes and interact with related RNA-binding proteins to exert their functions. New computational models are needed to reduce the cost and time of biological experiments and to select the most promising lncRNA-protein pairs for experimental validation. In this work, we propose a novel model called LPI-RNMF that identifies lncRNA-protein interactions (LPIs) using a new regularized nonnegative matrix factorization (RNMF) algorithm. First, LPI-RNMF builds integrated lncRNA and protein similarity matrices from a sequence-based normalized Smith-Waterman score and a Gaussian interaction profile kernel derived from the known lncRNA-protein association matrix. Then, a new regularized nonnegative matrix factorization algorithm is proposed and used to predict potential interactions. In 5-fold cross-validation experiments on the benchmark data set, the AUC value is 0.9102 and the AUPR value is 0.7245. In addition, under leave-one-out cross-validation (LOOCV), the AUC value is 0.9210. In comparison, these results are significantly higher than those of the other methods considered. Moreover, case studies and a test on a novel data set also demonstrate the stable performance of our method. These experimental results suggest that LPI-RNMF is a useful tool for predicting unknown lncRNA-protein interactions.
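To make two of the building blocks named in the abstract concrete, the following is a minimal sketch in Python and not the authors' implementation. It assumes the commonly used form of the Gaussian interaction profile kernel, K(i, j) = exp(-gamma * ||Y_i - Y_j||^2) with gamma set from the mean squared row norm, and it substitutes plain unregularized NMF with multiplicative updates for the paper's RNMF, whose regularization terms are not given in the abstract. The function names `gip_kernel` and `nmf_multiplicative`, the parameter `gamma_prime`, the rank, and the toy matrix `Y` are all hypothetical and chosen only for illustration.

```python
import numpy as np


def gip_kernel(Y, gamma_prime=1.0):
    """Gaussian interaction profile kernel over the rows of Y.

    Assumed form: K[i, j] = exp(-gamma * ||Y[i] - Y[j]||^2), where
    gamma = gamma_prime / mean_i ||Y[i]||^2. The bandwidth choice is an
    assumption, not taken from the abstract.
    """
    Y = np.asarray(Y, dtype=float)
    sq_norms = (Y ** 2).sum(axis=1)
    mean_sq = sq_norms.mean()
    if mean_sq == 0:
        # No known interactions at all: fall back to the identity.
        return np.eye(Y.shape[0])
    gamma = gamma_prime / mean_sq
    # Pairwise squared Euclidean distances between interaction profiles.
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (Y @ Y.T)
    sq_dists = np.maximum(sq_dists, 0.0)  # guard against round-off
    return np.exp(-gamma * sq_dists)


def nmf_multiplicative(Y, rank, n_iter=200, eps=1e-9, seed=0):
    """Plain NMF (Y ~ W @ H) with multiplicative updates.

    This is the unregularized baseline, used here as a stand-in for the
    regularized variant the abstract refers to.
    """
    rng = np.random.default_rng(seed)
    m, n = Y.shape
    W = rng.random((m, rank))
    H = rng.random((rank, n))
    for _ in range(n_iter):
        H *= (W.T @ Y) / (W.T @ W @ H + eps)
        W *= (Y @ H.T) / (W @ H @ H.T + eps)
    return W, H


# Toy association matrix: 4 lncRNAs (rows) x 3 proteins (columns).
Y = np.array([[1, 0, 1],
              [1, 0, 0],
              [0, 1, 0],
              [0, 1, 1]], dtype=float)

K_lnc = gip_kernel(Y)        # lncRNA-lncRNA similarity
K_prot = gip_kernel(Y.T)     # protein-protein similarity
W, H = nmf_multiplicative(Y, rank=2)
scores = W @ H               # predicted interaction scores
```

In this sketch, entries of `scores` at positions where `Y` is zero would serve as candidate interactions to rank. A regularized variant would add penalty terms, for example ones built from `K_lnc` and `K_prot`, to the objective and adjust the update rules accordingly.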