Robust Inductive Matrix Completion Strategy to Explore Associations Between LincRNAs and Human Disease Phenotypes

Ashis Kumer Biswas,Dong-Chul Kim,Mingon Kang,Jean X. Gao
DOI: https://doi.org/10.1109/TCBB.2018.2844816
2019-12-06
IEEE/ACM Transactions on Computational Biology and Bioinformatics
Abstract:Over the past few years, it has been established that a number of long intergenic non-coding RNAs (lincRNAs) are linked to a wide variety of human diseases. The relationship among many other lincRNAs still remains as puzzle. Validation of such link between the two entities through biological experiments is expensive. However, piles of information about the two are becoming available, thanks to the High Throughput Sequencing (HTS) platforms, Genome Wide Association Studies (GWAS), etc., thereby opening opportunity for cutting-edge machine learning and data mining approaches. However, there are only a few in silico lincRNA-disease association inference tools available to date, and none of these utilizes side information of both the entities. The recently developed Inductive Matrix Completion (IMC) technique provides a recommendation platform among two entities considering respective side information. But, the formulation of IMC is incapable of handling noise and outliers that may present in the dataset, while data sparsity consideration is another issue with the standard IMC method. Thus, a robust version of IMC is needed that can solve these two issues. As a remedy, in this paper, we propose Robust Inductive Matrix Completion (RIMC) using $\ell _{2,1}$ ℓ 2 , 1 norm loss function as well as $\ell _{2,1}$ ℓ 2 , 1 norm based regularization. We applied RIMC to the available association data between human lincRNAs and OMIM disease phenotypes as well as a diverse set of side information about the lincRNAs and the diseases. Our method performs better than the state-of-the-art methods in terms of
What problem does this paper attempt to address?