A New LSTM-Based Gene Expression Prediction Model: L-GEPM
Huiqing Wang,Chun Li,Jianhui Zhang,Jingjing Wang,Yue Ma,Yuanyuan Lian
DOI: https://doi.org/10.1142/S0219720019500227
2019-01-01
Journal of Bioinformatics and Computational Biology
Abstract:Molecular biology combined with in silico machine learning and deep learning has facilitated the broad application of gene expression profiles for gene function prediction, optimal crop breeding, disease-related gene discovery, and drug screening. Although the acquisition cost of genome-wide expression profiles has been steadily declining, the requirement generates a compendium of expression profiles using thousands of samples remains high. The Library of Integrated Network-Based Cellular Signatures (LINCS) program used approximately 1000 landmark genes to predict the expression of the remaining target genes by linear regression; however, this approach ignored the nonlinear features influencing gene expression relationships, limiting the accuracy of the experimental results. We herein propose a gene expression prediction model, L-GEPM, based on long short-term memory (LSTM) neural networks, which captures the nonlinear features affecting gene expression and uses learned features to predict the target genes. By comparing and analyzing experimental errors and fitting the effects of different prediction models, the LSTM neural network-based model, L-GEPM, can achieve low error and a superior fitting effect.