Pan-cancer Identification of the Relationship of Metabolism-Related Differentially Expressed Transcription Regulation with Non-Differentially Expressed Target Genes Via a Gated Recurrent Unit Network.

Meiyu Duan, Yueying Wang, Ya Qiao, Yangyang Wang, Xingyuan Pan, Zhuyu Hu, Yanyue Ran, Xian Fu, Yusi Fan, Lan Huang, Fengfeng Zhou
DOI: https://doi.org/10.1016/j.compbiomed.2022.105883
IF: 7.7
2022-01-01
Computers in Biology and Medicine
Abstract: The transcriptome describes the expression of all genes in a sample. Most studies have investigated the differential patterns or discrimination powers of transcript expression levels. In this study, we hypothesized that the quantitative correlations between the expression levels of transcription factors (TFs) and their regulated target genes (mRNAs) serve as a novel view of healthy status, and that a disease sample exhibits a differential landscape (mqTrans) of transcription regulations compared with healthy status. We formulated the quantitative transcription regulation relationships of metabolism-related genes as a multi-input multi-output regression model via a gated recurrent unit (GRU) network. The GRU model was trained on healthy blood transcriptomes, and the expression levels of mRNAs were predicted from those of the TFs. The mqTrans feature of a gene was defined as the difference between its predicted and actual expression levels. A pan-cancer investigation of the differentially expressed mqTrans features was conducted between early- and late-stage cancers in 26 cancer types from The Cancer Genome Atlas database. This study focused on the differentially expressed mqTrans features whose actual expression levels did not show differential expression. Such genes cannot be detected by conventional differential analysis, and these dark biomarkers are worthy of further wet-lab investigation. The experimental data also showed that the proposed mqTrans investigation improved the classification between early- and late-stage samples for some cancer types. Thus, the mqTrans features serve as a complementary view to transcriptomes, an OMIC type with mature high-throughput production technologies and abundant public resources.
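The definition of an mqTrans feature and the notion of a dark biomarker can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the predicted values stand in for the output of the trained GRU model, the sign convention (predicted minus actual) is an assumption because the abstract only says "difference", and the function names, the use of Welch's t statistic, and the cut-off `t_cut` are hypothetical choices made for illustration.

```python
from math import sqrt
from statistics import mean, variance


def mqtrans(predicted, actual):
    """Per-sample mqTrans value of one gene.

    Assumption: predicted minus actual; the abstract does not fix the sign.
    """
    return [p - a for p, a in zip(predicted, actual)]


def welch_t(group_a, group_b):
    """Welch's t statistic for two independent samples (unequal variances)."""
    se = sqrt(variance(group_a) / len(group_a) + variance(group_b) / len(group_b))
    return (mean(group_a) - mean(group_b)) / se


def is_dark_biomarker(early_actual, early_pred, late_actual, late_pred, t_cut=2.0):
    """Hypothetical rule: the mqTrans feature differs between stages
    while the actual expression level does not."""
    t_mq = welch_t(mqtrans(early_pred, early_actual),
                   mqtrans(late_pred, late_actual))
    t_expr = welch_t(early_actual, late_actual)
    return abs(t_mq) >= t_cut and abs(t_expr) < t_cut


# Toy values for a single gene (invented for illustration, not real data).
early_actual = [5.0, 5.2, 4.8, 5.1]
late_actual = [5.1, 4.9, 5.0, 5.2]
early_pred = [5.1, 5.1, 4.9, 5.0]   # stand-in for model predictions
late_pred = [6.0, 6.1, 5.9, 6.2]    # stand-in for model predictions

dark = is_dark_biomarker(early_actual, early_pred, late_actual, late_pred)
```

In this toy case the actual expression is nearly identical across stages, whereas the gap between predicted and actual expression widens in late-stage samples, so the gene is flagged only through its mqTrans feature. A real analysis would use proper p-values with multiple-testing correction rather than a fixed cut-off on the t statistic.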