Using NMFAS to Identify Key Biological Pathways Associated with Human Diseases

Hao Guo, Yunping Zhu, Dong Li, Fuchu He, Qijun Liu
DOI: https://doi.org/10.1109/isb.2012.6314116
2012-01-01
Systems Biology
Abstract: Gene expression microarrays measure the expression levels of thousands of genes simultaneously. Here, we developed a non-negative matrix factorization analysis strategy (NMFAS) to uncover the biological pathways associated with various diseases by factorizing the pathway expression matrix, which is extracted from the microarray matrix using pathway membership information, into the product of a row vector and a column vector. We define the row vector as the pathway activity and the column vector as the gene contribution weight. By comparing the pathway activity of two sample groups, we can identify pathways whose activity differs significantly between them. We applied this strategy to two cases: smoking and type 2 diabetes (DM2). Comparing pathway activity between smokers and never-smokers, we found 152 differentially expressed pathways, including pathways that have been validated in the literature, such as “O-Glycans biosynthesis” and “Glutathione metabolism”. We also found important genes related to the smoking phenotype, such as NQO, HSPA1A, and ALDH3A1. For DM2, our results suggested that 9 pathways were significantly differentially expressed, including typical pathways such as “Oxidative phosphorylation” and “mTOR signaling pathway”, and highlighted genes such as CAPNS1, APP, COX7A1, and COX7B, which might play important roles in the cellular regulation of DM2. In conclusion, our strategy can efficiently integrate gene expression profiles with biological pathway information to identify the key processes underlying human disease, and it can identify pathways missed by alternative approaches.
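The factorization described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the genes-by-samples orientation, the alternating update rule, the normalisation of the weights, the Welch t statistic used for the group comparison, and all names (`rank1_nmf`, `welch_t`, the toy matrix `V`) are assumptions made for illustration only.

```python
from statistics import mean, variance


def rank1_nmf(V, n_iter=200, eps=1e-12):
    """Approximate a non-negative genes x samples matrix V by the outer product w * h.

    w (one value per gene) plays the role of the gene contribution weight,
    h (one value per sample) plays the role of the pathway activity.
    """
    n_genes, n_samples = len(V), len(V[0])
    w = [1.0] * n_genes
    h = [1.0] * n_samples
    for _ in range(n_iter):
        # Update h with w fixed; entries stay non-negative because V and w are.
        ww = sum(x * x for x in w) + eps
        h = [sum(w[i] * V[i][j] for i in range(n_genes)) / ww
             for j in range(n_samples)]
        # Update w with h fixed.
        hh = sum(x * x for x in h) + eps
        w = [sum(V[i][j] * h[j] for j in range(n_samples)) / hh
             for i in range(n_genes)]
    # Resolve the scale ambiguity (assumption): weights sum to 1, scale moves into h.
    s = sum(w) + eps
    return [x / s for x in w], [x * s for x in h]


def welch_t(a, b):
    """Welch t statistic for two groups of activity values (assumed test)."""
    return (mean(a) - mean(b)) / (variance(a) / len(a) + variance(b) / len(b)) ** 0.5


# Toy pathway expression matrix: 3 genes x 6 samples (made-up numbers).
# Samples 0-2 form group A, samples 3-5 form group B.
true_w = [0.5, 0.3, 0.2]
true_h = [1.0, 1.2, 0.9, 3.0, 3.1, 2.8]
V = [[wi * hj for hj in true_h] for wi in true_w]

w, h = rank1_nmf(V)
t = welch_t(h[:3], h[3:])
print("gene contribution weights:", [round(x, 3) for x in w])
print("pathway activity per sample:", [round(x, 3) for x in h])
print("Welch t (group A vs group B):", round(t, 2))
```

In this toy case the matrix is exactly rank one, so the recovered weights and activities match the values used to build it. Real pathway expression matrices would only be approximated, and a significance threshold with multiple-testing correction across pathways would be needed before calling a pathway differentially expressed.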