A practical application of generative adversarial networks for RNA-seq analysis to predict the molecular progress of Alzheimer's disease

Jinhee Park, Hyerin Kim, Jaekwang Kim, Mookyung Cheon
DOI: https://doi.org/10.1371/journal.pcbi.1008099
2020-07-24
PLoS Computational Biology
Abstract: Next-generation sequencing (NGS) technology has become a powerful tool for dissecting the molecular and pathological signatures of a variety of human diseases. However, the limited availability of biological samples from different disease stages is a major hurdle in studying disease progression and identifying early pathological changes. Deep learning techniques have recently begun to be applied to NGS data to predict the progression of biological processes. In this study, we applied a deep learning technique called generative adversarial networks (GANs) to predict the molecular progress of Alzheimer's disease (AD). We applied GANs to RNA-seq data from a 5xFAD mouse model of AD, which recapitulates major AD features, including massive amyloid-β (Aβ) accumulation in the brain. We examined the generator's ability to generate specific samples and to capture biological associations between genes. Based on these observations, we simulated virtual disease progression by latent space interpolation, yielding transition curves of various genes with pathological changes from the normal to the AD state. By performing pathway analysis based on the transition curve patterns, we identified several pathological processes with progressive changes, such as inflammatory systems and synapse functions, which have previously been shown to be involved in the pathogenesis of AD. Interestingly, our analysis indicates that alteration of cholesterol biosynthesis begins at a very early stage of AD, suggesting that it is the first effect mediating the cholesterol metabolism of AD downstream of Aβ accumulation. Here, we suggest that GANs are a useful tool to study disease progression, leading to the identification of early pathological signatures.
We applied a deep learning technique called generative adversarial networks (GANs) to bulk RNA-seq data, where the number of samples is limited but expression profiles are much more reliable than those from single-cell methods. Like the continuous image conversions of human faces commonly used in the recent AI revolution, we introduced a virtual Alzheimer's disease progression described by gene expression levels. Our GAN-based gene expression analysis is proposed to capture pathological pathway cascades and the sequential order of gene regulation. Through this study combining bioinformatics and AI, our results suggest that amyloid-beta production triggers cholesterol biosynthesis.
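The latent space interpolation mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: `toy_generator` is a hypothetical stand-in for a trained GAN generator (a fixed random linear map followed by `tanh`), the latent vectors `z_normal` and `z_ad` are random placeholders, and linear interpolation is assumed because the abstract does not specify the scheme. The sketch only shows how walking between two latent points yields one transition curve per gene.

```python
import math
import random


def toy_generator(z, weights, bias):
    """Hypothetical stand-in for a trained generator: latent vector -> expression levels."""
    return [
        math.tanh(sum(z_k * weights[k][g] for k, z_k in enumerate(z)) + bias[g])
        for g in range(len(bias))
    ]


def interpolate_latent(z_start, z_end, n_steps):
    """Linearly interpolate between two latent vectors, endpoints included (assumed scheme)."""
    if n_steps < 2:
        raise ValueError("n_steps must be at least 2")
    path = []
    for step in range(n_steps):
        alpha = step / (n_steps - 1)
        path.append([(1.0 - alpha) * a + alpha * b for a, b in zip(z_start, z_end)])
    return path


def transition_curves(z_start, z_end, n_steps, weights, bias):
    """Return one curve per gene: curves[g][t] is gene g's level at interpolation step t."""
    profiles = [
        toy_generator(z, weights, bias)
        for z in interpolate_latent(z_start, z_end, n_steps)
    ]
    # Transpose from (step, gene) to (gene, step).
    return [list(curve) for curve in zip(*profiles)]


# Toy dimensions and random parameters; all values are illustrative placeholders.
rng = random.Random(0)
latent_dim, n_genes, n_steps = 8, 5, 11
weights = [[rng.gauss(0.0, 1.0) for _ in range(n_genes)] for _ in range(latent_dim)]
bias = [rng.gauss(0.0, 1.0) for _ in range(n_genes)]
z_normal = [rng.gauss(0.0, 1.0) for _ in range(latent_dim)]  # placeholder "normal" latent point
z_ad = [rng.gauss(0.0, 1.0) for _ in range(latent_dim)]      # placeholder "AD" latent point

curves = transition_curves(z_normal, z_ad, n_steps, weights, bias)
print(len(curves), len(curves[0]))  # number of genes, number of interpolation steps
```

In a real analysis the generator would be the trained network and the two endpoints would be latent vectors associated with normal and AD samples; the resulting per-gene curves could then be grouped by shape before pathway analysis, as the abstract describes.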
biochemical research methods, mathematical & computational biology
What problem does this paper attempt to address?