Additional File 2: of Revealing Alzheimer’s Disease Genes Spectrum in the Whole-Genome by Machine Learning

Xiaoyan Huang,Hankui Liu,Xinming Li,Liping Guan,Jiankang Li,Laurent Tellier,Huanming Yang,Jian Wang,Shiping Zhang

DOI: https://doi.org/10.6084/m9.figshare.5773506

2018-01-01

Abstract:335 AD-associated genes. The datasets were collected from public Alzheimer’s disease databases (AlzGene) and from publications on AD. (XLSX 14 kb)


Additional File 4: of Revealing Alzheimer’s Disease Genes Spectrum in the Whole-Genome by Machine Learning

Xiaoyan Huang,Hankui Liu,Xinming Li,Liping Guan,Jiankang Li,Laurent Tellier,Huanming Yang,Jian Wang,Shiping Zhang

DOI: https://doi.org/10.6084/m9.figshare.5773539

2018-01-01

Abstract:832 AD-predicted genes. The full set of AD-predicted genes across the whole genome in our study. (XLSX 68 kb)
Revealing Alzheimer’s Disease Genes Spectrum in the Whole-Genome by Machine Learning

Xiaoyan Huang,Hankui Liu,Xinming Li,Liping Guan,Jiankang Li,Laurent Christian Asker M. Tellier,Huanming Yang,Jian Wang,Jianguo Zhang

DOI: https://doi.org/10.1186/s12883-017-1010-3

2018-01-01

BMC Neurology

Abstract:BACKGROUND: Alzheimer's disease (AD) is an important, progressive neurodegenerative disease with a complex genetic architecture. A key goal of biomedical research is to seek out disease risk genes and to elucidate the function of these risk genes in the development of disease. For this purpose, expanding the AD-associated gene set is necessary. In past research, prediction methods for AD-related genes have been limited in their exploration of the target genome regions. We here present a genome-wide method for predicting AD candidate genes. METHODS: We present a machine learning approach (SVM), based upon integrating gene expression data with human brain-specific gene network data, to discover the full spectrum of AD genes across the whole genome. RESULTS: We classified AD candidate genes with an accuracy of 84.56% and an area under the receiver operating characteristic (ROC) curve of 94%. Our approach provides a supplement for the spectrum of AD-associated genes extracted from more than 20,000 genes on a genome-wide scale. CONCLUSIONS: In this study, we have elucidated the whole-genome spectrum of AD using a machine learning approach. Through this method, we expect the candidate gene catalogue to provide a more comprehensive annotation of AD for researchers.
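
The two metrics quoted in the abstract above (accuracy and area under the ROC curve) can be illustrated with a minimal sketch. This is not the authors' code: the function names and the toy labels and scores are hypothetical, and the AUC is computed with the standard pairwise-ranking definition rather than any particular library routine.

```python
def accuracy(labels, predictions):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for y, p in zip(labels, predictions) if y == p)
    return correct / len(labels)


def roc_auc(labels, scores):
    """AUC as the probability that a random positive outscores a random negative.

    A tie between a positive and a negative counts as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data: 1 = known AD gene, 0 = non-AD gene.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]
# Threshold the classifier scores at 0.5 to obtain hard predictions.
toy_predictions = [1 if s >= 0.5 else 0 for s in toy_scores]

print(accuracy(toy_labels, toy_predictions))
print(roc_auc(toy_labels, toy_scores))
```

Accuracy depends on the chosen score threshold, whereas the AUC summarizes ranking quality across all thresholds, which is why the two figures are reported separately.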
Identifying Shared Diagnostic Genes and Mechanisms in Vascular Dementia and Alzheimer's Disease Via Bioinformatics and Machine Learning

Shu Wan,Wanning Zheng,Dongdong Lin,Shunan Shi,Jiayi Ren,Jiong Wu,Ming Wang

DOI: https://doi.org/10.1177/25424823241289804

2024-01-01

Journal of Alzheimer's Disease Reports

Abstract:Background Alzheimer's disease (AD) and vascular dementia (VaD) share overlapping pathophysiological characteristics, yet comparative genetic studies are rare. Understanding these overlaps may aid in identifying common diagnostic markers and therapeutic targets. Objective This study identifies shared diagnostic genes and mechanisms linking AD and VaD. Methods Datasets GSE5281 and GSE122063 from the GEO database were used to identify differentially expressed genes (DEGs). Intersection DEGs were analyzed using KEGG and GO enrichment to explore signaling pathways. A PPI network was constructed, and LASSO and SVM-RFE were applied to identify core genes. CIBERSORT assessed immune cell composition and their relationship with core genes. Diagnostic efficacy was evaluated using ROC curves, nomogram, and Decision Curve Analysis (DCA). Core genes were used to identify characteristic genes in various brain regions of AD patients. Results The analysis identified 9021 DEGs for AD and 373 DEGs for VaD, with 74 co-expressed genes and 8 core genes. ROC curves, nomogram, and DCA indicated high diagnostic accuracy. Core gene analysis revealed differential expression of characteristic genes in various brain regions of AD patients. Conclusions This research identified 74 co-expressed genes and 8 pivotal diagnostic genes. These genes likely play roles in signal transduction, neuroinflammation, and autophagy in both AD and VaD. The findings offer potential targets for future research and clinical interventions. Further research should use larger, more diverse datasets and incorporate custom NGS panels to identify novel genetic variants, enhancing precise diagnostic and therapeutic strategies.
Human Whole Genome Genotype and Transcriptome Data for Alzheimer’s and Other Neurodegenerative Diseases

Mariet Allen,Minerva M. Carrasquillo,Cory Funk,Benjamin D. Heavner,Fanggeng Zou,Curtis S. Younkin,Jeremy D. Burgess,High-Seng Chai,Julia Crook,James A. Eddy,Hongdong Li,Ben Logsdon,Mette A. Peters,Kristen K. Dang,Xue Wang,Daniel Serie,Chen Wang,Thuy Nguyen,Sarah Lincoln,Kimberly Malphrus,Gina Bisceglio,Ma Li,Todd E. Golde,Lara M. Mangravite,Yan Asmann,Nathan D. Price,Ronald C. Petersen,Neill R. Graff-Radford,Dennis W. Dickson,Steven G. Younkin,Nilüfer Ertekin-Taner

DOI: https://doi.org/10.1038/sdata.2016.89

2016-01-01

Scientific Data

Abstract:Previous genome-wide association studies (GWAS), conducted by our group and others, have identified loci that harbor risk variants for neurodegenerative diseases, including Alzheimer's disease (AD). Human disease variants are enriched for polymorphisms that affect gene expression, including some that are known to associate with expression changes in the brain. Postulating that many variants confer risk to neurodegenerative disease via transcriptional regulatory mechanisms, we have analyzed gene expression levels in the brain tissue of subjects with AD and related diseases. Herein, we describe our collective datasets comprised of GWAS data from 2,099 subjects; microarray gene expression data from 773 brain samples, 186 of which also have RNAseq; and an independent cohort of 556 brain samples with RNAseq. We expect that these datasets, which are available to all qualified researchers, will enable investigators to explore and identify transcriptional mechanisms contributing to neurodegenerative diseases.
Identification of feature genes and pathways for Alzheimer's disease via WGCNA and LASSO regression

Hongyu Sun,Jin Yang,Xiaohui Li,Yi Lyu,Zhaomeng Xu,Hui He,Xiaomin Tong,Tingyu Ji,Shihan Ding,Chaoli Zhou,Pengyong Han,Jinping Zheng

DOI: https://doi.org/10.3389/fncom.2022.1001546

IF: 3.387

2022-09-22

Frontiers in Computational Neuroscience

Abstract:While Alzheimer's disease (AD) can cause a severe economic burden, the specific pathogenesis involved is yet to be elucidated. To identify feature genes associated with AD, we downloaded three datasets from the GEO database: GSE122063, GSE15222, and GSE138260. In the filtering, we used AD as the search keyword, selected Homo sapiens as the species, and required each dataset to have a sample size of > 20 and to include both a normal group and an AD group. The datasets GSE15222 and GSE138260 were combined as a training group to build a model, and GSE122063 was used as a test group to verify the model's accuracy. The genes with differential expression found in the combined datasets were analyzed through Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment. Then, AD-related module genes were identified using the combined dataset through a weighted gene co-expression network analysis (WGCNA). The differential genes and the AD-related module genes were intersected to obtain AD key genes. These genes were first filtered through LASSO regression, and the resulting AD-related feature genes were used for subsequent immune-related analysis. A comprehensive analysis of three AD-related datasets in the GEO database revealed 111 common differential AD genes. In the GO analysis, the more prominent terms were cognition and learning or memory. The KEGG analysis showed that these differential genes were enriched in pathways including neuroactive ligand-receptor interaction, the cAMP signaling pathway, and the calcium signaling pathway. Three AD-related feature genes (SST, MLIP, HSPB3) were finally identified. The area under the ROC curve of these AD-related feature genes was greater than 0.7 in both the training and the test groups. Finally, an immune-related analysis of these genes was performed.
The finding of AD-related feature genes (SST, MLIP, HSPB3) could help predict the onset and progression of the disease. Overall, our study may provide significant guidance for further exploration of potential biomarkers for the diagnosis and prediction of AD.

neurosciences,mathematical & computational biology
An Alzheimers Disease Related Genes Identification Method Based on Multiple Classifier Integration

Yu Miao,Huiyan Jiang,Huiling Liu,Yu-dong Yao

DOI: https://doi.org/10.1016/j.cmpb.2017.08.006

IF: 6.1

2017-01-01

Computer Methods and Programs in Biomedicine

Abstract:BACKGROUND AND OBJECTIVE: Alzheimer's disease (AD) is a fatal neurodegenerative disease and the onset of AD is insidious. Full understanding of the AD-related genes (ADGs) has not been achieved. The National Center for Biotechnology Information (NCBI) provides an AD dataset of 22,283 genes. Among these genes, 71 genes have been identified as ADGs. But there may still be underlying ADGs that have not yet been identified in the remaining 22,212 genes. This paper aims to identify additional ADGs using machine learning techniques. METHODS: To improve the accuracy of ADG identification, we propose a gene identification method through multiple classifier integration. First, a feature selection algorithm is applied to select the most relevant attributes. Second, a two-stage cascading classifier is developed to identify ADGs. The first stage classification task is based on the relevance vector machine and, in the second stage, the results of three classifiers, support vector machine, random forest and extreme learning machine, are combined through voting. RESULTS: According to our results, feature selection improves accuracy and reduces training time. The voting-based classifier reduces the classification errors. The proposed ADG identification system provides accuracy, sensitivity and specificity at levels of 78.77%, 83.10% and 74.67%, respectively. Based on the proposed ADG identification method, potentially additional ADGs are identified and the top 13 genes (predicted ADGs) are presented. CONCLUSIONS: In this paper, an ADG identification method is presented. The proposed method, which combines feature selection, a cascading classifier and majority voting, leads to higher specificity and significantly increases the accuracy and sensitivity of ADG identification. Potentially new ADGs are identified.
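
The voting step and the sensitivity and specificity figures described in the abstract above can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: it covers only the combination of three already-computed label lists (the relevance vector machine stage and the feature selection are omitted), and the function names and toy predictions are hypothetical.

```python
from collections import Counter


def majority_vote(*prediction_lists):
    """Combine per-classifier label lists by taking the most common label per sample.

    With an odd number of binary classifiers there are no ties.
    """
    if not prediction_lists:
        raise ValueError("need at least one list of predictions")
    combined = []
    for votes in zip(*prediction_lists):
        label, _count = Counter(votes).most_common(1)[0]
        combined.append(label)
    return combined


def sensitivity_specificity(labels, predictions):
    """Return (sensitivity, specificity) for binary labels where 1 is the positive class."""
    pairs = list(zip(labels, predictions))
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)
    fn = sum(1 for y, p in pairs if y == 1 and p == 0)
    tn = sum(1 for y, p in pairs if y == 0 and p == 0)
    fp = sum(1 for y, p in pairs if y == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical toy outputs standing in for the three second-stage classifiers.
true_labels = [1, 1, 0, 0]
svm_pred = [1, 0, 0, 0]
rf_pred = [1, 1, 0, 1]
elm_pred = [0, 1, 1, 1]

voted = majority_vote(svm_pred, rf_pred, elm_pred)
print(voted)
print(sensitivity_specificity(true_labels, voted))
```

In this toy case each individual classifier makes a different mistake, and the vote corrects every error except the one that two classifiers share, which is the intuition behind the reduction in classification errors reported in the abstract.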
Discovering Genetic Signatures Associated with Alzheimer's Disease in Tiled Whole Genome Sequence Data: Results from the Artificial Intelligence for Alzheimer's Disease (AI4AD) Consortium

Sarah W Zaranek,Alexander Wait Zaranek,Peter Amstutz,Jingxuan Bao,Jiong Chen,Tom Clegg,Hannah Craft,Taeho Jo,Brian Lee,Kwangsik Nho,Sophia I Thomopoulos,Christos Davatzikos,Li Shen,Heng Huang,Paul M Thompson,Andrew J Saykin,The Alzheimer's Disease Neuroimaging Initiative as a consortium author for the AI4AD Initiative

DOI: https://doi.org/10.1101/2024.08.01.24311329

2024-08-03

Abstract:Currently, the ability to analyze large-scale whole genome sequence (WGS) data is limited due to both the size of the data and the inability of many existing tools to scale. To address this challenge, we use data "tiling" to efficiently partition whole genome sequences into smaller segments resulting in a simple numeric matrix of small integers. This lossless representation is particularly suitable for machine learning (ML) models. As an example of the benefits of tiling, we showcase results from tiled data as part of the Artificial Intelligence for Alzheimer's Disease (AI4AD) consortium. AI4AD is a coordinated initiative to develop transformative AI approaches for high throughput analysis of next generation sequencing and related imaging, AD biomarker, and cognitive data. The collective effort integrates imaging, genomic, biomarker, and cognitive data to address fundamental barriers in AD prevention and drug discovery. One of the project's initial aims is to discover new genetic signatures in WGS data that can be used to understand AD risk and progression in conjunction with imaging, biomarker and cognitive data. We tiled and analyzed 15,000+ genomes from the Alzheimer's Disease Sequencing Project (ADSP) and the Alzheimer's Disease Neuroimaging Initiative (ADNI). We tile 11,762 genomes, a subset of the release which does not include family-based datasets (AD Cases: 4,983, age range: 50-90 years, mean age: 73.8 years). We illustrate the use of tiled data in ML classification methods to predict phenotypes. Specifically, we identify and prioritize tile variants/genetic variants that are possible genetic signatures for AD. The model shows added predictive value from variants of genes previously found to be associated with AD risk, age of onset, neurofibrillary tangle measurements, and other AD-related traits--including the APOE variant (rs429358).

Genetic and Genomic Medicine
Additional File 5 of Spatially Resolved Transcriptomics Reveals Genes Associated with the Vulnerability of Middle Temporal Gyrus in Alzheimer’s Disease

Shuo Chen,Yuzhou Chang,Liangping Li,Diana Acosta,Yang Li,Qi Guo,Cankun Wang,Emir Turkes,Cody Morrison,Dominic Julian,Douglas W. Scharre,Sarah XueYing Song,Jasmine Plummer,Karen Duff,Ma Qian,Hongjun Fu

DOI: https://doi.org/10.6084/m9.figshare.22601017

2023-01-01

Abstract:Additional file 5: Table S4. The proportion of spots with Aβ+, AT8+, Aβ+/AT8+ pathology which is normalized to the total number of spots in each layer from three AD cases.
Alzheimer's Disease Sequencing Project Release 4 Whole Genome Sequencing Dataset

Yuk Yee Leung,Wan-Ping Lee,Amanda B Kuzma,Heather I Nicaretta,Otto Valladares,Prabhakaran Gangadharan,Liming Qu,Yi Zhao,Ren Youli,Po-Liang Cheng,Pavel P Kuksa,Hui Wang,Heather White,Zivadin Katanic,Lauren Bass,Naveen Saravanan,Emily Greenfest-Allen,Maureen Kirsch,Laura B Cantwell,Taha Iqbal,Nicholas R Wheeler,John J Farrer,Congcong Zhu,Shannon L Turner,Tamil Iniyan Gunasekaran,Pedro R Mena,Jimmy Jin,Luke Carter,Alzheimer's Disease Sequencing Project,Xiaoling Zhang,Badri N Vardarajan,Arthur W Toga,Michael Cuccaro,Timothy J Hohman,William S Bush,Adam C Naj,Eden Martin,Clifton Dalgard,Brian W Kunkle,Lindsay A Farrer,Richard P Mayeux,Jonathan L Haines,Margaret A Pericak-Vance,Gerard D Schellenberg,Li-San Wang

DOI: https://doi.org/10.1101/2024.12.03.24317000

2024-12-06

Abstract:The Alzheimer's Disease Sequencing Project (ADSP) is a national initiative to understand the genetic architecture of Alzheimer's Disease and Related Dementias (AD/ADRD) by sequencing whole genomes of affected participants and age-matched cognitive controls from diverse populations. The Genome Center for Alzheimer's Disease (GCAD) processed whole-genome sequencing data from 36,361 ADSP participants, including 35,014 genetically unique participants of which 45% are from non-European ancestry, across 17 cohorts in 14 countries in this fourth release (R4). This sequencing effort identified 387 million bi-allelic variants, 42 million short insertions/deletions, and 2.2 million structural variants. Annotations and quality control data are available for all variants and samples. Additionally, detailed phenotypes from 15,927 participants across 10 domains are also provided. A linkage disequilibrium panel was created using unrelated AD cases and controls. Researchers can access and analyze the genetic data via NIAGADS Data Sharing Service, the VariXam tool, or NIAGADS GenomicsDB.
The Mount Sinai Cohort of Large-Scale Genomic, Transcriptomic and Proteomic Data in Alzheimer's Disease

Minghui Wang,Noam D. Beckmann,Panos Roussos,Erming Wang,Xianxiao Zhou,Qian Wang,Chen Ming,Ryan Neff,Weiping Ma,John F. Fullard,Mads E. Hauberg,Jaroslav Bendl,Mette A. Peters,Ben Logsdon,Pei Wang,Milind Mahajan,Lara M. Mangravite,Eric B. Dammer,Duc M. Duong,James J. Lah,Nicholas T. Seyfried,Allan I. Levey,Joseph D. Buxbaum,Michelle Ehrlich,Sam Gandy,Pavel Katsel,Vahram Haroutunian,Eric Schadt,Bin Zhang

DOI: https://doi.org/10.1038/sdata.2018.185

2018-01-01

Scientific Data

Abstract:Alzheimer's disease (AD) affects half the US population over the age of 85 and is universally fatal following an average course of 10 years of progressive cognitive disability. Genetic and genome-wide association studies (GWAS) have identified about 33 risk factor genes for common, late-onset AD (LOAD), but these risk loci fail to account for the majority of affected cases and can neither provide clinically meaningful prediction of development of AD nor offer actionable mechanisms. This cohort study generated large-scale matched multi-Omics data in AD and control brains for exploring novel molecular underpinnings of AD. Specifically, we generated whole genome sequencing, whole exome sequencing, transcriptome sequencing and proteome profiling data from multiple regions of 364 postmortem control, mild cognitive impaired (MCI) and AD brains with rich clinical and pathophysiological data. All the data went through rigorous quality control. Both the raw and processed data are publicly available through the Synapse software platform.
Identification of immune microenvironment subtypes and signature genes for Alzheimer's disease diagnosis and risk prediction based on explainable machine learning

Yongxing Lai,Peiqiang Lin,Fan Lin,Manli Chen,Chunjin Lin,Xing Lin,Lijuan Wu,Mouwei Zheng,Jianhao Chen

DOI: https://doi.org/10.3389/fimmu.2022.1046410

IF: 7.3

2022-01-01

Frontiers in Immunology

Abstract:Background: Using interpretable machine learning, we sought to define the immune microenvironment subtypes and distinctive genes in AD. Methods: ssGSEA, LASSO regression, and WGCNA algorithms were used to evaluate immune state in AD patients. To predict the fate of AD and identify distinctive genes, six machine learning algorithms were developed. The output of machine learning models was interpreted using the SHAP and LIME algorithms. For external validation, four separate GEO databases were used. We estimated the subgroups of the immunological microenvironment using unsupervised clustering. Further research was done on the variations in immunological microenvironment, enhanced functions and pathways, and therapeutic medicines between these subtypes. Finally, the expression of characteristic genes was verified using the AlzData and pan-cancer databases and RT-PCR analysis. Results: It was determined that AD is connected to changes in the immunological microenvironment. WGCNA revealed 31 potential immune genes, of which the greenyellow and blue modules were shown to be most associated with infiltrated immune cells. In the testing set, the XGBoost algorithm had the best performance with an AUC of 0.86 and a P-R value of 0.83. Following the screening of the testing set by machine learning algorithms and the verification of independent datasets, five genes (CXCR4, PPP3R1, HSP90AB1, CXCL10, and S100A12) that were closely associated with AD pathological biomarkers and allowed for the accurate prediction of AD progression were found to be immune microenvironment-related genes. The feature gene-based nomogram may provide clinical advantages to patients. Two immune microenvironment subgroups for AD patients were identified: subtype 2 was linked to a metabolic phenotype, while subtype 1 belonged to the immune-active kind. MK-866 and arachidonyltrifluoromethane were identified as the top treatment agents for subtypes 1 and 2, respectively. These five distinguishing genes were found to be intimately linked to the development of the disease, according to the AlzData database, pan-cancer research, and RT-PCR analysis. Conclusion: The hub genes associated with the immune microenvironment that are most strongly associated with the progression of pathology in AD are CXCR4, PPP3R1, HSP90AB1, CXCL10, and S100A12. The hypothesized molecular subgroups might offer novel perceptions for individualized AD treatment.
AlzCode: a Platform for Multiview Analysis of Genes Related to Alzheimer's Disease.

Cui-Xiang Lin,Hong-Dong Li,Chao Deng,Shannon Erhardt,Jun Wang,Xiaoqing Peng,Jianxin Wang

DOI: https://doi.org/10.1093/bioinformatics/btac033

IF: 5.8

2022-01-01

Bioinformatics

Abstract:Motivation: Alzheimer's disease (AD) is a complex brain disorder with risk genes incompletely identified. The candidate genes are dominantly obtained by computational approaches. In order to obtain biological insights of candidate genes or screen genes for experimental testing, it is essential to assess their relevance to AD. A platform that integrates different types of omics data and approaches would facilitate the analysis of candidate genes and is in great need. Results: We report AlzCode, a platform for multiview analysis of genes related to AD. First, this platform integrates a rich collection of functional genomic data, including expression data of AD samples (gene expression, single-cell RNA-seq data and protein expression), AD-specific biological networks (co-expression networks and functional gene networks), neuropathological and clinical traits (CERAD score, Braak staging score, Clinical Dementia Rating, cognitive function and clinical severity) and general data such as protein-protein interaction, regulatory networks, sequence similarity and miRNA-target interactions. These data provide basis for analyzing genes from different views. Second, the platform integrates multiple approaches designed for the various types of data. We implement functions to analyze both individual genes and gene sets. We also compare AlzCode with two existing platforms for AD analysis, which are Agora and AD Atlas. We pinpoint the features of each platform and highlight their differences. This platform would be valuable to the understanding of AD genetics and pathological mechanisms. Availability and implementation: AlzCode is freely available at: http://www.alzcode.xyz. Contact: xqpeng@csu.edu.cn or jxwang@mail.csu.edu.cn Supplementary information: Supplementary data are available at Bioinformatics online.
Identification of Alzheimer's Disease-Related Genes Based on Data Integration Method

Yang Hu,Tianyi Zhao,Tianyi Zang,Ying Zhang,Liang Cheng

DOI: https://doi.org/10.3389/fgene.2018.00703

IF: 3.7

2019-01-01

Frontiers in Genetics

Abstract:Alzheimer disease (AD) is the fourth major cause of death in the elderly following cancer, heart disease and cerebrovascular disease. Finding candidate causal genes can help in the design of gene-targeted drugs and effectively reduce the risk of the disease. Complex diseases such as AD are usually caused by multiple genes. Genome-wide association studies (GWAS) have identified potential genetic variants for most diseases. However, because of linkage disequilibrium (LD), it is difficult to identify the causative mutations that directly cause diseases. In this study, we combined expression quantitative trait locus (eQTL) studies with GWAS to comprehensively define the genes that cause Alzheimer disease. The method used was the Summary Mendelian randomization (SMR), which is a novel method to integrate summarized data. Two GWAS studies and five eQTL studies were referenced in this paper. We found several candidate SNPs that have a strong relationship with AD. Most of these SNPs overlap in different data sets, providing relatively strong reliability. We also explain the function of the novel AD-related genes we have discovered.
Multi-omic Atlas of the Parahippocampal Gyrus in Alzheimer’s Disease

Claire Coleman,Minghui Wang,Erming Wang,Courtney Micallef,Zhiping Shao,James M. Vicari,Yuxin Li,Kaiwen Yu,Dongming Cai,Junmin Peng,Vahram Haroutunian,John F. Fullard,Jaroslav Bendl,Bin Zhang,Panos Roussos

DOI: https://doi.org/10.1038/s41597-023-02507-2

2023-01-01

Scientific Data

Abstract:Alzheimer’s disease (AD) is the most common form of dementia worldwide, with a projection of 151 million cases by 2050. Previous genetic studies have identified three main genes associated with early-onset familial Alzheimer’s disease, however this subtype accounts for less than 5% of total cases. Next-generation sequencing has been well established and holds great promise to assist in the development of novel therapeutics as well as biomarkers to prevent or slow the progression of this devastating disease. Here we present a public resource of functional genomic data from the parahippocampal gyrus of 201 postmortem control, mild cognitively impaired (MCI) and AD individuals from the Mount Sinai brain bank, of which whole-genome sequencing (WGS), and bulk RNA sequencing (RNA-seq) were previously published. The genomic data include bulk proteomics and DNA methylation, as well as cell-type-specific RNA-seq and assay for transposase-accessible chromatin with high-throughput sequencing (ATAC-seq) data. We have performed extensive preprocessing and quality control, allowing the research community to access and utilize this public resource available on the Synapse platform at https://doi.org/10.7303/syn51180043.2 .
Machine learning prediction and tau-based screening identifies potential Alzheimer’s disease genes relevant to immunity

Jessica Binder,Oleg Ursu,Cristian Bologa,Shanya Jiang,Nicole Maphis,Somayeh Dadras,Devon Chisholm,Jason Weick,Orrin Myers,Praveen Kumar,Jeremy J. Yang,Kiran Bhaskar,Tudor I. Oprea

DOI: https://doi.org/10.1038/s42003-022-03068-7

IF: 6.548

2022-02-11

Communications Biology

Abstract:With increased research funding for Alzheimer’s disease (AD) and related disorders across the globe, large amounts of data are being generated. Several studies employed machine learning methods to understand the ever-growing omics data to enhance early diagnosis, map complex disease networks, or uncover potential drug targets. We describe results based on a Target Central Resource Database protein knowledge graph and evidence paths transformed into vectors by metapath matching. We extracted features between specific genes and diseases, then trained and optimized our model using XGBoost, termed MPxgb(AD). To determine our MPxgb(AD) prediction performance, we examined the top twenty predicted genes through an experimental screening pipeline. Our analysis identified potential AD risk genes: FRRS1, CTRAM, SCGB3A1, FAM92B/CIBAR2, and TMEFF2. FRRS1 and FAM92B are considered dark genes, while CTRAM, SCGB3A1, and TMEFF2 are connected to TREM2-TYROBP, IL-1β-TNFα, and MTOR-APP AD-risk nodes, suggesting relevance to the pathogenesis of AD.

biology
AlzCode: a platform for multiview analysis of genes related to Alzheimer’s disease

Cui-Xiang Lin,Hong-Dong Li,Chao Deng,Shannon Erhardt,Jun Wang,Xiaoqing Peng,Jianxin Wang

DOI: https://doi.org/10.1093/bioinformatics/btac033

IF: 5.8

2022-01-18

Bioinformatics

Abstract:Motivation: Alzheimer’s disease (AD) is a complex brain disorder with risk genes incompletely identified. The candidate genes are dominantly obtained by computational approaches. In order to obtain biological insights of candidate genes or screen genes for experimental testing, it is essential to assess their relevance to AD. A platform that integrates different types of omics data and approaches would facilitate the analysis of candidate genes and is in great need. Results: We report AlzCode, a platform for multiview analysis of genes related to AD. First, this platform integrates a rich collection of functional genomic data, including expression data of AD samples (gene expression, single-cell RNA-seq data and protein expression), AD-specific biological networks (co-expression networks and functional gene networks), neuropathological and clinical traits (CERAD score, Braak staging score, Clinical Dementia Rating, cognitive function and clinical severity) and general data such as protein–protein interaction, regulatory networks, sequence similarity and miRNA-target interactions. These data provide basis for analyzing genes from different views. Second, the platform integrates multiple approaches designed for the various types of data. We implement functions to analyze both individual genes and gene sets. We also compare AlzCode with two existing platforms for AD analysis, which are Agora and AD Atlas. We pinpoint the features of each platform and highlight their differences. This platform would be valuable to the understanding of AD genetics and pathological mechanisms. Availability and implementation: AlzCode is freely available at: http://www.alzcode.xyz. Supplementary information: Supplementary data are available at Bioinformatics online.

biochemical research methods,biotechnology & applied microbiology,mathematical & computational biology
Identification of Blood-Based Glycolysis Gene Associated with Alzheimer's Disease by Integrated Bioinformatics Analysis

Fang Wang,Chun-Shuang Xu,Wei-Hua Chen,Shi-Wei Duan,Shu-Jun Xu,Jun-Jie Dai,Qin-Wen Wang

DOI: https://doi.org/10.3233/JAD-210540

Abstract:Background: Alzheimer's disease (AD) is one of many common neurodegenerative diseases without ideal treatment, but early detection and intervention can prevent the disease progression. Objective: This study aimed to identify AD-related glycolysis genes for AD diagnosis and further investigation by integrated bioinformatics analysis. Methods: 122 subjects were recruited from the affiliated hospitals of Ningbo University between 1 October 2015 and 31 December 2016. Their clinical information and methylation levels of 8 glycolysis genes were assessed. Machine learning algorithms were used to establish an AD prediction model. The area under the receiver operating characteristic curve (AUC) and decision curve analysis (DCA) were used to assess the model. An AD risk factor model was developed by SHapley Additive exPlanations (SHAP) to extract features that had important impacts on AD. Finally, gene expression of AD-related glycolysis genes was validated by AlzData. Results: An AD prediction model was developed using random forest algorithm with the best average ROC_AUC (0.969544). The threshold probability of the model was positive in the range of 0∼0.9875 by DCA. Eight glycolysis genes (GAPDHS, PKLR, PFKFB3, LDHC, DLD, ALDOC, LDHB, HK3) were identified by SHAP. Five of these genes (PFKFB3, DLD, ALDOC, LDHB, LDHC) have significant differences in gene expression between AD and control groups by AlzData, while three of the genes (HK3, ALDOC, PKLR) are related to the pathogenesis of AD. GAPDHS is involved in the regulatory network of AD risk genes. Conclusion: We identified 8 AD-related glycolysis genes (GAPDHS, PFKFB3, LDHC, HK3, ALDOC, LDHB, PKLR, DLD) as promising candidate biomarkers for early diagnosis of AD by integrated bioinformatics analysis. Machine learning has the advantage in identifying genes.
AlzBase: an Integrative Database for Gene Dysregulation in Alzheimer's Disease

Zhouxian Bai,Guangchun Han,Bin Xie,Jiajia Wang,Fuhai Song,Xing Peng,Hongxing Lei

DOI: https://doi.org/10.1007/s12035-014-9011-3

Abstract:Alzheimer's disease (AD) affects a significant portion of elderly people worldwide. Although the amyloid-β (Aβ) cascade hypothesis has been the prevailing theory for the molecular mechanism of AD in the past few decades, treatment strategies targeting the Aβ cascade have not demonstrated effectiveness as yet. Thus, elucidating the spatial and temporal evolution of the molecular pathways in AD remains a daunting task. To facilitate novel discoveries in this field, we have integrated information from multiple sources for the better understanding of gene functions in AD pathogenesis. Several categories of information have been collected, including (1) gene dysregulation in AD and closely related processes/diseases such as aging and neurological disorders, (2) correlation of gene dysregulation with AD severity, (3) a wealth of annotations on the functional and regulatory information, and (4) network connections for gene-gene relationship. In addition, we have also provided a comprehensive summary for the top ranked genes in AlzBase. By evaluating the information curated in AlzBase, researchers can prioritize genes from their own research and generate novel hypotheses regarding the molecular mechanism of AD. To demonstrate the utility of AlzBase, we examined the genes from the genetic studies of AD. It revealed links between the upstream genetic variations and downstream endo-phenotype and suggested several genes with higher priority. This integrative database is freely available on the web at http://alz.big.ac.cn/alzBase .
Additional File 8 of the Fusiform Gyrus Exhibits an Epigenetic Signature for Alzheimer’s Disease

Dingailu Ma,Irfete S. Fetahu,Mei Wang,Rui Fang,Jiahui Li,Huan Liu,Tobin Gramyk,Isabella Iwanicki,Sophie Gu,Winnie Xu,Li Tan,Feizhen Wu,Yujiang G. Shi

DOI: https://doi.org/10.6084/m9.figshare.12885739

2020-01-01

Abstract:Additional file 8: Table S7. ST7_validated_genes_function_annotation.xlsx.
Additional File 3 of the Fusiform Gyrus Exhibits an Epigenetic Signature for Alzheimer’s Disease

Dingailu Ma,Irfete S. Fetahu,Mei Wang,Rui Fang,Jiahui Li,Huan Liu,Tobin Gramyk,Isabella Iwanicki,Sophie Gu,Winnie Xu,Li Tan,Feizhen Wu,Yujiang G. Shi

DOI: https://doi.org/10.6084/m9.figshare.12885724

2020-01-01

Abstract:Additional file 3: Table S2. ST2_Top_DEG_in_heatmap_AD-risk-factor_TF.xlsx.
