Exploring Imaging Genetic Markers of Alzheimer's Disease Based on a Novel Nonlinear Correlation Analysis Algorithm

Renbo Yang, Wei Kong, Kun Liu, Gen Wen, Yaling Yu
DOI: https://doi.org/10.1007/s12031-024-02190-x
2024-04-03
Abstract: Alzheimer's disease (AD) is an irreversible neurological disorder characterized by insidious onset. Identifying potential markers of its emergence and progression is crucial for early diagnosis and treatment. Imaging genetics typically combines genetic variables with multiple imaging parameters, employing various association analysis algorithms to investigate the links between pathological phenotypes and genetic variations and to extract molecular-level insights from brain images. However, most existing imaging genetics algorithms based on sparse learning assume a linear relationship between genetic factors and brain functions, which limits their ability to detect complex nonlinear correlation patterns and reduces their accuracy. To address these issues, we propose a novel nonlinear imaging genetic association analysis method, Deep Self-Reconstruction-based Adaptive Sparse Multi-view Deep Generalized Canonical Correlation Analysis (DSR-AdaSMDGCCA). This approach jointly learns the nonlinear relationships between pathological phenotypes and genetic variations by integrating three types of data: structural magnetic resonance imaging (sMRI), single-nucleotide polymorphism (SNP), and gene expression data. By incorporating nonlinear transformations in DGCCA, our model uncovers nonlinear associations across multiple data types. Additionally, the DSR algorithm clusters samples with identical labels, incorporating label information into the nonlinear feature extraction process and thus enhancing the performance of association analysis. Applied to real data sets, the DSR-AdaSMDGCCA algorithm identified several AD risk regions (such as the hippocampus, parahippocampus, and fusiform gyrus) and risk genes (including VSIG4, NEDD4L, and PINK1), achieving the highest classification accuracy with the fewest selected features among the compared baseline algorithms. Molecular biology enrichment analysis revealed that the pathways enriched by these top genes are closely linked to AD progression, indicating that our algorithm not only improves correlation analysis performance but also identifies biologically significant markers.