A novel classification framework for genome-wide association study of whole brain MRI images using deep learning

Shaojun Yu, Junjie Wu, Yumeng Shao, Deqiang Qiu, Zhaohui S. Qin, for the Alzheimer's Disease Neuroimaging Initiative
DOI: https://doi.org/10.1371/journal.pcbi.1012527
2024-10-16
PLoS Computational Biology
Abstract: Genome-wide association studies (GWASs) have been widely applied in the neuroimaging field to discover genetic variants associated with brain-related traits. So far, almost all GWASs conducted in neuroimaging genetics have been performed on univariate quantitative features summarized from brain images. Meanwhile, powerful deep learning technologies have dramatically improved our ability to classify images. In this study, we proposed and implemented a novel machine learning strategy for systematically identifying genetic variants that lead to detectable nuances in magnetic resonance imaging (MRI) scans. For a specific single nucleotide polymorphism (SNP), if MRI images labeled by the genotypes of this SNP can be reliably distinguished using machine learning, we hypothesize that the SNP is likely to be associated with brain anatomy or function as manifested in MRI brain images. We applied this strategy to a catalog of MRI image and genotype data collected by the Alzheimer's Disease Neuroimaging Initiative (ADNI) consortium. From the results, we identified novel variants that show strong association with brain phenotypes.

A genome-wide association study (GWAS) is a powerful method for identifying associations between genetic variants and traits such as height, weight, and disease status. When applied to magnetic resonance imaging (MRI) data, traditional GWAS methods often rely on simplified summaries of brain imaging data, potentially missing subtle but significant global patterns. We proposed and implemented a different strategy: training a machine learning model to distinguish MR images based on genetic variants. If MRI images labeled by the mutation status of a variant can be reliably distinguished using machine learning, we hypothesize that this variant is likely to be associated with brain anatomy or function as manifested in MRI brain images. By applying this method to data collected from the Alzheimer's Disease Neuroimaging Initiative (ADNI), we found new genetic variants highly likely to affect brain phenotypes. This innovative approach not only handles high-dimensional imaging data more effectively but also captures complex, non-linear relationships between genetic variants and various brain traits, offering a fresh perspective on neuroimaging genetics.
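The core idea, that a variant is flagged when images labeled by its genotype can be told apart better than chance, can be illustrated with a toy sketch. This is not the authors' pipeline. It assumes synthetic feature vectors in place of MRI volumes, a binary carrier/non-carrier label in place of full genotypes, a nearest-centroid classifier standing in for the deep network, and a label-permutation test as one possible way to judge "reliably distinguished". All function names and parameters (`simulate_cohort`, `cv_accuracy`, `permutation_p_value`, `effect`) are hypothetical.

```python
import random


def simulate_cohort(n_subjects=120, n_features=10, effect=0.0, seed=1):
    """Synthetic stand-in for imaging data: one feature vector per subject.

    `effect` shifts the first three features for carriers (label 1);
    effect=0.0 simulates a variant with no imaging signature.
    """
    rng = random.Random(seed)
    labels = [rng.randint(0, 1) for _ in range(n_subjects)]
    features = [
        [rng.gauss(effect * g if j < 3 else 0.0, 1.0) for j in range(n_features)]
        for g in labels
    ]
    return features, labels


def cv_accuracy(features, labels, n_folds=5, seed=0):
    """Cross-validated accuracy of a nearest-centroid classifier
    (assumption: a simple placeholder for a deep image classifier)."""
    rng = random.Random(seed)
    indices = list(range(len(features)))
    rng.shuffle(indices)
    folds = [indices[k::n_folds] for k in range(n_folds)]
    correct = 0
    for held_out in folds:
        held = set(held_out)
        train = [i for i in indices if i not in held]
        # Fit: one mean vector per class, using training subjects only.
        centroids = {}
        for cls in set(labels[i] for i in train):
            rows = [features[i] for i in train if labels[i] == cls]
            centroids[cls] = [sum(col) / len(col) for col in zip(*rows)]
        # Predict: assign each held-out subject to the closest centroid.
        for i in held_out:
            pred = min(
                centroids,
                key=lambda c: sum(
                    (a - b) ** 2 for a, b in zip(features[i], centroids[c])
                ),
            )
            correct += pred == labels[i]
    return correct / len(features)


def permutation_p_value(features, labels, n_permutations=200, seed=0):
    """Compare observed accuracy with accuracies under shuffled labels."""
    rng = random.Random(seed)
    observed = cv_accuracy(features, labels)
    shuffled = list(labels)
    at_least_as_good = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        if cv_accuracy(features, shuffled) >= observed:
            at_least_as_good += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return observed, (at_least_as_good + 1) / (n_permutations + 1)


if __name__ == "__main__":
    for name, effect in [("variant with imaging effect", 1.5), ("null variant", 0.0)]:
        X, y = simulate_cohort(effect=effect)
        acc, p = permutation_p_value(X, y)
        print(f"{name}: accuracy={acc:.2f}, permutation p={p:.3f}")
```

In a genome-wide setting this test would be repeated for every candidate SNP, so any significance threshold would need correction for multiple testing. How the paper handles that step is not stated in the abstract.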
biochemical research methods, mathematical & computational biology