Sparse parallel independent component analysis and its application to identify linked genomic and gray matter alterations underlying working memory impairment in attention-deficit/hyperactivity disorder

Kuaikuai Duan, Jiayu Chen, Vince D. Calhoun, Wenhao Jiang, Kelly Rootes-Murdy, Gido Schoenmacker, Rogers F. Silva, Barbara Franke, Jan K. Buitelaar, Martine Hoogman, Jaap Oosterlaan, Pieter J Hoekstra, Dirk Heslenfeld, Catharina A Hartman, Emma Sprooten, Alejandro Arias-Vasquez, Jessica A. Turner, Jingyu Liu
DOI: https://doi.org/10.1101/2020.07.11.198622
2020-07-12
Abstract: Most psychiatric disorders are highly heritable and associated with altered brain structural and functional patterns. Data fusion analyses of brain imaging and genetics, one of which is parallel independent component analysis (pICA), make it possible to link genomic factors to brain patterns. Because common genetic variants have small to modest effect sizes in psychiatric disorders, it is usually challenging to reliably separate disorder-related genetic factors from the rest of the genome with clinical samples of typical size. To alleviate this problem, we propose sparse parallel independent component analysis (spICA), which leverages the sparsity of individual genomic sources. The sparsity is enforced by performing Hoyer projection on the estimated independent sources. Simulation results demonstrate that the proposed spICA yields improved detection of independent sources and imaging-genomic associations compared to pICA. We applied spICA to gray matter volume (GMV) and single nucleotide polymorphism (SNP) data of 341 unrelated adults, including 127 controls, 167 attention-deficit/hyperactivity disorder (ADHD) cases, and 47 unaffected siblings. We identified one SNP source significantly and positively associated with a GMV source in superior/middle frontal regions. This association was replicated with a smaller effect size in 317 adolescents from ADHD families, including 188 individuals with ADHD and 129 unaffected siblings. The association was found to be more significant in ADHD families than in controls, and stronger in adults and older adolescents than in younger ones. The identified GMV source in superior/middle frontal regions was not correlated with head motion parameters, and its loadings (expression levels) were reduced in adolescent (but not adult) individuals with ADHD. This GMV source was associated with working memory deficits in both adult and adolescent individuals with ADHD.
The identified SNP component highlights SNPs in genes encoding long non-coding RNAs and SNPs in genes MEF2C, CADM2, and CADPS2, which have known functions relevant for modulating neuronal substrates underlying high-level cognition in ADHD.
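The abstract states that sparsity is enforced through Hoyer projection but does not give the procedure. As a minimal sketch, and only as an illustration, the snippet below computes the standard Hoyer sparseness measure of a vector, which is the quantity such a projection constrains. The function name `hoyer_sparseness` and the example vectors are assumptions made here, not taken from the paper, and the projection step itself is not shown.

```python
import numpy as np


def hoyer_sparseness(x):
    """Hoyer sparseness of a vector: 1 for a single nonzero entry,
    0 when all entries have equal magnitude.

    Hypothetical helper for illustration; not the authors' code.
    """
    x = np.asarray(x, dtype=float).ravel()
    n = x.size
    l2 = np.linalg.norm(x)
    if n < 2 or l2 == 0.0:
        raise ValueError("need at least two entries and a nonzero vector")
    l1 = np.abs(x).sum()
    # Ratio of L1 to L2 norm, rescaled to the interval [0, 1].
    return (np.sqrt(n) - l1 / l2) / (np.sqrt(n) - 1.0)


# Made-up example vectors standing in for source weights.
one_hot = np.array([0.0, 0.0, 3.0, 0.0])  # maximally sparse
uniform = np.ones(4)                      # maximally dense

print(hoyer_sparseness(one_hot))
print(hoyer_sparseness(uniform))
```

A sparse genomic source, in which only a few SNPs carry large weights, would score close to 1 on this measure, while a source with weights spread evenly across SNPs would score close to 0.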