S-BEAM: A Semi-Supervised Ensemble Approach to Rank Potential Causal Variants and Their Target Genes in Microglia for Alzheimer’s Disease

Archita Khaire,Jia Wen,Xiaoyu Yang,Haibo Zhou,Yin Shen,Yun Li
DOI: https://doi.org/10.1101/2022.11.01.514771
2023-01-01
Abstract:Alzheimer’s disease (AD) is the leading cause of death among individuals over 65. Despite many AD genetic variants detected by large genome-wide association studies (GWAS), a limited number of causal genes have been confirmed. Conventional machine learning techniques integrate functional annotation data and GWAS signals to assign variants functional relevance probabilities. Yet, a large proportion of genetic variation lies in the non-coding genome, where unsupervised and semi-supervised techniques have demonstrated greater advantage. Furthermore, cell-type specific approaches are needed to better understand disease etiology. Studying AD from a microglia-specific lens is more likely to reveal causal variants involved in immune pathways. Therefore, in this study, we developed S-BEAM: a semi-supervised ensemble approach using microglia-specific data to prioritize non-coding variants and their target genes that play roles in immune-related AD mechanisms. We designed a transductive positive-unlabeled and negative-unlabeled learning model that employs a bagging technique to learn from unlabeled variants, generating multiple predicted probabilities of variant risk. Using a combined homogeneous-heterogeneous ensemble framework, we aggregated the predictions. We applied our model to AD variant data, identifying 11 risk variants acting in well-known AD genes, such as TSPAN14 , INPP5D , and MS4A2 . These results validated our model’s performance and demonstrated a need to study these genes in the context of microglial pathways. We also proposed further experimental study for 37 potential causal variants associated with less-known genes. Our work has utility in predicting AD relevant genes and variants functioning in microglia and can be generalized for application to other complex diseases or cell types. ### Competing Interest Statement The authors have declared no competing interest.
What problem does this paper attempt to address?