A Multimodal Foundation Model for Discovering Genetic Associations with Brain Imaging Phenotypes

Diego Machado Reyes, Myson C Burch, Laxmi Parida, Aritra Bose
DOI: https://doi.org/10.1101/2024.11.02.24316653
2024-11-04
Abstract: Due to the intricate etiology of neurological disorders, finding interpretable associations between multi-omics features can be challenging using standard approaches. We propose COMICAL, a contrastive learning approach that leverages multi-omics data to identify associations between genetic markers and brain imaging-derived phenotypes. COMICAL jointly learns omic representations using transformer-based encoders with custom tokenizers. Our modality-agnostic approach uniquely identifies many-to-many associations via self-supervised learning schemes and cross-modal attention encoders. COMICAL discovered several significant associations between genetic markers and imaging-derived phenotypes for a variety of neurological disorders in the UK Biobank, and its learned representations supported prediction across diseases and of unseen clinical outcomes. Source code for COMICAL, along with pre-trained weights that enable transfer learning, is available at https://github.com/IBM/comical
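To illustrate what a contrastive objective over two modalities looks like, here is a minimal sketch of a generic symmetric (CLIP-style) InfoNCE loss in NumPy. It is not taken from the COMICAL repository: the abstract does not state the exact loss, and the function names, the `temperature` value, and the random toy embeddings are all assumptions made for illustration. In practice the two inputs would be the outputs of the genetic and imaging encoders for matched pairs.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products become cosine similarities."""
    return x / np.maximum(np.linalg.norm(x, axis=1, keepdims=True), eps)


def log_softmax(logits):
    """Row-wise log-softmax, shifted by the row max for numerical stability."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))


def symmetric_contrastive_loss(genetic_emb, imaging_emb, temperature=0.07):
    """Generic symmetric InfoNCE loss (hypothetical helper, not the authors' code).

    Row i of genetic_emb and row i of imaging_emb are treated as a matched
    pair; every other row in the batch serves as a negative.
    """
    g = l2_normalize(genetic_emb)
    m = l2_normalize(imaging_emb)
    logits = g @ m.T / temperature           # (batch, batch) similarity matrix
    idx = np.arange(len(g))
    loss_g2m = -log_softmax(logits)[idx, idx].mean()     # genetic -> imaging
    loss_m2g = -log_softmax(logits.T)[idx, idx].mean()   # imaging -> genetic
    return 0.5 * (loss_g2m + loss_m2g)


# Toy data (assumed): matched pairs are near-copies, mismatched pairs are reversed.
rng = np.random.default_rng(0)
genetic = rng.normal(size=(8, 16))
aligned = genetic + 0.01 * rng.normal(size=(8, 16))
shuffled = aligned[::-1]  # with 8 rows, no row stays paired with its match

loss_aligned = symmetric_contrastive_loss(genetic, aligned)
loss_shuffled = symmetric_contrastive_loss(genetic, shuffled)
print(loss_aligned < loss_shuffled)
```

The loss is low when matched rows are the most similar entries in both directions and high otherwise, which is the property a contrastive encoder pair is trained to achieve.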
Genetic and Genomic Medicine
What problem does this paper attempt to address?