ContIG: Self-supervised Multimodal Contrastive Learning for Medical Imaging with Genetics

Aiham Taleb, Matthias Kirchler, Remo Monti, Christoph Lippert
DOI: https://doi.org/10.48550/arXiv.2111.13424
2021-11-26
Abstract: High annotation costs are a substantial bottleneck in applying modern deep learning architectures to clinically relevant medical use cases, substantiating the need for novel algorithms to learn from unlabeled data. In this work, we propose ContIG, a self-supervised method that can learn from large datasets of unlabeled medical images and genetic data. Our approach aligns images and several genetic modalities in the feature space using a contrastive loss. We design our method to integrate multiple modalities of each individual person in the same model end-to-end, even when the available modalities vary across individuals. Our procedure outperforms state-of-the-art self-supervised methods on all evaluated downstream benchmark tasks. We also adapt gradient-based explainability algorithms to better understand the learned cross-modal associations between the images and genetic modalities. Finally, we perform genome-wide association studies on the features learned by our models, uncovering interesting relationships between images and genetic data.
Computer Vision and Pattern Recognition, Artificial Intelligence, Machine Learning
What problem does this paper attempt to address?
This paper addresses the high cost of annotating medical data, which limits the use of modern deep learning architectures in clinically relevant applications. The authors propose ContIG, a self-supervised method that learns from large datasets of unlabeled medical images and genetic data. The method aligns images and several genetic modalities in a shared feature space using a contrastive loss, and it integrates the modalities of each individual in a single model trained end-to-end, even when the set of available modalities differs from one individual to the next. The authors also adapt gradient-based explainability algorithms to examine the cross-modal associations the model learns, and they run genome-wide association studies on the learned features to uncover relationships between images and genetic data. In short, the core problem is to develop an effective self-supervised learning method that reduces dependence on expensive manual labels while improving the joint analysis of medical images and genetic data. Reducing that dependence could lower the cost of medical applications and support a deeper understanding of disease mechanisms.
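The alignment idea described above can be illustrated with a minimal sketch. It assumes a generic symmetric InfoNCE-style contrastive loss and a simple rule of skipping individuals whose genetic modality is missing; the text here does not specify ContIG's exact loss or aggregation scheme, so all function names, the `temperature` value, and the averaging over modalities are hypothetical choices made for illustration only.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def info_nce(anchors, candidates, temperature=0.1):
    """Mean cross-entropy of matching anchors[i] to candidates[i].

    All other candidates in the batch act as negatives.
    """
    a = [l2_normalize(v) for v in anchors]
    c = [l2_normalize(v) for v in candidates]
    total = 0.0
    for i, ai in enumerate(a):
        # Cosine similarity between anchor i and every candidate, scaled.
        logits = [sum(x * y for x, y in zip(ai, cj)) / temperature for cj in c]
        # Numerically stable log-sum-exp for the softmax denominator.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_denom - logits[i]
    return total / len(a)


def symmetric_contrastive_loss(image_emb, genetic_emb, temperature=0.1):
    """Average of the image-to-genetics and genetics-to-image losses."""
    return 0.5 * (
        info_nce(image_emb, genetic_emb, temperature)
        + info_nce(genetic_emb, image_emb, temperature)
    )


def multimodal_loss(image_emb, genetic_embs, temperature=0.1):
    """Average the pairwise loss over genetic modalities.

    genetic_embs maps a modality name to a list with one embedding per
    individual, or None where that individual lacks the modality.
    Assumption: individuals with a missing modality are simply skipped.
    """
    losses = []
    for embs in genetic_embs.values():
        present = [i for i, e in enumerate(embs) if e is not None]
        if len(present) < 2:
            continue  # a contrastive loss needs at least one negative
        losses.append(
            symmetric_contrastive_loss(
                [image_emb[i] for i in present],
                [embs[i] for i in present],
                temperature,
            )
        )
    return sum(losses) / len(losses) if losses else 0.0
```

In this sketch, embeddings that match within each individual yield a loss near zero, while mismatched pairs are penalized heavily. In practice the embeddings would come from trainable image and genetics encoders, and the loss would be computed in an automatic differentiation framework rather than in plain Python.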