Multimodal weakly supervised learning to identify disease-specific changes in single-cell atlases

Anastasia Litinetskaya, Maiia Shulman, Soroor Hediyeh-zadeh, Amir Ali Moinfar, Fabiola Curion, Artur Szalata, Alireza Omidi, Mohammad Lotfollahi, Fabian J. Theis
DOI: https://doi.org/10.1101/2024.07.29.605625
2024-01-01
Abstract: Multimodal analysis of single-cell samples from healthy and diseased tissues at various stages provides a comprehensive view that identifies disease-specific cells and their molecular features and aids in patient stratification. Here, we present MultiMIL, a novel weakly supervised multimodal model designed to construct multimodal single-cell references and prioritize phenotype-specific cells via patient classification. MultiMIL effectively integrates single-cell modalities, even when they only partially overlap, providing robust representations for downstream analyses such as phenotypic prediction and cell prioritization. Using a multiple-instance learning approach, MultiMIL aggregates cell-level measurements into sample-level representations and identifies disease-specific cell states through attention-based scoring. We demonstrate that MultiMIL accurately identifies disease-specific cell states in blood and lung samples, uncovering novel disease-associated genes and achieving superior patient classification accuracy compared to existing methods. We anticipate MultiMIL will become an essential tool for querying single-cell multiomic atlases, enhancing our understanding of disease mechanisms and informing targeted treatments.

### Competing Interest Statement

M.L. consults for Santa Anna Bio, owns interests in Relation Therapeutics, and is a scientific co-founder and part-time employee at AIVIVO. F.J.T. consults for Immunai Inc., Singularity Bio B.V., CytoReason Ltd, Cellarity, Curie Bio Operations, LLC and has an ownership interest in Dermagnostix GmbH and Cellarity.
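The multiple-instance learning step mentioned in the abstract (pooling cell-level measurements into one sample-level representation, with attention weights acting as per-cell scores) can be illustrated with a generic attention-pooling sketch. This is not MultiMIL's implementation: the function name `attention_pool`, the parameters `V` and `w`, the dimensions, and the random data are all assumptions made for illustration, and in a trained model the parameters would be learned from sample labels rather than drawn at random.

```python
import numpy as np

def attention_pool(cell_embeddings, V, w):
    """Aggregate one sample's cell embeddings into a single vector.

    cell_embeddings: (n_cells, d) array, one row per cell.
    V: (d, h) projection matrix and w: (h,) scoring vector. Both are
    hypothetical parameters, random here but learned in a real model.
    """
    # One unnormalized score per cell.
    scores = np.tanh(cell_embeddings @ V) @ w
    # Softmax over the cells of the sample (shifted for numerical stability).
    scores = scores - scores.max()
    attention = np.exp(scores) / np.exp(scores).sum()
    # Attention-weighted average of the cell embeddings.
    sample_embedding = attention @ cell_embeddings
    return sample_embedding, attention

# Toy data standing in for one patient sample (assumed sizes).
rng = np.random.default_rng(0)
n_cells, d, h = 200, 16, 8
cells = rng.normal(size=(n_cells, d))
V = rng.normal(size=(d, h))
w = rng.normal(size=h)

sample_vec, attn = attention_pool(cells, V, w)

# Cells with the highest attention weights are the ones such a model
# would prioritize for the sample-level prediction.
top_cells = np.argsort(attn)[::-1][:5]
```

In a setup like this, `sample_vec` would feed a patient-level classifier, and the attention weights in `attn` give a ranking of cells by their contribution to that prediction.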