scMoE: Single-Cell Multi-Modal Multi-Task Learning via Sparse Mixture-of-Experts
Sukwon Yun, Jie Peng, Namkyeong Lee, Yanyong Zhang, Chanyoung Park, Zunpeng Liu, Tianlong Chen
DOI: https://doi.org/10.1101/2024.11.12.623336
2024-01-01
Abstract: Recent advances in measuring high-dimensional modalities, including protein levels and DNA accessibility, at the single-cell level have prompted the need for frameworks capable of handling multi-modal data while simultaneously addressing multiple tasks. Despite these advancements, much of the work in the single-cell domain remains limited, often focusing on either a single-modal or a single-task perspective. A few recent studies have ventured into multi-modal, multi-task learning, but we identify two issues in them: (1) an Optimization Conflict, in which integrating additional modalities leads to suboptimal results, and (2) a Costly Interpretability challenge, as current approaches predominantly rely on expensive post-hoc methods such as SHAP. Motivated by these challenges, we introduce scMoE, a novel framework that, for the first time, applies Sparse Mixture-of-Experts (SMoE) within the single-cell domain. This is achieved by incorporating an SMoE layer into a transformer block with a cross-attention module. Thanks to this design, scMoE inherently possesses mechanistic interpretability, a critical aspect for understanding underlying mechanisms when handling biological data. Furthermore, from a post-hoc perspective, we enhance interpretability by extending Concept Activation Vectors (CAVs). Extensive experiments on simulated datasets, such as Dyngen, and real-world multi-modal single-cell datasets, including DBiT-seq, Patch-seq, and ATAC-seq, demonstrate the effectiveness of scMoE. Source code of scMoE is available at: https://github.com/UNITES-Lab/scMoE.
Competing Interest Statement: The authors have declared no competing interest.
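Illustration: The abstract describes inserting a Sparse Mixture-of-Experts layer into a transformer block but gives no further detail. The sketch below shows only the generic top-k routing idea behind such a layer, in NumPy. It is not the scMoE implementation: the function name `sparse_moe_forward`, the single-matrix `tanh` experts, the tensor shapes, and `top_k=2` are all assumptions made for illustration, and the cross-attention module is omitted entirely.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def sparse_moe_forward(tokens, gate_w, expert_ws, top_k=2):
    """Generic top-k sparse MoE forward pass (hypothetical helper).

    tokens:    (n, d) input embeddings, e.g. one row per cell
    gate_w:    (d, E) router weights
    expert_ws: (E, d, d) one weight matrix per toy expert
    Returns the mixed output (n, d) and the routing matrix (n, E).
    """
    logits = tokens @ gate_w                              # (n, E) router scores
    top_idx = np.argsort(-logits, axis=1)[:, :top_k]      # (n, k) chosen experts
    top_logits = np.take_along_axis(logits, top_idx, axis=1)
    weights = softmax(top_logits, axis=1)                 # renormalise over chosen experts

    out = np.zeros_like(tokens)
    n_experts = gate_w.shape[1]
    for slot in range(top_k):
        for e in range(n_experts):
            mask = top_idx[:, slot] == e                  # rows routed to expert e in this slot
            if mask.any():
                expert_out = np.tanh(tokens[mask] @ expert_ws[e])
                out[mask] += weights[mask, slot][:, None] * expert_out

    # Dense view of the sparse routing: zero for experts that were not selected.
    routing = np.zeros_like(logits)
    np.put_along_axis(routing, top_idx, weights, axis=1)
    return out, routing


rng = np.random.default_rng(0)
n_cells, d_model, n_experts = 6, 8, 4                     # toy sizes, assumed
tokens = rng.normal(size=(n_cells, d_model))
gate_w = rng.normal(size=(d_model, n_experts))
expert_ws = rng.normal(size=(n_experts, d_model, d_model)) * 0.1

out, routing = sparse_moe_forward(tokens, gate_w, expert_ws, top_k=2)
print(out.shape)
print(routing.round(2))
```

Each row of `routing` has non-zero weight for only `top_k` experts, so only those experts run for that input. A per-input routing matrix of this kind can be read directly, without a post-hoc attribution pass, which is one plausible reading of the inherent interpretability the abstract attributes to the SMoE design.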