Client-Adaptive Cross-Modal Reconstruction Network for Modality-Incomplete Multimodal Federated Learning

Baochen Xiong, Changsheng Xu, Yaowei Wang, Xiaoshan Yang, Y. Song
DOI: https://doi.org/10.1145/3581783.3611757
2023-10-26
Abstract: Multimodal federated learning (MFL) is an emerging field that allows many distributed clients, each with multimodal data, to collaboratively train models for multimodal tasks without sharing local data. However, existing methods assume that all modalities of each sample are complete, which limits their practicality. In this paper, we propose a Client-Adaptive Cross-Modal Reconstruction Network (CACMRN) to address modality-incomplete multimodal federated learning (MI-MFL). Compared with existing centralized methods for reconstructing missing modalities, local clients in federated learning typically hold far less data, which makes it challenging to train a reliable reconstruction model that can accurately predict the missing data. We propose a cross-modal reconstruction transformer that prevents the model from overfitting on the local client by exploring instance-instance relationships within the local client and by using normalized self-attention to perform data-dependent partial updating. Using federated optimization with alternating local updating and global aggregation, our method can not only collaboratively use the distributed data on different local clients to learn the cross-modal reconstruction transformer, but also prevent the reconstruction model from overfitting the data on any single local client. Extensive experimental results on three datasets demonstrate the effectiveness of our method.
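The phrase "alternating local updating and global aggregation" can be illustrated with a minimal sketch of generic federated averaging. This is not the CACMRN method: the scalar model y = w * x, the toy client data, and the names `local_update`, `aggregate`, and `federated_train` are all assumptions made for illustration only.

```python
from typing import List, Tuple

Sample = Tuple[float, float]  # hypothetical (x, y) pair held privately by a client


def local_update(w: float, data: List[Sample], lr: float = 0.05, steps: int = 5) -> float:
    """Run a few gradient-descent steps on one client's data for the toy model y = w * x."""
    for _ in range(steps):
        # Gradient of the mean squared error with respect to w.
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w


def aggregate(weights: List[float], sizes: List[int]) -> float:
    """Average client weights, weighting each client by its number of samples."""
    total = sum(sizes)
    return sum(w * n for w, n in zip(weights, sizes)) / total


def federated_train(clients: List[List[Sample]], rounds: int = 50) -> float:
    """Alternate local updating and global aggregation; raw data never leaves a client."""
    w = 0.0  # global model, broadcast to every client at the start of each round
    for _ in range(rounds):
        local_weights = [local_update(w, data) for data in clients]
        w = aggregate(local_weights, [len(data) for data in clients])
    return w


# Toy setup (assumed): three clients, each with a few samples drawn from y = 2x.
clients = [
    [(1.0, 2.0), (2.0, 4.0)],
    [(3.0, 6.0)],
    [(0.5, 1.0), (1.5, 3.0), (2.5, 5.0)],
]
w_global = federated_train(clients)
```

Only model weights are exchanged in this sketch, and the aggregation step pulls each client's locally fitted weight back toward a shared value, which is the general mechanism by which federated optimization can limit overfitting to any one client's small dataset.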
Computer Science