Multimodal Federated Learning with Missing Modality via Prototype Mask and Contrast

Guangyin Bao, Qi Zhang, Duoqian Miao, Zixuan Gong, Liang Hu, Ke Liu, Yang Liu, Chongyang Shi

2024-02-04

Abstract: In real-world scenarios, multimodal federated learning often faces the practical challenge of intricate modality missing, which poses constraints on building federated frameworks and significantly degrades model inference accuracy. Existing solutions for addressing missing modalities generally involve developing modality-specific encoders on clients and training modality fusion modules on servers. However, these methods are primarily constrained to specific scenarios with either unimodal clients or complete multimodal clients, struggling to generalize effectively in the intricate modality missing scenarios. In this paper, we introduce a prototype library into the FedAvg-based Federated Learning framework, thereby empowering the framework with the capability to alleviate the global model performance degradation resulting from modality missing during both training and testing. The proposed method utilizes prototypes as masks representing missing modalities to formulate a task-calibrated training loss and a model-agnostic uni-modality inference strategy. In addition, a proximal term based on prototypes is constructed to enhance local training. Experimental results demonstrate the state-of-the-art performance of our approach. Compared to the baselines, our method improved inference accuracy by 3.7% with 50% modality missing during training and by 23.8% during uni-modality inference. Code is available at https://github.com/BaoGuangYin/PmcmFL.

Machine Learning; Artificial Intelligence; Distributed, Parallel, and Cluster Computing

What problem does this paper attempt to address?

The paper addresses missing modalities in multimodal federated learning (mFL). Existing mFL methods are largely limited to settings where every client is either unimodal or fully multimodal, and they suffer from severe task drift between clients and the server, so they generalize poorly when modalities are missing in more complex patterns. The paper therefore proposes the Prototype Mask and Contrast framework (PmcmFL), which tackles the problem in three ways:

1. **Handling Complex Modality Missing**: A prototype library compensates for the missing inputs to cross-modal fusion and corrects task drift, alleviating the performance degradation that modality missing causes during training and inference.
2. **Avoiding Task Drift**: Prototypes serve as global prior knowledge that stands in for absent modalities during fusion and calibrates task drift.
3. **Improving Inference Accuracy**: During inference, prototypes act as masks for missing modalities, and the semantically closest prototype is found through different matching algorithms.

Together, these let PmcmFL handle complex modality missing during both training and inference. The abstract reports accuracy gains over the baselines of 3.7% with 50% modality missing during training and 23.8% during uni-modality inference, and the experiments show PmcmFL outperforming existing baselines across different modality missing rates.
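The prototype-as-mask idea summarized above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that a prototype is the per-class mean embedding of one modality, that cosine similarity is the matching rule (the paper mentions several matching algorithms without naming them here), and that fusion is plain concatenation. All function names and the toy data are hypothetical.

```python
import math
from collections import defaultdict


def class_prototypes(embeddings, labels):
    """Assumed definition: a prototype is the mean embedding of a class."""
    sums, counts = {}, defaultdict(int)
    for vec, label in zip(embeddings, labels):
        if label not in sums:
            sums[label] = [0.0] * len(vec)
        sums[label] = [s + v for s, v in zip(sums[label], vec)]
        counts[label] += 1
    return {c: [s / counts[c] for s in sums[c]] for c in sums}


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def mask_for_training(label, missing_protos):
    """Training: the label is known, so use that class's prototype
    of the missing modality as the mask."""
    return missing_protos[label]


def mask_for_inference(present_vec, present_protos, missing_protos):
    """Inference: no label is available, so match the present embedding
    to its nearest prototype (cosine similarity is an assumption) and
    borrow the same class's prototype for the missing modality."""
    best = max(present_protos,
               key=lambda c: cosine(present_vec, present_protos[c]))
    return best, missing_protos[best]


def fuse(image_vec, text_vec):
    """Placeholder fusion: concatenate the two modality embeddings."""
    return image_vec + text_vec


# Toy embeddings (hypothetical data, two classes, two modalities).
image_embs = [[1.0, 0.0], [0.8, 0.2], [0.0, 1.0], [0.2, 0.8]]
text_embs = [[2.0, 0.0], [2.0, 0.0], [0.0, 2.0], [0.0, 2.0]]
labels = ["cat", "cat", "dog", "dog"]

image_protos = class_prototypes(image_embs, labels)
text_protos = class_prototypes(text_embs, labels)

# An image-only test sample: the text modality is missing.
query_image = [0.9, 0.3]
matched_class, text_mask = mask_for_inference(
    query_image, image_protos, text_protos)
fused = fuse(query_image, text_mask)
```

In this toy run the image-only query matches the "cat" image prototype, so the "cat" text prototype fills in for the missing text embedding before fusion. In a federated setting the prototype library would be aggregated across clients. That step is omitted here.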

