Cross-Modal Prototype based Multimodal Federated Learning under Severely Missing Modality

Huy Q. Le, Chu Myaet Thwal, Yu Qiao, Ye Lin Tun, Minh N. H. Nguyen, Choong Seon Hong

2024-01-25

Abstract: Multimodal federated learning (MFL) has emerged as a decentralized machine learning paradigm that allows multiple clients with different modalities to collaboratively train a model across diverse data sources without sharing their private data. However, data heterogeneity and severely missing modalities hinder the robustness of MFL and significantly degrade the performance of the global model. When a client lacks a modality, the missing input is zero-filled, which introduces misalignment during the local training phase. Achieving robust generalization in the global model therefore becomes imperative, especially when dealing with clients that have incomplete data. In this paper, we propose Multimodal Federated Cross Prototype Learning (MFCPL), a novel approach for MFL under severely missing modalities that uses complete prototypes to provide diverse modality knowledge at the modality-shared level through cross-modal regularization and at the modality-specific level through a cross-modal contrastive mechanism. Additionally, our approach introduces cross-modal alignment to regularize modality-specific features, thereby enhancing overall performance, particularly in scenarios involving severely missing modalities. Through extensive experiments on three multimodal datasets, we demonstrate the effectiveness of MFCPL in mitigating these challenges and improving overall performance.
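The zero-filling issue mentioned in the abstract can be illustrated with a minimal NumPy sketch. The modality names, feature sizes, and the concatenation-based `fuse` function below are hypothetical stand-ins chosen for illustration; they are not taken from the paper.

```python
import numpy as np

def zero_fill(sample, expected_shapes):
    """Return a copy of `sample` in which every missing modality is replaced by zeros."""
    filled = {}
    for name, shape in expected_shapes.items():
        value = sample.get(name)
        if value is None:
            # Missing modality: substitute an all-zero tensor of the expected shape.
            filled[name] = np.zeros(shape, dtype=np.float32)
        else:
            filled[name] = np.asarray(value, dtype=np.float32)
    return filled

def fuse(sample):
    """Concatenate modality features in sorted-name order (a stand-in for a real fusion layer)."""
    return np.concatenate([sample[name] for name in sorted(sample)])

# Hypothetical feature sizes for two modalities.
expected_shapes = {"audio": (4,), "image": (3,)}

full = {"audio": np.ones(4), "image": np.full(3, 2.0)}
audio_only = {"audio": np.ones(4), "image": None}  # this client has no image sensor

fused_full = fuse(zero_fill(full, expected_shapes))
fused_missing = fuse(zero_fill(audio_only, expected_shapes))

# The same audio input yields a different fused vector once the image slot is zeroed.
gap = np.linalg.norm(fused_full - fused_missing)
print(fused_missing, gap)
```

The nonzero `gap` is one simple way to see why clients with and without a modality end up training on differently distributed fused features, which is the misalignment the abstract refers to.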

Machine Learning

What problem does this paper attempt to address?

This paper aims to solve two main challenges in multimodal federated learning (MFL):

1. **Data Heterogeneity**: In MFL, the data distributions of different clients may vary greatly, which leads to difficulties in model training. This heterogeneity will cause the performance of the global model to decline.
2. **Severely Missing Modalities**: In practical applications, some clients may not be able to provide data for all modalities (for example, due to sensor failures, different hardware platforms or operational problems). This situation of missing modalities will further affect the robustness and performance of the model.

To solve these problems, the authors propose a multimodal federated cross-prototype learning framework (MFCPL). MFCPL provides diverse modality knowledge by introducing complete prototypes and uses three components, namely cross-modal prototypes regularization (CMPR), cross-modal prototypes contrastive (CMPC) and cross-modal alignment (CMA), to enhance the robustness and generalization ability of the model.

Specifically, the main contributions of MFCPL include:

- **Introducing complete prototypes in heterogeneous MFL for the first time** to address the challenge of missing modalities.
- **Proposing a novel multimodal federated learning framework, MFCPL**, which learns the global model through two levels of complete prototypes (modality-specific representations and modality-shared representations).
- **Verifying the effectiveness of MFCPL through extensive experiments** and demonstrating its superiority on three multimodal federated datasets.

These methods work together to enable MFCPL to maintain high model performance and robustness even when dealing with severely missing modalities.
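To make the idea of prototypes more concrete, here is a minimal NumPy sketch of one common way to build class prototypes, aggregate them across clients, and use them in a contrastive loss. The summary does not give the paper's exact formulas, so everything here is an assumption: prototypes are taken to be per-class mean embeddings, aggregation is a count-weighted average, and the loss is a generic InfoNCE-style objective. The function names and the `temperature` value are hypothetical.

```python
import numpy as np

def class_prototypes(embeddings, labels, num_classes):
    """Per-class mean embedding, plus the number of samples seen for each class."""
    protos = np.zeros((num_classes, embeddings.shape[1]))
    counts = np.zeros(num_classes)
    for c in range(num_classes):
        mask = labels == c
        counts[c] = mask.sum()
        if counts[c] > 0:
            protos[c] = embeddings[mask].mean(axis=0)
    return protos, counts

def aggregate_prototypes(client_protos, client_counts):
    """Count-weighted average of client prototypes, class by class (assumed server rule)."""
    protos = np.stack(client_protos)   # (clients, classes, dim)
    counts = np.stack(client_counts)   # (clients, classes)
    weighted = (protos * counts[:, :, None]).sum(axis=0)
    return weighted / np.maximum(counts.sum(axis=0), 1)[:, None]

def prototype_contrastive_loss(embeddings, labels, prototypes, temperature=0.5):
    """InfoNCE-style loss: each embedding should be closest to its own class prototype."""
    z = embeddings / np.maximum(np.linalg.norm(embeddings, axis=1, keepdims=True), 1e-12)
    p = prototypes / np.maximum(np.linalg.norm(prototypes, axis=1, keepdims=True), 1e-12)
    logits = z @ p.T / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()

# Two toy clients with 2-D embeddings and two classes.
emb_a, lab_a = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]), np.array([0, 0, 1])
emb_b, lab_b = np.array([[0.1, 0.9], [1.0, 0.2]]), np.array([1, 0])

protos_a, counts_a = class_prototypes(emb_a, lab_a, 2)
protos_b, counts_b = class_prototypes(emb_b, lab_b, 2)
global_protos = aggregate_prototypes([protos_a, protos_b], [counts_a, counts_b])

loss_correct = prototype_contrastive_loss(emb_a, lab_a, global_protos)
loss_swapped = prototype_contrastive_loss(emb_a, 1 - lab_a, global_protos)
print(global_protos, loss_correct, loss_swapped)
```

In this toy setup the loss is lower when embeddings are paired with their true class prototypes than when the labels are swapped, which is the behavior a prototype-based contrastive term is meant to encourage. A client missing a modality could, under the same assumptions, still receive that modality's prototypes from the server.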
