Abstract: Selecting proper clients to participate in each federated learning (FL) round is critical to effectively harness a broad range of distributed data. Existing client selection methods consider only the mining of distributed uni-modal data, yet their effectiveness may diminish in multi-modal FL (MFL), as the modality imbalance problem not only impedes collaborative local training but also leads to a severe global modality-level bias. We empirically reveal that local training with a certain single modality may contribute more to the global model than training with all local modalities. To effectively exploit the distributed multiple modalities, we propose a novel Balanced Modality Selection framework for MFL (BMSFed) to overcome the modal bias. On the one hand, we introduce a modal enhancement loss during local training to alleviate local imbalance based on the aggregated global prototypes. On the other hand, we propose a modality selection scheme that selects subsets of local modalities with great diversity while achieving global modal balance. Our extensive experiments on audio-visual, colored-gray, and front-back datasets showcase the superiority of BMSFed over baselines and its effectiveness in multi-modal data exploitation.

What problem does this paper attempt to address?

This paper addresses modality imbalance in multi-modal federated learning (MFL), which impedes collaborative local training and causes a severe modality-level bias in the global model. Existing client selection methods are less effective when clients hold multi-modal data because they ignore the interaction between modalities during joint multi-modal training. The paper shows experimentally that uni-modal training on some clients may contribute more to the global model than multi-modal training. Based on this observation, the authors propose a novel Balanced Modality Selection framework for MFL (BMSFed), which aims to overcome modal bias and make full use of distributed multi-modal data.

### Main contributions:

1. **Empirical analysis**: The authors reveal the modality imbalance problem in multi-modal federated learning through empirical analysis and point out that uni-modal training on some clients may contribute more to the global model than multi-modal training.
2. **Balanced modality selection scheme**: Based on this analysis, a new balanced modality selection scheme (BMSFed) is proposed to overcome global modal bias by introducing a modal enhancement loss and representative modality selection.
3. **Experimental verification**: Comprehensive experiments on audio-visual, colored-gray, and front-back datasets verify the effectiveness and superiority of BMSFed.

### Method overview:

1. **Local imbalance mitigation**: Local training is adjusted with a modal enhancement loss (ME loss) based on global prototypes, which alleviates local modality imbalance.
2. **Balanced modality selection**: Multi-modal clients and uni-modal clients are selected separately by constructing two separate submodular functions, so that global modality balance is maintained.

### Experimental results:

- **Performance improvement**: BMSFed outperforms the baseline methods on multiple datasets in both IID and non-IID settings.
- **Modality balance**: BMSFed significantly improves the performance of weak modalities (such as vision) and reduces modality-level bias while maintaining the performance of strong modalities (such as audio).

In conclusion, this paper addresses the modality imbalance problem in multi-modal federated learning by proposing the BMSFed framework, which improves both the performance and the modality balance of the global model.
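The summary above says that selection is driven by submodular functions but does not give their form. As a rough illustration of how a diversity-seeking submodular selection can work, here is a minimal sketch that greedily maximizes a facility-location objective over per-client feature vectors. The facility-location objective, the cosine similarity, the function names, and the toy feature vectors are all assumptions made for this example; they are not taken from the paper and may differ from what BMSFed actually uses.

```python
from typing import List, Sequence


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity of two vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sum(a * a for a in u) ** 0.5
    norm_v = sum(b * b for b in v) ** 0.5
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def greedy_select(features: List[Sequence[float]], budget: int) -> List[int]:
    """Greedily pick up to `budget` indices that maximize a facility-location
    objective: the sum, over all items, of each item's best similarity to
    any selected item. (Hypothetical stand-in for the paper's objective.)
    """
    n = len(features)
    # Similarities are clipped at 0 so the objective stays monotone.
    sim = [
        [max(0.0, cosine_similarity(features[i], features[j])) for j in range(n)]
        for i in range(n)
    ]
    selected: List[int] = []
    # coverage[i] = best similarity between item i and the current selection.
    coverage = [0.0] * n
    for _ in range(min(budget, n)):
        best_gain, best_j = -1.0, -1
        for j in range(n):
            if j in selected:
                continue
            # Marginal gain of adding j: how much it improves each item's coverage.
            gain = sum(max(sim[i][j] - coverage[i], 0.0) for i in range(n))
            if gain > best_gain:
                best_gain, best_j = gain, j
        selected.append(best_j)
        coverage = [max(coverage[i], sim[i][best_j]) for i in range(n)]
    return selected


# Toy example: two near-duplicate pairs of (hypothetical) client features.
toy_features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
chosen = greedy_select(toy_features, budget=2)
print(chosen)
```

With a budget of two, the greedy rule takes one representative from each near-duplicate pair rather than two similar clients, which is the diversity behaviour the summary attributes to the selection step. A separate run of the same routine per client group (for example, multi-modal and uni-modal candidates) would mirror the "two separate submodular functions" mentioned above, though how the paper balances the two groups is not specified here.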

Overcome Modal Bias in Multi-modal Federated Learning via Balanced Modality Selection
