Balanced Multi-modal Federated Learning via Cross-Modal Infiltration

Yunfeng Fan, Wenchao Xu, Haozhao Wang, Jiaqi Zhu, Song Guo
2023-12-31
Abstract: Federated learning (FL) underpins advancements in privacy-preserving distributed computing by collaboratively training neural networks without exposing clients' raw data. Current FL paradigms primarily focus on uni-modal data, while exploiting the knowledge from distributed multimodal data remains largely unexplored. Existing multimodal FL (MFL) solutions are mainly designed for statistical or modality heterogeneity from the input side; however, they have yet to solve the fundamental issue of "modality imbalance" in distributed conditions, which can lead to inadequate information exploitation and heterogeneous knowledge aggregation on different
Machine Learning, Computer Vision and Pattern Recognition, Multimedia
What problem does this paper attempt to address?
The paper attempts to address the challenges posed by modality imbalance and input heterogeneity in Multimodal Federated Learning (MFL). Specifically:

1. **Modality Imbalance**: During multimodal learning, different modalities learn at different speeds, so dominant modalities suppress the learning of weaker ones and the available information is under-utilized.
2. **Input Heterogeneity**: The data distribution and modality configuration vary across clients, which further exacerbates the modality imbalance.

To address these problems, the authors propose a new framework called Cross-Modal Infiltration Federated Learning (FedCMI). It alleviates modality imbalance by transferring knowledge from the globally dominant modality to the locally weaker modality, and it maintains the information utilization of each modality through a dual projector design. Additionally, a class-level temperature adaptive scheme is introduced to achieve fair performance across different categories. Experimental results show that FedCMI significantly improves the performance of baseline methods in various scenarios.
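The knowledge-transfer idea described above can be illustrated with a generic temperature-scaled distillation loss. The summary does not give FedCMI's actual loss or its temperature update rule, so everything below is an assumption made for illustration: the function names, the `class_temps` lookup table, and the use of standard KL-based distillation are hypothetical stand-ins, not the authors' method. The sketch only shows how a per-class temperature could soften the teacher (global dominant modality) and student (local weak modality) predictions before they are compared.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def infiltration_loss(teacher_logits, student_logits, label, class_temps):
    """Hypothetical distillation loss with a class-level temperature.

    teacher_logits: logits from the globally dominant modality (assumed).
    student_logits: logits from the locally weak modality (assumed).
    label:          ground-truth class of the sample.
    class_temps:    mapping from class label to temperature (assumed;
                    how FedCMI adapts these values is not specified here).
    """
    t = class_temps[label]
    p_teacher = softmax(teacher_logits, t)
    p_student = softmax(student_logits, t)
    # The t**2 factor is the usual scaling in standard knowledge
    # distillation; it keeps gradient magnitudes comparable across
    # temperatures.
    return (t ** 2) * kl_divergence(p_teacher, p_student)


# Example: class 0 uses a softer temperature than class 1.
temps = {0: 2.0, 1: 1.0}
loss = infiltration_loss([3.0, 0.5], [1.0, 1.2], label=0, class_temps=temps)
```

In this sketch a higher temperature for a class flattens both distributions, so the student receives a gentler signal for that class. An adaptive scheme would adjust the entries of `class_temps` during training, which is left out here because the summary does not describe how.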