Optimizing Multimodal Federated Learning: Novel Approaches for Efficient Model Aggregation and Client Sampling

Cheng Wu, Hong Zhong, Guilin Chen, Naji Alhusaini, Shenghui Zhao, Yuchen Zhang
DOI: https://doi.org/10.1109/nana63151.2024.00030
2024-01-01
Abstract: Federated learning addresses the data privacy leakage and communication overhead of centralized machine learning by moving model training from a central server to local devices. Because most real-world scenarios involve multimodal data, the scalability of federated learning systems that handle only unimodal local data is limited. Researchers have recently applied federated learning to multimodal tasks with good results. However, existing multimodal federated learning methods still face challenges such as model drift, slow convergence, insufficient labeled data on clients, and high communication costs. To address these challenges, this paper proposes a new multimodal federated learning scheme that adopts an embedded knowledge transfer mechanism and a semi-supervised learning method, and improves both the client selection strategy and the server aggregation mechanism. Quantifying each client's degree of update and introducing a weighted aggregation mechanism enables more targeted model optimization. Experimental results show that the method significantly improves model performance and achieves satisfactory results on downstream tasks, providing new research ideas and a practical basis for developing multimodal federated learning.
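The abstract does not give the formula used to quantify the client update degree or to weight the aggregation. As a minimal sketch only, the example below assumes the update degree is the L2 distance between a client's weights and the current global weights, and that aggregation coefficients are proportional to that distance. The names `update_degree` and `weighted_aggregate` are hypothetical and are not taken from the paper.

```python
import math


def update_degree(global_w, local_w):
    """Assumed measure: L2 distance between a client's weights and the global weights."""
    return math.sqrt(sum((l - g) ** 2 for g, l in zip(global_w, local_w)))


def weighted_aggregate(global_w, client_ws):
    """Average client weights with coefficients proportional to each update degree.

    This weighting rule is an illustrative assumption, not the paper's method.
    """
    degrees = [update_degree(global_w, w) for w in client_ws]
    total = sum(degrees)
    if total == 0.0:
        # No client moved away from the global model; keep it unchanged.
        return list(global_w)
    coeffs = [d / total for d in degrees]
    return [
        sum(c * w[i] for c, w in zip(coeffs, client_ws))
        for i in range(len(global_w))
    ]


if __name__ == "__main__":
    global_weights = [0.0, 0.0]
    clients = [[1.0, 0.0], [0.0, 1.0]]
    print(weighted_aggregate(global_weights, clients))
```

Under this assumption, clients whose local training changed the model more contribute more to the new global model, and a client that did not change the model contributes nothing. A practical scheme would likely also account for local dataset size and guard against outlier updates.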