Federated Learning in Big Model Era: Domain-Specific Multimodal Large Models

Zengxiang Li,Zhaoxiang Hou,Hui Liu,Ying Wang,Tongzhi Li,Longfei Xie,Chao Shi,Chengyi Yang,Weishan Zhang,Zelei Liu,Liang Xu
2023-08-24
Abstract:Multimodal data, which can comprehensively perceive and recognize the physical world, has become an essential path towards general artificial intelligence. However, multimodal large models trained on public datasets often underperform in specific industrial domains. This paper proposes a multimodal federated learning framework that enables multiple enterprises to utilize private domain data to collaboratively train large models for vertical domains, achieving intelligent services across scenarios. The authors discuss in-depth the strategic transformation of federated learning in terms of intelligence foundation and objectives in the era of big model, as well as the new challenges faced in heterogeneous data, model aggregation, performance and cost trade-off, data privacy, and incentive mechanism. The paper elaborates a case study of leading enterprises contributing multimodal data and expert knowledge to city safety operation management , including distributed deployment and efficient coordination of the federated learning platform, technical innovations on data quality improvement based on large model capabilities and efficient joint fine-tuning approaches. Preliminary experiments show that enterprises can enhance and accumulate intelligent capabilities through multimodal model federated learning, thereby jointly creating an smart city model that provides high-quality intelligent services covering energy infrastructure safety, residential community security, and urban operation management. The established federated learning cooperation ecosystem is expected to further aggregate industry, academia, and research resources, realize large models in multiple vertical domains, and promote the large-scale industrial application of artificial intelligence and cutting-edge research on multimodal federated learning.
Machine Learning,Artificial Intelligence
What problem does this paper attempt to address?
### What Problem Does This Paper Attempt to Solve? This paper primarily explores how to enhance the performance of multimodal large models in specific domains through Federated Learning in the era of large-scale models. Specifically, the paper attempts to address the following key issues: 1. **Enhancing Model Performance in Specific Domains**: - Current large-scale multimodal models trained on public datasets perform well on general tasks but often fall short in specific industrial applications. The paper proposes a multimodal federated learning framework that allows multiple enterprises to use their private data to jointly train large models tailored for specific domains. 2. **Data Heterogeneity and Multimodality**: - Different enterprises possess data with varying modalities, formats, and distributions, requiring an effective aggregation strategy to tackle these challenges. The paper proposes various methods to handle different modalities of data, including adapter fine-tuning, feature map knowledge distillation, etc. 3. **Model Structure and Size**: - Large models have a massive number of parameters, making traditional federated learning methods difficult to sustain in terms of communication and computation costs. The paper proposes a flexible model aggregation strategy to decide "which parts to aggregate" and "how to aggregate" to balance model training costs and performance. 4. **Data Privacy and Security**: - In the era of large-scale models, data privacy protection is particularly important. The paper discusses various privacy protection technologies, such as Trusted Execution Environments, Secure Multi-Party Computation, Differential Privacy, etc., and proposes methods to prevent model outputs from leaking sensitive information of enterprises. 5. **Incentive Mechanisms and Cooperation**: - As model complexity increases, evaluating the value and contribution of each participant's data becomes more challenging. The paper proposes an improved incentive mechanism to assess the uniqueness of data and domain knowledge, thereby enhancing the value of domain-specific models. ### Conclusion By proposing a novel multimodal federated learning framework, this paper addresses the issue of enhancing the performance of domain-specific models in the era of large-scale models and discusses related technologies and challenges in detail. This framework is expected to promote the widespread application of artificial intelligence in multiple vertical domains.