Abstract: The recent development of Foundation Models (FMs), represented by large language models, vision transformers, and multimodal models, has been making a significant impact on both academia and industry. Compared with small-scale models, FMs have a much stronger demand for high-volume data during the pre-training phase. Although general FMs can be pre-trained on data collected from open sources such as the Internet, domain-specific FMs need proprietary data, posing a practical challenge regarding the amount of data available due to privacy concerns. Federated Learning (FL) is a collaborative learning paradigm that breaks the barrier of data availability from different participants. Therefore, it provides a promising solution to customize and adapt FMs to a wide range of domain-specific tasks using distributed datasets whilst preserving privacy. This survey paper discusses the potentials and challenges of synergizing FL and FMs and summarizes core techniques, future directions, and applications. A periodically updated paper collection on FM-FL is available at https://github.com/lishenghui/awesome-fm-fl.

What problem does this paper attempt to address?

This paper focuses on how to combine Federated Learning (FL) and Foundation Models (FMs) to address the privacy and data-availability challenges of training high-quality domain-specific models. Specifically:

1. **Privacy protection issues**: In fields such as law, medicine, and finance, data is usually highly sensitive, so it is difficult to collect and use centrally. Traditional model training requires centralized storage and processing of large amounts of data, which creates a risk of privacy leakage. FL offers a privacy-preserving alternative: participants collaborate in training a model without sharing their original data.
2. **Data availability issues**: Domain-specific FMs require large amounts of proprietary data for pre-training or fine-tuning, but this data is often hard to obtain because of privacy restrictions. By drawing on decentralized data sources, FL can expand the data available for training without compromising privacy, thereby improving the diversity and robustness of the model.
3. **Model adaptability issues**: FMs usually need to be fine-tuned to perform well on specific tasks. In an FL environment, however, heterogeneous data distributions and differences in computing resources mean that directly applying traditional fine-tuning methods may degrade performance. How to fine-tune FMs effectively within the FL framework, so that they adapt to different downstream tasks and device characteristics, is therefore one focus of this paper.
4. **Resource efficiency issues**: FMs usually have a large number of parameters and high training and communication costs. In an FL environment, especially in resource-constrained scenarios such as mobile devices, optimizing the training process of FMs and reducing computing and communication overhead is also a pressing problem.

In summary, this paper explores how to combine the advantages of FL and FMs to address the above challenges, thereby advancing domain-specific models and related techniques under privacy protection.
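
The collaborative-training idea in the summary above (each participant trains on its own data and only model parameters are shared) can be illustrated with a small sketch. The summary does not name a specific aggregation algorithm; the example below assumes federated averaging (FedAvg), one common choice, and uses a toy linear model rather than a foundation model. All function names, the synthetic data, and the hyperparameters are hypothetical and chosen only for illustration.

```python
import random


def local_update(weights, data, lr=0.1, epochs=5):
    """Run a few epochs of gradient descent on one client's private data.

    The toy model is y = w * x + b; `weights` is the pair (w, b).
    """
    w, b = weights
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in data:
            err = (w * x + b) - y
            grad_w += 2 * err * x
            grad_b += 2 * err
        w -= lr * grad_w / len(data)
        b -= lr * grad_b / len(data)
    return (w, b)


def fed_avg(client_weights, client_sizes):
    """Average client models, weighted by local dataset size."""
    total = sum(client_sizes)
    w = sum(cw[0] * n for cw, n in zip(client_weights, client_sizes)) / total
    b = sum(cw[1] * n for cw, n in zip(client_weights, client_sizes)) / total
    return (w, b)


def make_client_data(rng, n, x_low, x_high):
    """Hypothetical private dataset drawn from y = 3x + 1 plus small noise."""
    xs = [rng.uniform(x_low, x_high) for _ in range(n)]
    return [(x, 3.0 * x + 1.0 + rng.gauss(0, 0.01)) for x in xs]


def run_federated_training(rounds=200, seed=0):
    rng = random.Random(seed)
    # Three clients with different sizes and input ranges, a simple
    # stand-in for heterogeneous data distributions.
    clients = [
        make_client_data(rng, 20, -1.0, 0.0),
        make_client_data(rng, 40, 0.0, 1.0),
        make_client_data(rng, 30, -0.5, 0.5),
    ]
    global_weights = (0.0, 0.0)
    for _ in range(rounds):
        # Each client starts from the current global model and trains locally.
        updates = [local_update(global_weights, data) for data in clients]
        # Only model parameters are sent to the server, never the raw data.
        global_weights = fed_avg(updates, [len(d) for d in clients])
    return global_weights


if __name__ == "__main__":
    w, b = run_federated_training()
    print(f"learned w={w:.3f}, b={b:.3f}")
```

In this sketch every client shares the same underlying relationship, so the averaged model should approach it. With foundation models, the parameters exchanged each round are far larger, which is where the communication and computation costs mentioned in item 4 become significant.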

Synergizing Foundation Models and Federated Learning: A Survey

Advances and Open Challenges in Federated Foundation Models

When Foundation Model Meets Federated Learning: Motivations, Challenges, and Future Directions

A Survey on Efficient Federated Learning Methods for Foundation Model Training

Federated Foundation Models: Privacy-Preserving and Collaborative Learning for Large Models

Open Challenges and Opportunities in Federated Foundation Models Towards Biomedical Healthcare

Position Paper: Assessing Robustness, Privacy, and Fairness in Federated Learning Integrated with Foundation Models

A Generalized Look at Federated Learning: Survey and Perspectives

Bridging the Gap Between Foundation Models and Heterogeneous Federated Learning

Federated learning: A cutting-edge survey of the latest advancements and applications

A Multifaceted Survey on Federated Learning: Fundamentals, Paradigm Shifts, Practical Issues, Recent Developments, Partnerships, Trade-Offs, Trustworthiness, and Ways Forward

The Role of Federated Learning in a Wireless World with Foundation Models

Multimodal Federated Learning: A Survey

Federated Large Language Models: Current Progress and Future Directions

FedMS: Federated Learning with Mixture of Sparsely Activated Foundations Models

A Survey of What to Share in Federated Learning: Perspectives on Model Utility, Privacy Leakage, and Communication Efficiency

Advancements in Federated Learning: Models, Methods, and Privacy

Federated Generative Learning with Foundation Models

Navigating the Future of Federated Recommendation Systems with Foundation Models

A Comprehensive Survey of Federated Transfer Learning: Challenges, Methods and Applications

Advances and Open Problems in Federated Learning