Abstract: Cooperation plays a pivotal role in the evolution of human intelligence; it also underlies the recent revolutionary advancement of artificial intelligence (AI) driven by foundation models. Specifically, we reveal that the training of foundation models can be interpreted as a form of big cooperative learning (abbr. big learning), where massive learning individuals/tasks cooperate to approach the unique essence of data from diverse perspectives of data prediction, leveraging a universal model. Big learning therefore unifies most training objectives of foundation models within a consistent framework, which simultaneously exposes their underlying assumptions. We design tailored simulations to demonstrate the principle of big learning, based on which we provide learning-perspective justifications for the successes of foundation models, with interesting by-products. Furthermore, we reveal that big learning is a new dimension for upgrading conventional machine learning paradigms, valuable for reinvigorating associated applications; as an illustrative example, we propose BigLearn-GAN, a novel adversarially-trained foundation model with versatile data sampling capabilities. Code is available at https://github.com/YulaiCong/BigCooperativeLearning.

What problem does this paper attempt to address?

The problem this paper attempts to address is the training-test discrepancy in existing foundation model training methods. Although current foundation models such as BERT and GPT have achieved significant success on certain tasks, their training methods often use only a portion of the information in each data sample, so the capabilities exercised during training do not match those needed at test time. For example, mask-and-predict training focuses on predicting the masked parts, which may not be the capability most needed at test time; next-token-prediction training is closer to test-time needs but still has limitations.

To address this issue, the paper proposes the concept of "Big Cooperative Learning" (Big Learning). The core idea is to fully exploit the many data sampling demonstrations contained in a single data sample (that is, sampling and predicting the data from different perspectives) to form a large number of learning tasks that cooperate to approximate the essence of the data. According to the summary, this reduces the training-test discrepancy and also improves the model's generalization ability and adaptability.

The main contributions of the paper include:

1. Proposing Big Learning as a unified training framework for foundation models and analyzing the assumptions behind existing foundation models.
2. Designing tailored simulation experiments that demonstrate the principles of Big Learning in a lightweight manner, and explaining the success of foundation models from a learning perspective.
3. Pointing out that Big Learning is a new dimension for upgrading traditional machine learning paradigms: techniques from foundation models are fed back into traditional machine learning, thereby revitalizing related applications.
4. As an example, proposing BigLearn-GAN, a variant of the traditional Generative Adversarial Network (GAN) upgraded with Big Learning, with versatile data sampling capabilities.

Through these contributions, the paper aims to provide theoretical support and technical guidance for the further improvement and development of future foundation models.
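The idea of deriving many cooperating prediction tasks from a single data sample can be illustrated with a small sketch. This is not the authors' implementation: the function names, the representation of a task as a pair of index lists (source positions, target positions), and the random-partition scheme are all assumptions made here for illustration. It only shows how mask-and-predict and next-token prediction can be read as special cases of "predict some positions given others".

```python
import random
from typing import List, Tuple

# A task is (source indices, target indices): predict the targets given the sources.
# This representation is an illustrative assumption, not taken from the paper's code.
Task = Tuple[List[int], List[int]]


def mask_and_predict_task(length: int, mask_ratio: float, rng: random.Random) -> Task:
    """BERT-style task: predict a random masked subset from all remaining positions."""
    n_masked = min(length, max(1, round(length * mask_ratio)))
    targets = sorted(rng.sample(range(length), n_masked))
    target_set = set(targets)
    sources = [i for i in range(length) if i not in target_set]
    return sources, targets


def next_token_tasks(length: int) -> List[Task]:
    """GPT-style tasks: predict position t from positions 0..t-1."""
    return [(list(range(t)), [t]) for t in range(1, length)]


def random_partition_task(length: int, rng: random.Random) -> Task:
    """Hypothetical general case: disjoint source and target sets, target non-empty.

    Each position is randomly assigned to the sources, the targets, or left unused.
    """
    labels = [rng.choice(("S", "T", "unused")) for _ in range(length)]
    if "T" not in labels:
        # Guarantee at least one target so the task is non-trivial.
        labels[rng.randrange(length)] = "T"
    sources = [i for i, lab in enumerate(labels) if lab == "S"]
    targets = [i for i, lab in enumerate(labels) if lab == "T"]
    return sources, targets


# Usage: enumerate several tasks derived from one toy sample.
rng = random.Random(0)
tokens = ["the", "cat", "sat", "on", "the", "mat"]
tasks = [
    mask_and_predict_task(len(tokens), 0.3, rng),
    *next_token_tasks(len(tokens))[:2],
    random_partition_task(len(tokens), rng),
]
for sources, targets in tasks:
    print([tokens[i] for i in sources], "->", [tokens[i] for i in targets])
```

In this reading, one universal model would be trained on all such tasks at once; the sketch stops at task construction and does not model the training itself.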

Big Cooperative Learning
