LLMs Could Autonomously Learn Without External Supervision
Ke Ji, Junying Chen, Anningzhe Gao, Wenya Xie, Xiang Wan, Benyou Wang
2024-06-07
Abstract: In the quest for super-human performance, Large Language Models (LLMs) have traditionally been tethered to human-annotated datasets and predefined training objectives, a process that is both labor-intensive and inherently limited. This paper presents a transformative approach: Autonomous Learning for LLMs, a self-sufficient learning paradigm that frees models from the constraints of human supervision. This method endows LLMs with the ability to self-educate through direct interaction with text, akin to a human reading and comprehending literature. Our approach eliminates the reliance on annotated data, fostering an Autonomous Learning environment where the model independently identifies and reinforces its knowledge gaps. Empirical results from our comprehensive experiments, which utilized a diverse array of learning materials and were evaluated against standard public quizzes, reveal that Autonomous Learning outstrips the performance of both Pre-training and Supervised Fine-Tuning (SFT), as well as retrieval-augmented methods. These findings underscore the potential of Autonomous Learning to not only enhance the efficiency and effectiveness of LLM training but also to pave the way for the development of more advanced, self-reliant AI systems.
Computation and Language
What problem does this paper attempt to address?
This paper proposes an approach called Autonomous Learning to reduce the reliance of Large Language Models (LLMs) on human-annotated data and predefined training objectives. Traditionally, LLMs are trained through pre-training, supervised fine-tuning, and reinforcement learning from human feedback, a pipeline that is time-consuming and limited by the available annotations. Autonomous Learning mimics how humans teach themselves: the model interacts directly with text, works out what it means, and reinforces what it has learned without external supervision.
In Autonomous Learning, the model self-educates by reading textual materials such as books, documents, or Wikipedia and then self-assesses to identify knowledge gaps and reinforce learning. The advantages of this approach include:
1. The model actively participates in learning, understanding, and improving itself.
2. It requires no human-annotated data, reducing reliance on human labor.
3. It simplifies the learning process, reducing training complexity and cost.
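The read, self-assess, and reinforce cycle described above can be illustrated with a toy sketch. This is not the authors' implementation: `ToyLearner`, `autonomous_learning_round`, and the representation of a passage as a question-to-answer dictionary are all hypothetical simplifications, and the dictionary update stands in for whatever training step a real LLM would take on its identified gaps.

```python
from dataclasses import dataclass, field


@dataclass
class ToyLearner:
    """Hypothetical stand-in for an LLM; its 'knowledge' is a plain dict."""
    memory: dict = field(default_factory=dict)

    def answer_closed_book(self, question):
        # Answer from memory alone; None means the learner does not know.
        return self.memory.get(question)

    def answer_open_book(self, question, passage_facts):
        # Answer while looking at the text (assumed to contain the fact).
        return passage_facts.get(question)

    def reinforce(self, question, answer):
        # Stand-in for a training update on an identified knowledge gap.
        self.memory[question] = answer


def autonomous_learning_round(learner, passage_facts):
    """Run one read / self-assess / reinforce pass.

    Returns the list of questions the learner could not answer correctly
    without the text, i.e. the knowledge gaps found in this round.
    """
    gaps = []
    for question in passage_facts:
        closed = learner.answer_closed_book(question)
        reference = learner.answer_open_book(question, passage_facts)
        if closed != reference:
            gaps.append(question)
            learner.reinforce(question, reference)
    return gaps


# Usage: the learner already knows one fact and discovers the other as a gap.
passage = {"capital of France": "Paris", "boiling point of water (C)": "100"}
learner = ToyLearner(memory={"capital of France": "Paris"})
first_gaps = autonomous_learning_round(learner, passage)
second_gaps = autonomous_learning_round(learner, passage)
print(first_gaps, second_gaps)
```

In this sketch no external labels are supplied: the reference answer comes from the learner reading the text itself, and only the questions it fails without the text trigger an update. A second pass over the same passage finds no remaining gaps.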
Experimental results show that Autonomous Learning outperforms pre-training, supervised fine-tuning, and retrieval-augmented methods on learning materials of various scales. This suggests that Autonomous Learning can improve the efficiency and effectiveness of LLM training and can contribute to the development of more advanced, self-sufficient AI systems.