Abstract: Large Language Models (LLMs) have demonstrated remarkable versatility across various domains. To further advance LLMs, we propose 'SELF' (Self-Evolution with Language Feedback), a novel approach that enables LLMs to self-improve through self-reflection, akin to human learning processes. SELF initiates with a meta-skill learning process that equips the LLMs with capabilities for self-feedback and self-refinement. Subsequently, the model undergoes an iterative process of self-evolution. In each iteration, it utilizes an unlabeled dataset of instructions to generate initial responses. These responses are enhanced through self-feedback and self-refinement. The model is then fine-tuned using this enhanced data. The model undergoes progressive improvement through this iterative self-evolution process. Moreover, the SELF framework enables the model to apply self-refinement during inference, which further improves response quality. Our experiments in mathematics and general tasks demonstrate that SELF can enhance the capabilities of LLMs without human intervention. The SELF framework indicates a promising direction for the autonomous evolution of LLMs, transitioning them from passive information receivers to active participants in their development.

What problem does this paper attempt to address?

This paper addresses how large language models (LLMs) can evolve autonomously. Existing LLMs perform well on many tasks, but improving them relies mainly on manually annotated datasets and reinforcement learning, both of which demand substantial resources and continuous human intervention. The paper therefore proposes SELF (Self-Evolution with Language Feedback), a framework that lets LLMs improve their own performance through self-feedback and self-refinement, reducing their dependence on external intervention. The SELF framework has two main stages:

1. **Meta-skill learning stage**: The model learns the key meta-skills of self-feedback and self-refinement from a limited number of supervised examples. This lays the foundation for subsequent self-evolution.
2. **Self-evolution stage**: The model uses the acquired meta-skills to improve itself over multiple rounds of iterative self-evolution training. In each round, it autonomously generates high-quality training data and trains on that data, continuously optimizing its performance.

The SELF framework also emphasizes the role of natural-language feedback in guiding the model's evolution. Compared with traditional scalar rewards, natural-language feedback provides a more detailed and comprehensive evaluation, which helps the model identify errors and propose improvements in complex reasoning tasks.

The paper verifies the effectiveness of SELF through experiments on mathematical and general tasks. The results show that SELF significantly improves the model's performance on mathematical tasks and also achieves better results on general tasks, demonstrating its potential for promoting the autonomous evolution of LLMs.
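The self-evolution stage described above can be illustrated with a minimal Python sketch. It is not the paper's implementation: the function names and the `generate`, `feedback`, `refine`, and `fine_tune` callables are hypothetical stand-ins for a model that is assumed to have already acquired the two meta-skills. The sketch only shows the control flow of generating responses, critiquing them in natural language, refining them, and fine-tuning on the result.

```python
from typing import Callable, List, Tuple

# Hypothetical interfaces (assumptions, not taken from the paper's code).
Generate = Callable[[str], str]            # instruction -> initial response
Feedback = Callable[[str, str], str]       # (instruction, response) -> critique text
Refine = Callable[[str, str, str], str]    # (instruction, response, critique) -> refined response
FineTune = Callable[[List[Tuple[str, str]]], None]  # trains on (instruction, response) pairs


def build_self_training_data(
    instructions: List[str],
    generate: Generate,
    feedback: Feedback,
    refine: Refine,
) -> List[Tuple[str, str]]:
    """Create one round of training pairs from unlabeled instructions."""
    data = []
    for instruction in instructions:
        response = generate(instruction)                    # initial answer
        critique = feedback(instruction, response)          # natural-language self-feedback
        improved = refine(instruction, response, critique)  # self-refinement
        data.append((instruction, improved))
    return data


def self_evolve(
    instructions: List[str],
    generate: Generate,
    feedback: Feedback,
    refine: Refine,
    fine_tune: FineTune,
    rounds: int,
) -> List[List[Tuple[str, str]]]:
    """Run several self-evolution rounds and return the data used in each."""
    history = []
    for _ in range(rounds):
        data = build_self_training_data(instructions, generate, feedback, refine)
        fine_tune(data)  # assumed to update the model behind the callables
        history.append(data)
    return history
```

Passing the model's behaviors in as callables keeps the loop independent of any particular LLM library. In practice the four callables would presumably wrap the same underlying model, so that each `fine_tune` call changes what the next round's `generate`, `feedback`, and `refine` produce.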

SELF: Self-Evolution with Language Feedback

A Survey on Self-Evolution of Large Language Models

Language Model Self-improvement by Reinforcement Learning Contemplation

Interactive Evolution: A Neural-Symbolic Self-Training Framework For Large Language Models

Mind the Gap: Examining the Self-Improvement Capabilities of Large Language Models

Self-Evolved Reward Learning for LLMs

Benchmark Self-Evolving: A Multi-Agent Framework for Dynamic LLM Evaluation

Toward Self-Improvement of LLMs via Imagination, Searching, and Criticizing

Can Large Language Models Invent Algorithms to Improve Themselves?

Efficient Self-Improvement in Multimodal Large Language Models: A Model-Level Judge-Free Approach

Can I understand what I create? Self-Knowledge Evaluation of Large Language Models

Self-Tuning: Instructing LLMs to Effectively Acquire New Knowledge through Self-Teaching

Auto-Evolve: Enhancing Large Language Model's Performance via Self-Reasoning Framework

Self-Rewarding Language Models

Bring Your Own Data! Self-Supervised Evaluation for Large Language Models

Self-Updatable Large Language Models with Parameter Integration

Internal Consistency and Self-Feedback in Large Language Models: A Survey

Self-Evolving GPT: A Lifelong Autonomous Experiential Learner

LLMs Could Autonomously Learn Without External Supervision

SelfIE: Self-Interpretation of Large Language Model Embeddings

SaySelf: Teaching LLMs to Express Confidence with Self-Reflective Rationales