Abstract: The fixed-size context of the Transformer makes GPT models incapable of generating arbitrarily long text. In this paper, we introduce RecurrentGPT, a language-based simulacrum of the recurrence mechanism in RNNs. RecurrentGPT is built upon a large language model (LLM) such as ChatGPT and uses natural language to simulate the Long Short-Term Memory mechanism in an LSTM. At each timestep, RecurrentGPT generates a paragraph of text and updates its language-based long- and short-term memory, stored on the hard drive and in the prompt, respectively. This recurrence mechanism enables RecurrentGPT to generate texts of arbitrary length without forgetting. Since human users can easily observe and edit the natural language memories, RecurrentGPT is interpretable and enables interactive generation of long text. RecurrentGPT is an initial step towards next-generation computer-assisted writing systems beyond local editing suggestions. In addition to producing AI-generated content (AIGC), we also demonstrate the possibility of using RecurrentGPT as interactive fiction that directly interacts with consumers. We call this usage of generative models "AI As Contents" (AIAC), which we believe is the next form of conventional AIGC. We further demonstrate the possibility of using RecurrentGPT to create personalized interactive fiction that directly interacts with readers instead of interacting with writers. More broadly, RecurrentGPT demonstrates the utility of borrowing ideas from popular model designs in cognitive science and deep learning for prompting LLMs. Our code is available at https://github.com/aiwaves-cn/RecurrentGPT and an online demo is available at https://www.aiwaves.org/recurrentgpt.
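The per-timestep recurrence described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the names `State`, `build_prompt`, `step`, and `fake_llm` are hypothetical, the LLM is replaced by a stub that returns a fixed response, and retrieval from long-term memory is omitted for brevity. The point it shows is that the prompt is built only from a bounded summary and plan, so its size does not grow with the length of the text generated so far.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class State:
    """Natural-language state carried across timesteps (hypothetical names)."""
    short_term: str                                     # running summary kept in the prompt
    plan: str = ""                                      # instruction for the next paragraph
    long_term: List[str] = field(default_factory=list)  # past paragraphs (kept on disk in the paper)


def build_prompt(state: State) -> str:
    # Only the summary and the plan enter the prompt, so the prompt stays
    # bounded no matter how many paragraphs have been generated.
    return (
        f"Summary so far: {state.short_term}\n"
        f"Plan for next paragraph: {state.plan}\n"
        "Write the next paragraph, an updated summary, and a new plan."
    )


def step(state: State, llm: Callable[[str], Dict[str, str]]) -> str:
    """One timestep: generate a paragraph, then update both memories."""
    out = llm(build_prompt(state))
    state.long_term.append(out["paragraph"])  # long-term memory grows outside the prompt
    state.short_term = out["summary"]         # short-term memory is overwritten
    state.plan = out["plan"]
    return out["paragraph"]


def fake_llm(prompt: str) -> Dict[str, str]:
    # Stand-in for a real LLM call (assumption): always returns the same fields.
    return {
        "paragraph": "A new paragraph.",
        "summary": "Updated summary.",
        "plan": "Continue the story.",
    }


state = State(short_term="A detective arrives in town.", plan="Introduce the case.")
story = [step(state, fake_llm) for _ in range(3)]
```

Because the state is plain text, a human could edit `state.short_term` or `state.plan` between calls to `step`, which is the kind of interactive use the abstract describes.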

What problem does this paper attempt to address?

The core problem this paper addresses is that language models built on the Transformer architecture (such as GPT) cannot generate text of arbitrary length because their context window has a fixed size. In practice, these models tend to repeat content or lose coherence when generating long texts. To address this, the authors introduce **RecurrentGPT**, which uses natural language to simulate the recurrence mechanism of an RNN on top of an LLM.

### Main Problems and Solutions

1. **Limitations of the Fixed Context Window**:
   - **Problem**: Existing large language models (LLMs), such as ChatGPT, can only handle a fixed-length context window because of their Transformer-based architecture, which limits their ability to generate long texts.
   - **Solution**: **RecurrentGPT** uses natural language to simulate the Long Short-Term Memory mechanism of an LSTM (a type of recurrent neural network), which enables generation of text of arbitrary length.
2. **Improving the Coherence and Interpretability of Generated Texts**:
   - **Problem**: The traditional RNN-based recurrence mechanism is difficult to scale, and generated text is likely to lose coherence over a long time span.
   - **Solution**: **RecurrentGPT** uses natural language as its building blocks, so humans can observe and edit its internal states (such as long-term and short-term memory), which improves the interpretability and controllability of the model.
3. **Enhancing Human-Computer Interaction Capability**:
   - **Problem**: Existing computer-assisted writing systems mainly provide local editing suggestions and lack support for long-text generation.
   - **Solution**: **RecurrentGPT** can generate long texts automatically and can also serve as an interactive writing assistant, helping human writers produce texts of arbitrary length with much less manual labor.
4. **Exploring New Application Scenarios**:
   - **Problem**: Generative models are usually used only for content creation and lack application scenarios in which they interact directly with consumers.
   - **Solution**: The paper proposes a new paradigm, "AI as Content" (AIAC), in which the generative model interacts directly with consumers, for example as personalized interactive fiction that lets users choose and explore different story development paths.

### Technical Implementation

- **Natural-Language Building Blocks**: Replace the vector components of an LSTM (input, output, hidden state, cell state) with natural-language paragraphs.
- **Prompt Engineering**: Simulate the RNN's computational graph through carefully designed prompt templates and simple Python code.
- **Long-Term and Short-Term Memory**: Use a VectorDB to store long-term memory, and keep short-term memory as a natural-language paragraph that summarizes the most recent timesteps.

### Experimental Verification

The paper verifies the effectiveness of **RecurrentGPT** through three experimental setups:

1. Automatically generating long texts.
2. Collaborating with human writers to generate long texts.
3. Interacting directly with consumers as interactive fiction.

The experimental results show that **RecurrentGPT** significantly outperforms the baseline model at generating interesting and coherent long texts, with a larger advantage on longer novels.

In conclusion, this paper addresses the bottleneck that existing language models face when generating long texts by introducing **RecurrentGPT**, and it offers new ideas and methods for future computer-assisted writing systems.
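The long-term memory lookup mentioned under Technical Implementation can be sketched as follows. This is a toy illustration under stated assumptions, not the paper's code: a real vector database would index learned embeddings, whereas here a bag-of-words count vector stands in for the embedding, and the class `LongTermMemory` and the functions `embed` and `cosine` are hypothetical names. It shows the general idea of storing every generated paragraph and retrieving only the few most similar ones for the next prompt.

```python
import math
import re
from collections import Counter
from typing import List, Tuple


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (stand-in for a learned embedding)."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class LongTermMemory:
    """Hypothetical append-only paragraph store with top-k similarity retrieval."""

    def __init__(self) -> None:
        self._items: List[Tuple[str, Counter]] = []

    def add(self, paragraph: str) -> None:
        # Store the paragraph together with its precomputed vector.
        self._items.append((paragraph, embed(paragraph)))

    def retrieve(self, query: str, k: int = 2) -> List[str]:
        # Rank all stored paragraphs by similarity to the query; keep the top k.
        q = embed(query)
        ranked = sorted(self._items, key=lambda item: cosine(q, item[1]), reverse=True)
        return [text for text, _ in ranked[:k]]


memory = LongTermMemory()
memory.add("The detective found a silver key in the garden.")
memory.add("Rain fell on the harbor all night.")
memory.add("Maria baked bread for the village.")
relevant = memory.retrieve("Where is the silver key?", k=1)
```

Only the retrieved paragraphs would be placed in the prompt, so the stored history can grow without enlarging the context the LLM has to read.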

RecurrentGPT: Interactive Generation of (Arbitrarily) Long Text

RecycleGPT: An Autoregressive Language Model with Recyclable Module

Controllable Text Generation with Residual Memory Transformer

Demystifying ChatGPT: An In-depth Survey of OpenAI's Robust Large Language Models

BatGPT: A Bidirectional Autoregessive Talker from Generative Pre-trained Transformer

A Latent Variable Model with Hierarchical Structure and GPT-2 for Long Text Generation

TextGAIL: Generative Adversarial Imitation Learning for Text Generation

Unlocking the Power of GANs in Non-Autoregressive Text Generation

Long Text Generation via Adversarial Training with Leaked Information

Text Feature Adversarial Learning for Text Generation With Knowledge Transfer From GPT2

In Search of Needles in a 11M Haystack: Recurrent Memory Finds What LLMs Miss

Learning, teaching, and assessment with generative artificial intelligence: towards a plateau of productivity

Transformer Explainer: Interactive Learning of Text-Generative Models

Self-Evolving GPT: A Lifelong Autonomous Experiential Learner

Learning to Break the Loop: Analyzing and Mitigating Repetitions for Neural Text Generation

GPT is becoming a Turing machine: Here are some ways to program it

PatternGPT: A Pattern-Driven Framework for Large Language Model Text Generation

RetGen: A Joint Framework for Retrieval and Grounded Text Generation Modeling

Real-time interactive sequence generation and control with Recurrent Neural Network ensembles

ModelGPT: Unleashing LLM's Capabilities for Tailored Model Generation