Abstract: Large Language Models (LLMs) have proven to be useful tools in various domains outside natural language processing, the field in which they originated. In this study, we provide practical directions on how to use LLMs to generate 2D game rooms for an under-development game named Metavoidal. Our technique harnesses the power of GPT-3 through human-in-the-loop fine-tuning, which allows our method to create 37% Playable-Novel levels from as few as 60 hand-designed rooms, for a game that is non-trivial with respect to Procedural Content Generation (PCG) because it has a good number of local and global constraints.

What problem does this paper attempt to address?

The paper attempts to address the problem of how to use large language models (LLMs) to generate game levels with practical value, especially in data-scarce situations. Specifically, the authors explore how to use GPT-3 to generate room levels in a 2D game called Metavoidal. These levels need to be playable and novel, and they must also satisfy certain local and global constraints.

### Main Issues:

1. **Data Scarcity**: Only 60 hand-designed room levels are available as initial data.
2. **Complexity of Level Generation**: The generated levels need to meet multiple local and global constraints, such as path connectivity, wall layout, and puddle distribution.
3. **Playability and Novelty**: The generated levels need to be playable and have a certain degree of novelty to increase the game's fun and challenge.

### Solutions:

1. **Data Augmentation**: Increase the diversity and quantity of data through methods such as horizontal and vertical flipping, rotation, and pattern tile swapping.
2. **Human-in-the-loop Fine-tuning**: Use a human-in-the-loop approach to fine-tune GPT-3, gradually fixing unplayable levels and adding compliant levels to the training data.
3. **Multi-stage Generation**: Generation is divided into two stages. The first stage generates preliminary levels through human-in-the-loop fine-tuning, and the second stage generates more high-quality levels through data augmentation and further fine-tuning.

### Experimental Results:

- In the first stage, through 5 rounds of fine-tuning and 100 generations, 60 repairable levels were finally obtained.
- In the second stage, through data augmentation and further fine-tuning, 37% of the levels generated were playable and novel.

### Conclusion:

This study demonstrates an effective method for using large language models to generate high-quality game levels in data-scarce situations. This method is not only applicable to the Metavoidal game but can also be extended to the development of other games with complex constraints. Future research directions include exploring the generation of 3D game levels and developing a general model capable of generating both 2D and 3D levels.
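The flipping and rotation part of the data augmentation step listed under Solutions can be illustrated with a minimal Python sketch. It assumes a room is stored as a tuple of equal-length strings with one character per tile; this representation, the tile characters, and the function names are hypothetical and are not taken from the paper. Pattern tile swapping is not covered here.

```python
from typing import List, Set, Tuple

# Assumed representation: one string per row, one character per tile.
Room = Tuple[str, ...]


def flip_horizontal(room: Room) -> Room:
    """Mirror the room left to right by reversing every row."""
    return tuple(row[::-1] for row in room)


def flip_vertical(room: Room) -> Room:
    """Mirror the room top to bottom by reversing the row order."""
    return tuple(reversed(room))


def rotate_clockwise(room: Room) -> Room:
    """Rotate 90 degrees clockwise.

    Each column of the original, read bottom to top, becomes a row.
    A non-square room changes shape (rows and columns swap).
    """
    return tuple("".join(col) for col in zip(*reversed(room)))


def augment(room: Room) -> List[Room]:
    """Return the distinct rotations and mirror images of a room.

    Four rotations, each with and without a horizontal flip, cover all
    eight symmetries of a rectangle-shaped grid (a vertical flip equals
    a 180-degree rotation followed by a horizontal flip). Duplicates
    produced by symmetric rooms are dropped.
    """
    seen: Set[Room] = set()
    variants: List[Room] = []
    current = room
    for _ in range(4):
        for candidate in (current, flip_horizontal(current)):
            if candidate not in seen:
                seen.add(candidate)
                variants.append(candidate)
        current = rotate_clockwise(current)
    return variants


if __name__ == "__main__":
    # Hypothetical tiles: '#' wall, '.' floor, '~' puddle.
    example: Room = ("#.#", "..~", "###")
    for variant in augment(example):
        print("\n".join(variant), end="\n\n")
```

Deduplicating the variants matters with a small seed set such as 60 rooms, since symmetric rooms would otherwise appear several times in the fine-tuning data. Whether every rotated or flipped room still satisfies the game's constraints depends on the game, so in practice each variant would still need a playability check.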

Practical PCG Through Large Language Models

Level Generation Through Large Language Models

Language Urban Odyssey: A Serious Game for Enhancing Second Language Acquisition Through Large Language Models

3D-GPT: Procedural 3D Modeling with Large Language Models

Large Language Models and Games: A Survey and Roadmap

Grammar-based Game Description Generation using Large Language Models

Game Generation via Large Language Models

From Code to Play: Benchmarking Program Search for Games Using Large Language Models

Can Large Language Models Play Games? A Case Study of A Self-Play Approach

Can Large Language Models Play Text Games Well? Current State-of-the-Art and Open Questions

Automatically Generating CS Learning Materials with Large Language Models

Evaluating Large Language Models with Grid-Based Game Competitions: An Extensible LLM Benchmark and Leaderboard

Benchmarking Large Language Model (LLM) Performance for Game Playing via Tic-Tac-Toe

Leveraging Pre-trained Large Language Models to Construct and Utilize World Models for Model-based Task Planning

LLM4DS: Evaluating Large Language Models for Data Science Code Generation

GameTraversalBenchmark: Evaluating Planning Abilities Of Large Language Models Through Traversing 2D Game Maps

Procedural Content Generation in Games: A Survey with Insights on Emerging LLM Integration

Remember what you did so you know what to do next

Large Language Models as Agents in Two-Player Games

Show, Don't Tell: Evaluating Large Language Models Beyond Textual Understanding with ChildPlay