On the Effects of Fine-tuning Language Models for Text-Based Reinforcement Learning

Mauricio Gruppi, Soham Dan, Keerthiram Murugesan, Subhajit Chaudhury
2024-04-16
Abstract: Text-based reinforcement learning involves an agent interacting with a fictional environment using observed text and admissible actions in natural language to complete a task. Previous works have shown that agents can succeed in text-based interactive environments even in the complete absence of semantic understanding or other linguistic capabilities. The success of these agents in playing such games suggests that semantic understanding may not be important for the task. This raises an important question about the benefits of LMs in guiding the agents through the game states. In this work, we show that rich semantic understanding leads to efficient training of text-based RL agents. Moreover, we describe the occurrence of semantic degeneration as a consequence of inappropriate fine-tuning of language models in text-based reinforcement learning (TBRL). Specifically, we describe the shift in the semantic representation of words in the LM, as well as how it affects the performance of the agent in tasks that are semantically similar to the training games. We believe these results may help develop better strategies to fine-tune agents in text-based RL scenarios.
Computation and Language
What problem does this paper attempt to address?
This paper evaluates how fine-tuning language models (LMs) affects the training efficiency of agents in text-based reinforcement learning (TBRL), and whether fine-tuning causes semantic degeneration that hurts agents on tasks containing vocabulary outside the training set. Specifically, the paper explores two questions:

1. **Can fine-tuning language models improve training efficiency?** The paper compares fixed pre-trained language models with fine-tuned ones during training to measure the effect of fine-tuning on training efficiency.
2. **Will fine-tuning language models enable agents to handle tasks containing out-of-training-set vocabulary more robustly?** The researchers test different models on game versions in which the observation descriptions are rewritten or the vocabulary is replaced, to measure the effect of fine-tuning on the agents' ability to generalize.

The paper finds that although fine-tuning the language model can accelerate an agent's learning on a specific task, it may lead to semantic degeneration: the language model "forgets" semantic associations it learned during pre-training, so the agent performs poorly when it encounters slightly different or unseen vocabulary. A fixed pre-trained language model may therefore better support generalization because it preserves semantic information. These results can inform better strategies for fine-tuning agents in text-based reinforcement learning.
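
The idea of a shift in the semantic representation of words can be made concrete with a small sketch. The snippet below is only an illustration and not necessarily how the authors measure the shift: it assumes each word has one embedding vector before and one after fine-tuning, and scores the shift as one minus their cosine similarity. The function names (`cosine`, `semantic_shift`) and the toy vectors are hypothetical.

```python
import numpy as np


def cosine(u, v):
    """Cosine similarity between two 1-D vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))


def semantic_shift(before, after):
    """Per-word shift, 1 - cosine similarity, for words present in both tables.

    `before` and `after` map a word to its embedding vector, e.g. taken from
    the pre-trained LM and from the fine-tuned LM (an assumption of this sketch).
    """
    return {w: 1.0 - cosine(before[w], after[w]) for w in before if w in after}


# Toy, made-up 2-D embeddings used only to exercise the function.
before = {"key": np.array([1.0, 0.0]), "door": np.array([0.0, 1.0])}
after = {"key": np.array([1.0, 0.0]), "door": np.array([1.0, 0.0])}

shift = semantic_shift(before, after)
for word, value in sorted(shift.items()):
    print(f"{word}: shift = {value:.2f}")
```

In this toy input, "key" keeps its vector while "door" collapses onto it, so "door" gets the larger shift score. Under the sketch's assumption, large scores across many words would be one sign of the degeneration described above.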