Abstract: Instruction tuning (supervised fine-tuning using instruction-response pairs) is a foundational step in transitioning pre-trained Large Language Models (LLMs) into helpful and safe chat assistants. Our hypothesis is that establishing an adequate output space can enable such a transition given the capabilities inherent in pre-trained LLMs. To verify this, we propose Response Tuning (RT), which eliminates the instruction-conditioning step in instruction tuning and solely focuses on response space supervision. Our experiments demonstrate that RT models, trained only using responses, can effectively respond to a wide range of instructions and exhibit helpfulness comparable to that of their instruction-tuned counterparts. Furthermore, we observe that controlling the training response distribution can significantly improve their user preference or elicit target behaviors such as refusing assistance for unsafe queries. Our findings illuminate the role of establishing an adequate output space in alignment, highlighting the potential of the extensive inherent capabilities of pre-trained LLMs.

What problem does this paper attempt to address?

This paper asks whether a pre-trained large language model (LLM) can be turned into a helpful and safe chat assistant without instruction tuning, simply by establishing an appropriate output space (i.e., response space). It proposes a method called "Response Tuning" (RT), which omits the instruction-conditioning step of conventional instruction-response tuning and supervises only the responses. The authors hypothesize that pre-trained LLMs already possess capabilities such as following instructions and assessing safety, and that an appropriately defined response space is enough to elicit them, so that the model responds to a wide range of instructions with helpfulness comparable to that of an instruction-tuned model. The main contributions of the paper are:
1. Proposing the Response Tuning (RT) method and showing that pre-trained LLMs can generate responses that meet human needs when only an appropriate output space is established.
2. Demonstrating through extensive experimental evaluation that RT models handle a wide range of instructions effectively, which indicates that most instruction-following capabilities may already have been acquired during pre-training.
3. Showing that controlling the training response distribution can further improve user preference and safety. For example, refining response attributes or adding a small number of safe refusal examples can significantly improve the model's performance.
4. Emphasizing the importance of the inherent capabilities of pre-trained LLMs and the role that establishing an appropriate response space plays in alignment.
Overall, the paper explores how LLMs trained with response-only supervision, without paired instructions, can be aligned with human needs while remaining safe and helpful.
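The difference between instruction tuning and Response Tuning at the data level can be sketched as follows. This is a minimal illustration, not the authors' implementation: `toy_tokenize`, `build_it_example`, `build_rt_example`, and `mix_refusals` are hypothetical helpers, the tokenizer is a stand-in, and the chat template a real setup would use is omitted. The label value -100 is the default `ignore_index` of PyTorch's cross-entropy loss and is used here only to mark positions that receive no supervision.

```python
import random
from typing import Dict, List

IGNORE = -100  # positions with this label are assumed to be skipped by the loss


def toy_tokenize(text: str) -> List[int]:
    """Hypothetical stand-in tokenizer: one deterministic id per whitespace token."""
    return [sum(ord(c) for c in tok) % 1000 for tok in text.split()]


def build_it_example(instruction: str, response: str) -> Dict[str, List[int]]:
    """Instruction tuning: the input contains the instruction, loss covers the response."""
    prompt_ids = toy_tokenize(instruction)
    response_ids = toy_tokenize(response)
    return {
        "input_ids": prompt_ids + response_ids,
        "labels": [IGNORE] * len(prompt_ids) + response_ids,
    }


def build_rt_example(response: str) -> Dict[str, List[int]]:
    """Response Tuning: no instruction in the input, every token is a supervised response token."""
    response_ids = toy_tokenize(response)
    return {"input_ids": response_ids, "labels": list(response_ids)}


def mix_refusals(responses: List[str], refusals: List[str],
                 fraction: float, seed: int = 0) -> List[str]:
    """Add a small share of refusal responses to shift the training response distribution."""
    rng = random.Random(seed)
    n_extra = max(1, round(len(responses) * fraction)) if refusals else 0
    mixed = list(responses) + [rng.choice(refusals) for _ in range(n_extra)]
    rng.shuffle(mixed)
    return mixed


if __name__ == "__main__":
    it = build_it_example("Say hi", "Hello there friend")
    rt = build_rt_example("Hello there friend")
    print("IT labels:", it["labels"])
    print("RT labels:", rt["labels"])
```

In this sketch both variants supervise exactly the same response tokens; the only difference is whether the instruction is present as conditioning context. The `fraction` argument of `mix_refusals` is an illustrative knob, not a value taken from the paper.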

Response Tuning: Aligning Large Language Models without Instruction

R-Tuning: Instructing Large Language Models to Say 'I Don't Know'

Instruction-tuning Aligns LLMs to the Human Brain

From Language Modeling to Instruction Following: Understanding the Behavior Shift in LLMs after Instruction Tuning

Investigating Multilingual Instruction-Tuning: Do Polyglot Models Demand for Multilingual Instructions?

Instruction Tuning for Large Language Models: A Survey

A Closer Look at the Limitations of Instruction Tuning

I Learn Better If You Speak My Language: Understanding the Superior Performance of Fine-Tuning Large Language Models with LLM-Generated Responses

Instruction Pre-Training: Language Models are Supervised Multitask Learners

Know the Unknown: An Uncertainty-Sensitive Method for LLM Instruction Tuning

OPTune: Efficient Online Preference Tuning

Multilingual Instruction Tuning With Just a Pinch of Multilinguality

Multi-Task Instruction Tuning of LLaMa for Specific Scenarios: A Preliminary Study on Writing Assistance

Teaching Language Models to Self-Improve by Learning from Language Feedback

Fine-tuning Large Language Models with Sequential Instructions

Stronger Models are NOT Stronger Teachers for Instruction Tuning

Demystifying Instruction Mixing for Fine-tuning Large Language Models

X-Instruction: Aligning Language Model in Low-resource Languages with Self-curated Cross-lingual Instructions

From Base to Conversational: Japanese Instruction Dataset and Tuning Large Language Models

Non-instructional Fine-tuning: Enabling Instruction-Following Capabilities in Pre-trained Language Models without Instruction-Following Data

Instruction Following without Instruction Tuning