Beyond Natural Language: LLMs Leveraging Alternative Formats for Enhanced Reasoning and Communication

Weize Chen, Chenfei Yuan, Jiarui Yuan, Yusheng Su, Chen Qian, Cheng Yang, Ruobing Xie, Zhiyuan Liu, Maosong Sun
2024-06-19
Abstract: Natural language (NL) has long been the predominant format for human cognition and communication, and by extension, has been similarly pivotal in the development and application of Large Language Models (LLMs). Yet, besides NL, LLMs have seen various non-NL formats during pre-training, such as code and logical expressions. NL's status as the optimal format for LLMs, particularly in single-LLM reasoning and multi-agent communication, has not been thoroughly examined. In this work, we challenge the default use of NL by exploring the utility of non-NL formats in these contexts. We show that allowing LLMs to autonomously select the most suitable format before reasoning or communicating leads to a 3.3% to 5.7% improvement in reasoning efficiency for different LLMs, and up to a 72.7% reduction in token usage in multi-agent communication, all while maintaining communicative effectiveness. Our comprehensive analysis further reveals that LLMs can devise a format from limited task instructions and that the devised format is effectively transferable across different LLMs. Intriguingly, the structured communication format decided by LLMs exhibits notable parallels with established agent communication languages, suggesting a natural evolution towards efficient, structured communication in agent communication. Our code is released at https://github.com/thunlp/AutoForm.
Computation and Language, Artificial Intelligence
What problem does this paper attempt to address?
The paper asks whether natural language (NL) is the optimal format for large language models (LLMs) in single-LLM reasoning and multi-agent communication and, if it is not, how a more suitable format can be chosen. Specifically, it examines three aspects:

1. **Limitations of Natural Language**: NL dominates human cognition and communication and has been central to the development and application of LLMs, but it is not always the optimal format. In single-LLM reasoning and multi-agent communication, non-NL formats such as code or logical expressions may be more effective.
2. **Effectiveness of Non-Natural Language Formats**: Experiments show that letting LLMs autonomously choose the most suitable format before reasoning or communicating improves efficiency. Reasoning efficiency improved by 3.3% to 5.7% across different LLMs, and token usage in multi-agent communication fell by up to 72.7%, with communicative effectiveness maintained.
3. **Mechanism for Format Selection**: The paper proposes a simple method, AutoForm (Automatic Format Decision), which adds instructions to the original Chain-of-Thought (CoT) prompt that guide the LLM to explore and select a non-NL format suited to the current task.

In summary, the paper challenges the status of NL as the default format and explores the potential of non-NL formats to improve LLM reasoning and communication efficiency.
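
The format-selection mechanism can be illustrated with a minimal Python sketch. It assumes the method amounts to adding one extra instruction to a standard CoT prompt. The wording of `FORMAT_INSTRUCTION` and the names `build_prompt` and `token_reduction` are hypothetical and are not taken from the paper or its released code.

```python
# Minimal sketch of a format-selection prompt, under stated assumptions.
# The instruction text below is illustrative, not the paper's actual prompt.

COT_SUFFIX = "Let's think step by step."

FORMAT_INSTRUCTION = (
    "Before answering, decide which format (for example natural language, "
    "code, a table, or logical expressions) best suits this task, "
    "then reason in that format."
)


def build_prompt(question: str, select_format: bool = True) -> str:
    """Return a CoT prompt, optionally preceded by a format-selection instruction."""
    parts = [question.strip()]
    if select_format:
        # The only change relative to a plain CoT prompt is this extra line.
        parts.append(FORMAT_INSTRUCTION)
    parts.append(COT_SUFFIX)
    return "\n".join(parts)


def token_reduction(baseline_tokens: int, new_tokens: int) -> float:
    """Percentage reduction in token usage relative to a baseline count."""
    if baseline_tokens <= 0:
        raise ValueError("baseline_tokens must be positive")
    return 100.0 * (baseline_tokens - new_tokens) / baseline_tokens
```

Calling `build_prompt(question, select_format=False)` yields the plain CoT baseline, so the two variants can be compared on the same questions. `token_reduction` shows how a figure such as the reported 72.7% would be computed from two token counts; the counts themselves would have to come from real runs.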