kNN-ICL: Compositional Task-Oriented Parsing Generalization with Nearest Neighbor In-Context Learning

Wenting Zhao, Ye Liu, Yao Wan, Yibo Wang, Qingyang Wu, Zhongfen Deng, Jiangshu Du, Shuaiqi Liu, Yunlong Xu, Philip S. Yu
2023-12-18
Abstract: Task-Oriented Parsing (TOP) enables conversational assistants to interpret user commands expressed in natural language, transforming them into structured outputs that combine elements of both natural language and intent/slot tags. Recently, Large Language Models (LLMs) have achieved impressive performance in synthesizing computer programs based on a natural language prompt, mitigating the gap between natural language and structured programs. Our paper focuses on harnessing the capabilities of LLMs for semantic parsing tasks, addressing the following three key research questions: 1) How can LLMs be effectively utilized for semantic parsing tasks? 2) What defines an effective prompt? and 3) How can LLMs overcome the length constraint and streamline prompt design by including all examples as prompts? We introduce k Nearest Neighbor In-Context Learning (kNN-ICL), which simplifies prompt engineering by allowing it to be built on top of any design strategy while providing access to all demo examples. Extensive experiments show that: 1) Simple ICL without kNN search can achieve performance comparable to strong supervised models on the TOP tasks, and 2) kNN-ICL significantly improves the comprehension of complex requests by seamlessly integrating ICL with a nearest-neighbor approach. Notably, this enhancement is achieved without the need for additional data or specialized prompts.
Computation and Language
What problem does this paper attempt to address?
The paper addresses several key issues in Task-Oriented Parsing (TOP), in particular how to leverage Large Language Models (LLMs) effectively to improve performance on TOP tasks. It focuses on three questions:

1. **How can LLMs be used effectively for semantic parsing tasks?** The paper recasts the TOP task as a code generation task, mapping the semantic parse tree to Python-style API code so that it fits the output format of LLMs.
2. **What kind of prompt design is most effective for using LLMs in TOP tasks?** The research analyzes how different prompt design strategies affect LLM performance, including the use of API documentation and example selection methods (random selection, unsupervised selection based on SentenceBERT, and supervised selection based on imitation).
3. **How can the length limitations of LLMs be overcome and prompt design simplified so that all examples serve as prompts?** The paper proposes k-Nearest Neighbors In-Context Learning (kNN-ICL), which gives the LLM access to all available examples during inference, thereby improving model performance.

Through this work, the paper explores how to better use LLMs for complex natural language understanding tasks, especially in the TOP scenario. By improving prompt design strategies and introducing kNN-ICL, it enhances the model's ability to understand complex requests without additional data or specialized prompt design. Experimental results show that kNN-ICL significantly improves performance on TOP tasks, particularly in handling nested structures in APIs.
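To make points 1 and 2 concrete, here is a minimal sketch of turning a TOP-style bracketed parse into a nested Python-style call and of picking the most similar demonstrations for a prompt. Several things here are assumptions rather than the paper's actual setup: the output format produced by `top_to_api`, the intent and slot names in `POOL`, and the prompt layout are illustrative only, and the token-overlap cosine similarity is a simple stand-in for SentenceBERT embeddings. The sketch shows plain similarity-based example selection, not the kNN-ICL method itself.

```python
import math
import re
from collections import Counter


def top_to_api(parse: str) -> str:
    """Convert a bracketed parse into a nested call string (assumed format)."""
    tokens = parse.split()
    pos = 0

    def node() -> str:
        nonlocal pos
        kind, name = tokens[pos][1:].split(":", 1)  # "[IN:X" -> ("IN", "X")
        pos += 1
        children, words = [], []
        while tokens[pos] != "]":
            if tokens[pos].startswith("["):
                children.append(node())
            else:
                words.append(tokens[pos])
                pos += 1
        pos += 1  # consume the closing "]"
        if kind == "IN":
            # Intent: a call whose keyword arguments are its slots.
            return f"{name}({', '.join(children)})"
        # Slot: a nested intent call if present, otherwise the quoted span.
        value = children[0] if children else repr(" ".join(words))
        return f"{name}={value}"

    return node()


def embed(text: str) -> Counter:
    """Bag-of-words vector; a stand-in for a sentence encoder."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values()))
    norm *= math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def select_demos(query: str, pool: list, k: int) -> list:
    """Return the k (utterance, parse) pairs most similar to the query."""
    q = embed(query)
    return sorted(pool, key=lambda d: cosine(q, embed(d[0])), reverse=True)[:k]


def build_prompt(query: str, pool: list, k: int = 2) -> str:
    """Lay out the selected demos as comment + call, then the query."""
    lines = []
    for utterance, parse in select_demos(query, pool, k):
        lines += [f"# {utterance}", top_to_api(parse), ""]
    lines.append(f"# {query}")
    return "\n".join(lines)


# Hypothetical demonstration pool; labels are illustrative only.
POOL = [
    ("what is the weather in Boston",
     "[IN:GET_WEATHER what is the weather in [SL:LOCATION Boston ] ]"),
    ("set an alarm for 7 am",
     "[IN:CREATE_ALARM set an alarm [SL:DATE_TIME for 7 am ] ]"),
    ("directions to the concert",
     "[IN:GET_DIRECTIONS directions to [SL:DESTINATION "
     "[IN:GET_EVENT the [SL:NAME_EVENT concert ] ] ] ]"),
]

if __name__ == "__main__":
    print(build_prompt("weather in Seattle", POOL, k=1))
```

The nested example in `POOL` shows why a call-style target suits compositional requests: a slot whose value is itself an intent becomes a call inside a call. Swapping `embed` for a real sentence encoder would change only the similarity scores, not the rest of the pipeline.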