Enhancing Supermarket Robot Interaction: A Multi-Level LLM Conversational Interface for Handling Diverse Customer Intents

Chandran Nandkumar, Luka Peternel
2024-06-17
Abstract: This paper presents the design and evaluation of a novel multi-level LLM interface for supermarket robots to assist customers. The proposed interface allows customers to convey their needs through both generic and specific queries. While state-of-the-art systems like OpenAI's GPTs are highly adaptable and easy to build and deploy, they still face challenges such as increased response times and limitations in strategic control of the underlying model for tailored use-case and cost optimization. Driven by the goal of developing faster and more efficient conversational agents, this paper advocates for using multiple smaller, specialized LLMs fine-tuned to handle different user queries based on their specificity and user intent. We compare this approach to a specialized GPT model powered by GPT-4 Turbo, using the Artificial Social Agent Questionnaire (ASAQ) and qualitative participant feedback in a counterbalanced within-subjects experiment. Our findings show that our multi-LLM chatbot architecture outperformed the benchmarked GPT model across all 13 measured criteria, with statistically significant improvements in four key areas: performance, user satisfaction, user-agent partnership, and self-image enhancement. The paper also presents a method for supermarket robot navigation by mapping the final chatbot response to correct shelf numbers, enabling the robot to sequentially navigate towards the respective products, after which lower-level robot perception, control, and planning can be used for automated object retrieval. We hope this work encourages more efforts into using multiple, specialized smaller models instead of relying on a single powerful, but more expensive and slower model.
Robotics, Artificial Intelligence
What problem does this paper attempt to address?
### Problems the Paper Attempts to Solve

This paper aims to design and evaluate a new multi-tiered large language model (LLM) interface to enhance the customer interaction capabilities of supermarket robots. Specifically, the paper attempts to address the following issues:

1. **Response Time and Performance Optimization**: Existing advanced systems like OpenAI's GPTs, while easy to build and deploy, have limitations in response time, cost optimization, and strategic control for specific use cases. The paper proposes using multiple smaller, specialized LLMs to handle different types of user queries to improve response speed and efficiency.
2. **Diversity and Complexity of User Intentions**: Supermarket robots need to handle a variety of user needs, from simple product inquiries to complex high-level intentions (such as recommending dinner menus or items needed for a party). Existing single powerful models may hallucinate or make errors when dealing with such diversity, which affects user trust.
3. **User Experience and Satisfaction**: The paper evaluates the performance of the multi-tiered LLM architecture versus a customized model based on GPT-4 Turbo in terms of user satisfaction, user-agent partnership, and self-image enhancement to verify the effectiveness of the new approach.
4. **Navigation and Automated Task Execution**: The paper also proposes a method to map the final chatbot responses to the correct shelf numbers, enabling the robot to sequentially navigate to the corresponding product locations and use low-level robot perception, control, and planning for automated object retrieval.

### Main Contributions

- **Multi-tiered LLM Architecture**: Designed a multi-tiered LLM architecture that improves system response speed and efficiency by using multiple specialized LLMs to handle different types of user queries.
- **Experimental Evaluation**: Conducted comparative experiments using the Artificial Social Agent Questionnaire (ASAQ) and qualitative feedback to verify the superiority of the multi-tiered LLM architecture across multiple evaluation metrics.
- **Navigation and Automation**: Proposed a method that combines chatbot responses with robot navigation to achieve automated task execution.

### Research Question

The main research question of the paper is: How does our novel multi-tiered LLM conversational agent perform on the Artificial Social Agent Questionnaire (ASAQ) and in qualitative evaluations compared to a customized advanced GPT model?
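
### Illustrative Sketch

The two mechanisms summarized above, routing a query by its specificity and mapping the final chatbot response to shelf numbers, can be illustrated with a minimal Python sketch. This is not the authors' implementation. The product-to-shelf table `SHELF_OF`, the function names, and the keyword rule standing in for the fine-tuned classifier LLM are all assumptions made for illustration. The visiting order used here (order of mention in the response) is also an assumption.

```python
import re

# Hypothetical product-to-shelf table; the real store layout is not given in the text.
SHELF_OF = {
    "pasta": 4,
    "tomato sauce": 4,
    "parmesan": 7,
    "milk": 7,
    "bread": 2,
}


def classify_query(query: str) -> str:
    """Stand-in for the specificity classifier (a keyword rule, not an LLM).

    A query that names a known product is treated as "specific";
    anything else (e.g. a request for dinner ideas) as "generic".
    """
    text = query.lower()
    if any(product in text for product in SHELF_OF):
        return "specific"
    return "generic"


def products_in_response(response: str) -> list[str]:
    """Return known products mentioned in the response, in order of mention."""
    text = response.lower()
    hits = []
    for product in SHELF_OF:
        match = re.search(r"\b" + re.escape(product) + r"\b", text)
        if match:
            hits.append((match.start(), product))
    return [product for _, product in sorted(hits)]


def shelf_route(response: str) -> list[int]:
    """Map mentioned products to shelf numbers, skipping shelves already queued."""
    route = []
    for product in products_in_response(response):
        shelf = SHELF_OF[product]
        if shelf not in route:
            route.append(shelf)
    return route


if __name__ == "__main__":
    print(classify_query("Where is the milk?"))
    print(shelf_route("For dinner, pick up pasta, tomato sauce and parmesan."))
```

In a real system the list returned by `shelf_route` would be handed to the robot's navigation stack one shelf at a time, and a smarter ordering (for example, shortest path through the store) could replace the order-of-mention rule used here.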