Decision-Making in Robotic Grasping with Large Language Models.

Jianfeng Liao, Haoyang Zhang, Haofu Qian, Qiwei Meng, Yinan Sun, Yao Sun, Wei Song, Shiqiang Zhu, Jason Gu
DOI: https://doi.org/10.1007/978-981-99-6495-6_36
2023-01-01
Abstract: Recent advances in large language models have highlighted their potential to encode massive amounts of semantic knowledge for long-term autonomous decision-making, positioning them as a promising solution for powering the cognitive capabilities of future home-assistant robots. However, while large language models can provide high-level decisions, there is still no unified paradigm for integrating them with robots’ perception and low-level actions. In this paper, we propose a framework centered around a large language model, integrated with visual perception and motion planning modules, to investigate the robotic grasping task. Unlike traditional methods that focus only on generating stable grasps, our proposed approach can handle personalized user instructions and perform tasks more effectively in home scenarios. Our approach integrates existing state-of-the-art models in a simple and effective way, without requiring any fine-tuning, which makes it low-cost and easy to deploy. Experiments on a physical robot system demonstrate the feasibility of our approach.