Grasp-Anything: Large-scale Grasp Dataset from Foundation Models

An Dinh Vuong, Minh Nhat Vu, Hieu Le, Baoru Huang, Binh Huynh, Thieu Vo, Andreas Kugi, Anh Nguyen
DOI: https://doi.org/10.48550/arXiv.2309.09818
2023-09-18
Abstract: Foundation models such as ChatGPT have made significant strides in robotic tasks due to their universal representation of real-world domains. In this paper, we leverage foundation models to tackle grasp detection, a persistent challenge in robotics with broad industrial applications. Despite numerous grasp datasets, their object diversity remains limited compared to real-world figures. Fortunately, foundation models possess an extensive repository of real-world knowledge, including objects we encounter in our daily lives. As a consequence, a promising solution to the limited representation in previous grasp datasets is to harness the universal knowledge embedded in these foundation models. We present Grasp-Anything, a new large-scale grasp dataset synthesized from foundation models to implement this solution. Grasp-Anything excels in diversity and magnitude, boasting 1M samples with text descriptions and more than 3M objects, surpassing prior datasets. Empirically, we show that Grasp-Anything successfully facilitates zero-shot grasp detection on vision-based tasks and real-world robotic experiments. Our dataset and code are available at https://grasp-anything-2023.github.io.
Robotics, Computer Vision and Pattern Recognition