A Survey on Robotics with Foundation Models: toward Embodied AI

Zhiyuan Xu, Kun Wu, Junjie Wen, Jinming Li, Ning Liu, Zhengping Che, Jian Tang
2024-02-04
Abstract: While the exploration of embodied AI has spanned multiple decades, it remains a persistent challenge to endow agents with human-level intelligence, including perception, learning, reasoning, decision-making, control, and generalization capabilities, so that they can perform general-purpose tasks in open, unstructured, and dynamic environments. Recent advances in computer vision, natural language processing, and multi-modality learning have shown that foundation models have superhuman capabilities for specific tasks. They not only provide a solid cornerstone for integrating basic modules into embodied AI systems but also shed light on how to scale up robot learning from a methodological perspective. This survey aims to provide a comprehensive and up-to-date overview of foundation models in robotics, focusing on autonomous manipulation and encompassing high-level planning and low-level control. Moreover, we showcase their commonly used datasets, simulators, and benchmarks. Importantly, we emphasize the critical challenges intrinsic to this field and delineate potential avenues for future research, contributing to advancing the frontier of academic and industrial discourse.
Robotics, Artificial Intelligence
What problem does this paper attempt to address?
The paper examines how to apply foundation models to robotics, especially autonomous manipulation, so that robots can act more intelligently in open, dynamic environments. Current methods are limited in generalization and transferability, whereas pre-trained foundation models demonstrate superhuman capabilities on specific tasks. The paper discusses how to integrate these models into decision-making for both high-level planning (such as understanding commands, perceiving the environment, and reasoning about complex tasks) and low-level control (such as precise execution of control parameters). It also covers datasets, simulators, and benchmarks, along with open challenges and directions for future research, with the aim of advancing the frontier of robot learning.