Position Paper: Agent AI Towards a Holistic Intelligence

Qiuyuan Huang, Naoki Wake, Bidipta Sarkar, Zane Durante, Ran Gong, Rohan Taori, Yusuke Noda, Demetri Terzopoulos, Noboru Kuno, Ade Famoti, Ashley Llorens, John Langford, Hoi Vo, Li Fei-Fei, Katsu Ikeuchi, Jianfeng Gao
2024-02-29
Abstract: Recent advancements in large foundation models have remarkably enhanced our understanding of sensory information in open-world environments. In leveraging the power of foundation models, it is crucial for AI research to pivot away from excessive reductionism and toward an emphasis on systems that function as cohesive wholes. Specifically, we emphasize developing Agent AI -- an embodied system that integrates large foundation models into agent actions. The emerging field of Agent AI spans a wide range of existing embodied and agent-based multimodal interactions, including robotics, gaming, and healthcare systems, etc. In this paper, we propose a novel large action model to achieve embodied intelligent behavior, the Agent Foundation Model. On top of this idea, we discuss how agent AI exhibits remarkable capabilities across a variety of domains and tasks, challenging our understanding of learning and cognition. Furthermore, we discuss the potential of Agent AI from an interdisciplinary perspective, underscoring AI cognition and consciousness within scientific discourse. We believe that those discussions serve as a basis for future research directions and encourage broader societal engagement.
Artificial Intelligence
What problem does this paper attempt to address?
The paper addresses two related problems: the loss of sight of overall goals caused by excessive fragmentation in current artificial intelligence research, and the question of how to build a more comprehensive AI system by integrating multimodal data and cross-domain applications. Specifically, the paper emphasizes the importance of developing **Agent AI**, an embodied system that integrates large foundation models (such as large language models and vision-language models) into agent actions. Agent AI aims at higher-level intelligent behavior by combining learning, memory, action, perception, planning, and cognition, so that it can perform tasks in both physical and virtual worlds. The paper proposes a new large action model, the Agent Foundation Model, to support this comprehensive intelligence. It also explores the application potential of Agent AI in fields such as robotics, gaming, and healthcare, as well as its potential impact on AI cognition and consciousness. The main contributions of the paper are as follows:

1. **Proposing a new paradigm of Agent AI**: Emphasizing the importance of achieving a more comprehensive artificial intelligence system through the integration of multimodal data and cross-domain applications.
2. **Agent Foundation Model**: Introducing a new large action model aimed at supporting the realization of comprehensive intelligence.
3. **Interdisciplinary discussion**: Discussing the potential of Agent AI from the perspectives of multiple disciplines, including neuroscience, biology, physics, biophysics, cognitive science, healthcare, and moral philosophy.
4. **Future research directions**: Exploring the development directions of Agent AI, including the ethical challenges that need to be addressed.

Through these discussions, the paper aims to illustrate how these technologies bring AI agents closer to artificial general intelligence (AGI) and holistic intelligence (HI).