AI Planning: A Primer and Survey (Preliminary Report)

Dillon Z. Chen, Pulkit Verma, Siddharth Srivastava, Michael Katz, Sylvie Thiébaux
2024-12-07
Abstract: Automated decision-making is a fundamental topic that spans multiple sub-disciplines in AI: reinforcement learning (RL), AI planning (AP), foundation models, and operations research, among others. Despite recent efforts to "bridge the gaps" between these communities, there remain many insights that have not yet transcended the boundaries. Our goal in this paper is to provide a brief and non-exhaustive primer on ideas well-known in AP, but less so in other sub-disciplines. We do so by introducing the classical AP problem and representation, and extensions that handle uncertainty and time through the Markov Decision Process formalism. Next, we survey state-of-the-art techniques and ideas for solving AP problems, focusing on their ability to exploit problem structure. Lastly, we cover subfields within AP for learning structure from unstructured inputs and learning to generalise to unseen scenarios and situations.
Artificial Intelligence
What problem does this paper attempt to address?
This paper aims to bridge the gap between sub-fields of artificial intelligence (AI) that study automated decision-making, in particular reinforcement learning (RL) and AI planning (AP). Both fields address autonomous decision-making, but their methods and tools differ significantly:

1. **Reinforcement Learning (RL)**: discovers the best course of action by interacting with an unknown environment and following a reward signal.
2. **AI Planning (AP)**: reasons over a structured model to solve long-horizon problems with sparse rewards.

To help readers understand and use the strengths of both fields, the authors provide a concise rather than exhaustive introduction, focusing on concepts and methods that are well known in AP but less often discussed in other sub-fields. Specifically, the goals of the paper are to:

- **Introduce the classical AP problem and its representation**, and extend it through the Markov Decision Process (MDP) formalism to handle uncertainty and time.
- **Survey state-of-the-art techniques for solving AP problems**, especially those that exploit problem structure.
- **Cover the sub-fields concerned with learning structure from unstructured inputs and generalizing to unseen scenarios and situations**.

### Specific Problem Description

The paper addresses the problem through the following aspects:

1. **Formalization and Representation**:
   - Introduces the classical AP problem and its representation, which uses first-order logic and the closed-world assumption to compactly encode combinatorially large world models.
   - Covers expressive extensions for handling uncertainty, time, and environmental processes.
2. **Taking Advantage of Structure**:
   - Emphasizes how exploiting the structure exposed by a structured representation improves the efficiency and effectiveness of decision-making.
   - Mentions analogies to common RL topics, such as the attention mechanism and automatic abstraction.
3. **Combination of Learning and Planning**:
   - Explores how learning automatic abstractions and discovering structure can make planning computationally feasible.
   - Discusses methods that use learning to make planning faster, such as structural priors.
4. **Generalization Ability**:
   - Covers techniques that handle multiple problems, classified by their level of access to the set of MDPs.
   - Includes sub-fields such as generalised planning (GP), learning for planning (L4P), and learning planning models (LPM).

### Summary

Overall, the paper attempts to bridge the gap between RL and AP by systematically introducing and comparing the methods and techniques of the two fields. It provides a theoretical basis and also surveys recent progress and techniques, with the aim of helping researchers better understand and use the strengths of both fields.
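To make the classical AP representation mentioned above more concrete, here is a minimal sketch in Python. It is not taken from the paper: the toy delivery domain, the fact and action names (`robot-at`, `pkg-at`, `holding`, `move`, `pick`, `drop`), and the helper functions are all hypothetical, and plain breadth-first search stands in for the far more sophisticated, structure-exploiting solvers the paper surveys. States are sets of true facts (under the closed-world assumption, anything not listed is false), and each grounded action has preconditions, add effects, and delete effects.

```python
from collections import deque
from typing import FrozenSet, List, NamedTuple, Optional

Fact = str
State = FrozenSet[Fact]  # closed-world assumption: facts not in the set are false


class Action(NamedTuple):
    name: str
    pre: FrozenSet[Fact]     # facts that must hold before the action
    add: FrozenSet[Fact]     # facts the action makes true
    delete: FrozenSet[Fact]  # facts the action makes false


def ground_actions(rooms: List[str]) -> List[Action]:
    """Ground the hypothetical schemas move/pick/drop over rooms laid out in a line."""
    actions = []
    for x, y in zip(rooms, rooms[1:]):
        for src, dst in ((x, y), (y, x)):  # adjacent rooms are connected both ways
            actions.append(Action(
                f"move({src},{dst})",
                frozenset({f"robot-at({src})"}),
                frozenset({f"robot-at({dst})"}),
                frozenset({f"robot-at({src})"}),
            ))
    for r in rooms:
        actions.append(Action(
            f"pick({r})",
            frozenset({f"robot-at({r})", f"pkg-at({r})"}),
            frozenset({"holding"}),
            frozenset({f"pkg-at({r})"}),
        ))
        actions.append(Action(
            f"drop({r})",
            frozenset({f"robot-at({r})", "holding"}),
            frozenset({f"pkg-at({r})"}),
            frozenset({"holding"}),
        ))
    return actions


def apply(state: State, action: Action) -> State:
    """Successor state: remove the delete effects, then add the add effects."""
    return (state - action.delete) | action.add


def bfs_plan(init: State, goal: FrozenSet[Fact],
             actions: List[Action]) -> Optional[List[str]]:
    """Uninformed breadth-first search; returns a shortest plan or None."""
    frontier = deque([(init, [])])
    seen = {init}
    while frontier:
        state, plan = frontier.popleft()
        if goal <= state:  # every goal fact holds in this state
            return plan
        for a in actions:
            if a.pre <= state:  # action is applicable
                nxt = apply(state, a)
                if nxt not in seen:
                    seen.add(nxt)
                    frontier.append((nxt, plan + [a.name]))
    return None


# Toy task: the robot starts in room a and must bring the package from c to a.
init = frozenset({"robot-at(a)", "pkg-at(c)"})
goal = frozenset({"pkg-at(a)"})
plan = bfs_plan(init, goal, ground_actions(["a", "b", "c"]))
print(plan)
```

The reward here is sparse in the sense the summary describes: nothing signals progress until the goal fact holds, so the search has to rely on the model (preconditions and effects) rather than on feedback from the environment. Blind search of this kind scales poorly, which is why exploiting structure in the representation matters.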