Abstract: Despite their recent advancements, Large Language Models (LLMs) still struggle to directly generate correct plans for complex multi-constraint planning problems, even with self-verification and self-critique. For example, on TravelPlanner, a U.S. domestic travel planning benchmark proposed by Xie et al. (2024), the best LLM, OpenAI o1-preview, finds travel plans that satisfy user requirements with only a 10% success rate, even when given all needed information. In this work, we tackle this difficult problem by proposing an LLM-based planning framework that formalizes complex multi-constraint planning problems as constrained satisfiability problems, which are then solved by sound and complete satisfiability solvers. We start with TravelPlanner as the primary use case and achieve a success rate of 93.9%. We demonstrate our framework's robustness by showing its effectiveness on diverse paraphrased prompts. More importantly, our framework has strong zero-shot generalizability: it can successfully handle unseen constraints in a completely unseen international travel dataset we created, and it can even generalize well to new domains such as routing and task allocation problems in a zero-shot manner. Moreover, when user input queries are infeasible, our framework can identify the unsatisfiable core, provide failure reasons, and offer personalized modification suggestions to users according to diverse human preferences. We show that our framework can modify and solve an average of 81.6% and 91.7% of the unsatisfiable queries from two datasets, and we show with ablations that all key components of our framework are effective and necessary.
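To make the idea of planning as constrained satisfiability more concrete, the following is a minimal sketch and not the framework's actual implementation. It assumes a tiny, made-up travel domain (the flight and hotel tables, the constraint names, and the functions `make_constraints`, `solve`, and `unsat_core` are all hypothetical), and it substitutes exhaustive enumeration over a finite plan space for a real satisfiability solver. It illustrates two steps the abstract describes: finding a plan that satisfies every constraint, and, when none exists, extracting a minimal set of conflicting constraints that could be reported back to the user.

```python
from itertools import product

# Hypothetical toy data (not from the paper): cost and attributes per option.
FLIGHTS = {"F1": {"cost": 300, "nonstop": True},
           "F2": {"cost": 120, "nonstop": False}}
HOTELS = {"H1": {"cost": 200, "pets": True},
          "H2": {"cost": 90, "pets": False}}


def make_constraints(budget, need_pets=False, need_nonstop=False):
    """Encode a user query as named predicates over a candidate plan."""
    cons = {
        "budget": lambda p: (FLIGHTS[p["flight"]]["cost"]
                             + HOTELS[p["hotel"]]["cost"]) <= budget
    }
    if need_pets:
        cons["pets"] = lambda p: HOTELS[p["hotel"]]["pets"]
    if need_nonstop:
        cons["nonstop"] = lambda p: FLIGHTS[p["flight"]]["nonstop"]
    return cons


def solve(constraints):
    """Enumerate every plan; return the first one meeting all constraints.

    Exhaustive search is sound and complete on this finite space, which is
    the property a real solver would provide at scale.
    """
    for flight, hotel in product(FLIGHTS, HOTELS):
        plan = {"flight": flight, "hotel": hotel}
        if all(check(plan) for check in constraints.values()):
            return plan
    return None  # no plan satisfies the query


def unsat_core(constraints):
    """Deletion-based minimization of an unsatisfiable constraint set.

    Call only when solve(constraints) is None. Each constraint is dropped
    in turn; it stays dropped if the remainder is still unsatisfiable.
    """
    core = dict(constraints)
    for name in list(core):
        trial = {k: v for k, v in core.items() if k != name}
        if solve(trial) is None:
            core = trial
    return sorted(core)


feasible = solve(make_constraints(budget=350, need_pets=True))
infeasible = make_constraints(budget=250, need_pets=True)
conflict = unsat_core(infeasible) if solve(infeasible) is None else []
```

In this toy domain, `feasible` is the plan pairing flight `F2` with hotel `H1`, while the second query has no solution and `conflict` names the `budget` and `pets` constraints as jointly unsatisfiable. A suggestion step could then propose relaxing one of them, for example raising the budget or dropping the pet requirement.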

Large Language Models Can Solve Real-World Planning Rigorously with Formal Verification Tools
