Abstract: This work focuses on autonomous contingency planning for scientific missions by enabling rapid policy computation from any off-nominal point in the state space in the event of a delay or deviation from the nominal mission plan. Successful contingency planning involves managing risks and rewards, often probabilistically associated with actions, in stochastic scenarios. Markov Decision Processes (MDPs) are used to mathematically model decision-making in such scenarios. However, in the specific case of planetary rover traverse planning, the vast action space and long planning time horizon pose computational challenges. A bi-level MDP framework is proposed to improve computational tractability, while also aligning with existing mission planning practices and enhancing explainability and trustworthiness of AI-driven solutions. We discuss the conversion of a mission planning MDP into a bi-level MDP, and test the framework on RoverGridWorld, a modified GridWorld environment for rover mission planning. We demonstrate the computational tractability and near-optimal policies achievable with the bi-level MDP approach, highlighting the trade-offs between compute time and policy optimality as the problem's complexity grows. This work facilitates more efficient and flexible contingency planning in the context of scientific missions.

What problem does this paper attempt to address?

This paper addresses autonomous contingency planning for scientific missions: when a planetary rover is delayed or deviates from its nominal mission plan, it should be able to compute a new policy quickly from any off-nominal state. Successful contingency planning requires managing risks and rewards that are often probabilistically associated with actions in stochastic scenarios. Markov Decision Processes (MDPs) are used to model this decision-making mathematically. In planetary rover traverse planning, however, the vast action space and long planning time horizon pose computational challenges. The paper therefore proposes a bi-level MDP framework that aims to improve computational tractability, align with existing mission planning practices, and enhance the explainability and trustworthiness of AI-driven solutions. Specifically, the paper addresses these problems through the following points:

1. **Introducing the bi-level MDP framework**: To deal with the computational cost of a single-level MDP over a large state space and a long planning horizon, the paper proposes a bi-level MDP framework. The framework divides decision-making into a strategic level and a tactical level. The strategic level determines the next goal, while the tactical level handles path planning and action execution toward that goal.
2. **Improving computational efficiency**: Decomposing the MDP into two levels significantly reduces the state space and action space at each level, which improves computational efficiency. The paper shows that this approach can generate near-optimal policies in a relatively short time while maintaining a high reward value.
3. **Enhancing explainability and trustworthiness**: The bi-level structure allows human preferences to be introduced, making the decision-making process more transparent and traceable and thereby increasing the acceptance and trustworthiness of AI-driven solutions.

The paper evaluates the bi-level MDP framework on the RoverGridWorld environment in terms of computational efficiency and policy quality, and highlights the trade-off between compute time and policy optimality as the problem's complexity grows.
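The strategic/tactical split described above can be illustrated with a small sketch. This is not the paper's RoverGridWorld or its algorithm: the 5x5 grid, the slip probability `SLIP`, the target rewards in `TARGETS`, and the function names are all hypothetical. It also makes a simplifying assumption, namely that the strategic level treats each traverse as taking its expected duration, so the upper level reduces to a deterministic search over target orderings. The tactical level is solved by ordinary value iteration on a stochastic shortest-path problem.

```python
from functools import lru_cache

# Hypothetical toy problem (assumed values, not taken from the paper).
SIZE = 5                                   # grid is SIZE x SIZE
SLIP = 0.1                                 # probability a move fails and the rover stays put
TARGETS = {(4, 0): 5.0, (0, 4): 3.0, (4, 4): 10.0}   # science targets and their rewards
MOVES = [(1, 0), (-1, 0), (0, 1), (0, -1)]


def step(cell, move):
    """Intended successor cell; a move off the grid leaves the rover in place."""
    x, y = cell[0] + move[0], cell[1] + move[1]
    return (x, y) if 0 <= x < SIZE and 0 <= y < SIZE else cell


@lru_cache(maxsize=None)
def tactical_cost(goal):
    """Tactical level: value iteration for the expected number of steps
    needed to reach `goal` from every cell of the grid."""
    cells = [(x, y) for x in range(SIZE) for y in range(SIZE)]
    cost = {c: 0.0 for c in cells}
    while True:
        new = {}
        for c in cells:
            if c == goal:
                new[c] = 0.0
            else:
                new[c] = 1.0 + min(
                    (1 - SLIP) * cost[step(c, m)] + SLIP * cost[c] for m in MOVES
                )
        delta = max(abs(new[c] - cost[c]) for c in cells)
        cost = new
        if delta < 1e-10:
            return cost


def tactical_action(cell, goal):
    """Greedy tactical move: head for the neighbour with the lowest expected cost."""
    cost = tactical_cost(goal)
    return min(MOVES, key=lambda m: cost[step(cell, m)])


def strategic_plan(cell, remaining, budget):
    """Strategic level: pick the order of remaining targets that maximises
    collected reward, charging each leg its expected tactical cost."""
    best_value, best_plan = 0.0, []
    for goal in sorted(remaining):
        leg = tactical_cost(goal)[cell]
        if leg <= budget:
            value, plan = strategic_plan(goal, remaining - {goal}, budget - leg)
            value += TARGETS[goal]
            if value > best_value:
                best_value, best_plan = value, [goal] + plan
    return best_value, best_plan


if __name__ == "__main__":
    # Nominal plan from the start cell, then a replan from an off-nominal cell
    # with a reduced time budget.
    print(strategic_plan((0, 0), frozenset(TARGETS), 10.0))
    print(strategic_plan((2, 2), frozenset(TARGETS), 5.0))
```

Because the tactical cost table for each goal covers every cell, the strategic search can be rerun from any off-nominal cell and remaining budget without re-solving the lower level, which is the kind of reuse the decomposition is meant to enable. A fuller treatment would keep the strategic level stochastic rather than collapsing each leg to its expected duration.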

Contingency Planning Using Bi-level Markov Decision Processes for Space Missions
