Abstract: Autonomous agents operating in adversarial scenarios face a fundamental challenge: while they may know their adversaries' high-level objectives, such as reaching specific destinations within time constraints, the exact policies these adversaries will employ remain unknown. Traditional approaches address this challenge by treating the adversary's state as a partially observable element, leading to a formulation as a Partially Observable Markov Decision Process (POMDP). However, the induced belief-space dynamics in a POMDP require knowledge of the system's transition dynamics, which, in this case, depend on the adversary's unknown policy. Our key observation is that while an adversary's exact policy is unknown, their behavior is necessarily constrained by their mission objectives and the physical environment, allowing us to characterize the space of possible behaviors without assuming specific policies. In this paper, we develop Task-Aware Behavior Fields (TAB-Fields), a representation that captures adversary state distributions over time by computing the most unbiased probability distribution consistent with known constraints. We construct TAB-Fields by solving a constrained optimization problem that minimizes additional assumptions about adversary behavior beyond mission and environmental requirements. We integrate TAB-Fields with standard planning algorithms by introducing TAB-conditioned POMCP, an adaptation of Partially Observable Monte Carlo Planning. Through experiments in simulation with underwater robots and hardware implementations with ground robots, we demonstrate that our approach achieves superior performance compared to baselines that either assume specific adversary policies or neglect mission constraints altogether. Evaluation videos and code are available at https://tab-fields.github.io.
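As a hedged illustration of what "the most unbiased probability distribution consistent with known constraints" can mean, the textbook maximum-entropy problem over adversary trajectories is written below. The symbols $\tau$ (a trajectory), $\mathcal{T}$ (the finite set of trajectories that satisfy the mission and environmental constraints), $p$, and $b_t$ are notation assumed here for illustration. The hard-constraint form is a simplification and may differ from the paper's exact formulation.

$$
\begin{aligned}
p^\star = \arg\max_{p}\; & -\sum_{\tau} p(\tau)\,\log p(\tau) \\
\text{s.t.}\; & \sum_{\tau} p(\tau) = 1, \qquad p(\tau) \ge 0, \qquad p(\tau) = 0 \ \text{ for all } \tau \notin \mathcal{T}.
\end{aligned}
$$

When feasibility is the only constraint, the maximizer is uniform over the feasible set, and the state distribution at time $t$ follows by counting:

$$
p^\star(\tau) = \frac{1}{|\mathcal{T}|} \ \text{ for } \tau \in \mathcal{T},
\qquad
b_t(s) = \frac{\left|\{\tau \in \mathcal{T} : \tau_t = s\}\right|}{|\mathcal{T}|}.
$$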

What problem does this paper attempt to address?

This paper addresses how autonomous agents (such as robots) can plan effectively in an adversarial environment without knowing their opponents' specific policies. Although an agent may know an opponent's high-level goals (for example, reaching a certain destination within a specific time), the policy the opponent will follow to achieve them is unknown. Traditional methods treat the opponent's state as a partially observable element and formulate the problem as a Partially Observable Markov Decision Process (POMDP). However, belief updates in a POMDP require known transition dynamics, and here those dynamics depend on the opponent's unknown policy.

### Main contributions of the paper

To solve this problem, the authors propose **Task-Aware Behavior Fields (TAB-Fields)**, a representation based on the maximum entropy principle that captures how the opponent's state distribution evolves over time. TAB-Fields compute the least-biased probability distribution by solving a constrained optimization problem. The distribution only has to satisfy the known task and environmental constraints; no specific opponent policy is assumed. This lets autonomous agents plan and make decisions more effectively in an uncertain adversarial environment.

### Specific implementation methods

1. **Constructing TAB-Fields**: Calculate the least-biased probability distribution of the opponent's state by maximizing entropy under the given task and environmental constraints.
2. **Integrating into the planning algorithm**: Combine TAB-Fields with a standard planning algorithm, Partially Observable Monte Carlo Planning (POMCP), to form a new planning method, **TAB-conditioned POMCP**. This method updates the belief state and plans without explicit knowledge of the opponent's policy.

### Experimental verification

Through simulation experiments with underwater robots and hardware experiments with ground robots, the authors show that their method outperforms baseline methods in multiple task scenarios (such as interception and avoidance). The baselines either assume a specific opponent policy or ignore the mission constraints altogether. The reported results show that TAB-conditioned POMCP performs better on the Average Task Completion Rate (ATCR) and also reduces the Average Interception Steps (StI).

### Summary

This paper proposes Task-Aware Behavior Fields (TAB-Fields) to handle uncertainty about opponent behavior in adversarial environments and thereby plan more effectively. The method makes no assumptions about the opponent's specific policy and depends only on task and environmental constraints.
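The maximum-entropy construction described above can be illustrated with a toy sketch. It is not the authors' implementation: the grid, the move set, the absorbing goal, and the function names `successors` and `max_entropy_field` are all assumptions made for this example. Under hard constraints only (avoid obstacles, be at the goal by the final step), the maximum-entropy distribution over trajectories is uniform over the feasible ones, so the per-step state distribution can be obtained by counting feasible paths forward and backward.

```python
from collections import defaultdict

# Hypothetical toy world: '#' marks an obstacle. Everything here is illustrative.
GRID = [
    "....",
    ".#..",
    "....",
]
MOVES = [(0, 0), (1, 0), (-1, 0), (0, 1), (0, -1)]  # wait, down, up, right, left


def successors(cell, goal):
    """Cells reachable in one step; the goal is absorbing (the adversary stops there)."""
    if cell == goal:
        return [goal]
    r, c = cell
    out = []
    for dr, dc in MOVES:
        nr, nc = r + dr, c + dc
        if 0 <= nr < len(GRID) and 0 <= nc < len(GRID[0]) and GRID[nr][nc] != "#":
            out.append((nr, nc))
    return out


def max_entropy_field(start, goal, horizon):
    """Per-step state marginals of the uniform distribution over feasible paths."""
    cells = [
        (r, c)
        for r in range(len(GRID))
        for c in range(len(GRID[0]))
        if GRID[r][c] != "#"
    ]
    # backward[t][s]: number of ways to get from s at step t to the goal at step `horizon`.
    backward = [dict() for _ in range(horizon + 1)]
    for s in cells:
        backward[horizon][s] = 1 if s == goal else 0
    for t in range(horizon - 1, -1, -1):
        for s in cells:
            backward[t][s] = sum(backward[t + 1][n] for n in successors(s, goal))

    total = backward[0][start]  # number of feasible trajectories
    if total == 0:
        raise ValueError("no feasible path reaches the goal within the horizon")

    # forward[t][s]: number of ways to get from the start at step 0 to s at step t.
    forward = [defaultdict(int) for _ in range(horizon + 1)]
    forward[0][start] = 1
    for t in range(horizon):
        for s, count in forward[t].items():
            for n in successors(s, goal):
                forward[t + 1][n] += count

    # Marginal at step t = (feasible paths passing through s at t) / (all feasible paths).
    field = []
    for t in range(horizon + 1):
        dist = {}
        for s in cells:
            through = forward[t][s] * backward[t][s]
            if through > 0:
                dist[s] = through / total
        field.append(dist)
    return field


field = max_entropy_field(start=(0, 0), goal=(2, 3), horizon=6)
for t, dist in enumerate(field):
    print(t, {cell: round(p, 3) for cell, p in dist.items()})
```

Each entry of `field` is a distribution over cells at one time step. A planner could sample opponent states from these distributions in place of a transition model that requires the opponent's policy; how TAB-conditioned POMCP actually does this is not specified in the text above, so that use is only a plausible reading.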

TAB-Fields: A Maximum Entropy Framework for Mission-Aware Adversarial Planning

Active Perception in Adversarial Scenarios using Maximum Entropy Deep Reinforcement Learning

Partially Observable Task and Motion Planning with Uncertainty and Risk Awareness

Capability-aware Task Allocation and Team Formation Analysis for Cooperative Exploration of Complex Environments

Deceptive Planning for Resource Allocation

Game-theoretic Objective Space Planning

Model-free Motion Planning of Autonomous Agents for Complex Tasks in Partially Observable Environments

Weathering Ongoing Uncertainty: Learning and Planning in a Time-Varying Partially Observable Environment

Learning Coordinated Maneuver in Adversarial Environments

Multi-Agent Path Planning Under Observation Schedule Constraints

Asymptotically Optimal Belief Space Planning in Discrete Partially-Observable Domains

A Unified Framework for Planning in Adversarial and Cooperative Environments

Safe POMDP Online Planning among Dynamic Agents via Adaptive Conformal Prediction

EnCoMP: Enhanced Covert Maneuver Planning with Adaptive Threat-Aware Visibility Estimation using Offline Reinforcement Learning

Bridging the Gap between Discrete Agent Strategies in Game Theory and Continuous Motion Planning in Dynamic Environments

Probabilistic Visibility-Aware Trajectory Planning for Target Tracking in Cluttered Environments

PODDP: Partially Observable Differential Dynamic Programming for Latent Belief Space Planning

HARPS: An Online POMDP Framework for Human-Assisted Robotic Planning and Sensing

Planning for Attacker Entrapment in Adversarial Settings

ConceptAgent: LLM-Driven Precondition Grounding and Tree Search for Robust Task Planning and Execution

Real-World Deployment of a Hierarchical Uncertainty-Aware Collaborative Multiagent Planning System