Zixuan Wu, Sean Ye, Manisha Natarajan, Matthew C. Gombolay
Abstract: Reinforcement Learning- (RL-)based motion planning has recently shown the potential to outperform traditional approaches from autonomous navigation to robot manipulation. In this work, we focus on a motion planning task for an evasive target in a partially observable multi-agent adversarial pursuit-evasion game (PEG). These pursuit-evasion problems are relevant to various applications, such as search and rescue operations and surveillance robots, where robots must effectively plan their actions to gather intelligence or accomplish mission tasks while avoiding detection or capture themselves. We propose a hierarchical architecture that integrates a high-level diffusion model to plan global paths responsive to environment data while a low-level RL algorithm reasons about evasive versus global path-following behavior. Our approach outperforms baselines by 51.2% by leveraging the diffusion model to guide the RL algorithm for more efficient exploration and improves explainability and predictability.
### What problem does this paper attempt to solve?
This paper asks how to design an effective motion-planning algorithm for partially observable, multi-agent adversarial pursuit-evasion games (PEG), so that the evader can avoid the pursuers and reach its goal. Specifically:
1. **Research Background**:
- In real-world applications such as search-and-rescue operations and surveillance robotics, robots must plan their actions effectively to collect intelligence or complete tasks while avoiding detection or capture.
- Existing methods perform poorly in adversarial environments, especially large-scale, partially observable ones.
2. **Problem Description**:
- **Partially Observable Environment**: The evader cannot fully observe all the information in the environment, such as camera positions and visibility maps.
- **High Sample Complexity**: Traditional reinforcement learning (RL) methods suffer from high sample complexity when exploring large-scale, partially observable environments.
- **Lack of Expert Knowledge**: Many existing methods rely on expert knowledge or predefined behavior patterns, which limits their use in real-world scenarios.
3. **Solution**:
- The paper proposes a hierarchical architecture that combines a diffusion model with reinforcement learning (RL) for efficient motion planning in partially observable, multi-agent adversarial environments.
- **High-level Diffusion Model**: Generates global paths that satisfy static map constraints (such as terminal states and obstacle avoidance).
- **Low-level RL Algorithm**: Learns local evasive behaviors, so that the evader can respond flexibly to the pursuers' actions while following the global path.
4. **Main Contributions**:
- Proposes a learning framework that requires no expert domain knowledge and is suitable for large-scale, partially observable environments.
- Designs a novel hierarchical system that uses a diffusion model as the global path planner and RL as the local evasion strategy; the abstract reports that it outperforms the baselines by 51.2%.
- Introduces a task-oriented cost-map construction method that improves interpretability and predictability and makes the evader's behavior easier to understand.
With this method, the paper addresses the problem of planning the evader's path effectively in complex, highly adversarial, and partially observable environments, and it shows the approach's potential for practical applications.
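The division of labor between the two levels can be illustrated with a toy control loop. The following is a minimal sketch under stated assumptions, not the paper's implementation: `plan_global_path` stands in for the diffusion planner by returning straight-line waypoints with pinned endpoints, and `low_level_action` stands in for the learned RL policy with a hand-written rule that switches between path following and evasion. All function names, the `danger_radius` threshold, and the static pursuers are hypothetical and chosen only for illustration.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def plan_global_path(start: Point, goal: Point, n_waypoints: int = 5) -> List[Point]:
    """Placeholder for the high-level planner (assumption: straight line).

    The first and last waypoints are pinned to start and goal, loosely
    mimicking a terminal-state constraint. Requires n_waypoints >= 2.
    """
    return [
        (
            start[0] + (goal[0] - start[0]) * i / (n_waypoints - 1),
            start[1] + (goal[1] - start[1]) * i / (n_waypoints - 1),
        )
        for i in range(n_waypoints)
    ]


def unit(dx: float, dy: float) -> Point:
    """Normalize a 2D vector; the zero vector maps to (0.0, 0.0)."""
    norm = math.hypot(dx, dy)
    return (0.0, 0.0) if norm == 0 else (dx / norm, dy / norm)


def low_level_action(
    pos: Point,
    waypoint: Point,
    pursuers: Sequence[Point],
    danger_radius: float = 2.0,
) -> Point:
    """Placeholder for the low-level policy (assumption: fixed rule, not RL).

    Move away from the nearest pursuer if it is inside danger_radius,
    otherwise head toward the current global waypoint.
    """
    if pursuers:
        nearest = min(pursuers, key=lambda p: math.dist(pos, p))
        if math.dist(pos, nearest) < danger_radius:
            return unit(pos[0] - nearest[0], pos[1] - nearest[1])  # evade
    return unit(waypoint[0] - pos[0], waypoint[1] - pos[1])  # follow path


def rollout(
    start: Point,
    goal: Point,
    pursuers: Sequence[Point],
    max_steps: int = 200,
    step: float = 0.5,
    reach_tol: float = 0.5,
) -> List[Point]:
    """Plan once at the high level, then act step by step at the low level."""
    path = plan_global_path(start, goal)
    pos, idx = start, 1
    trajectory = [pos]
    for _ in range(max_steps):
        if math.dist(pos, path[idx]) < reach_tol:
            if idx == len(path) - 1:
                break  # final waypoint (the goal) reached
            idx += 1  # advance to the next global waypoint
        dx, dy = low_level_action(pos, path[idx], pursuers)
        pos = (pos[0] + step * dx, pos[1] + step * dy)
        trajectory.append(pos)
    return trajectory
```

Calling `rollout((0.0, 0.0), (10.0, 0.0), pursuers=[])` walks the evader along the waypoints to the goal. In the paper's setting the rule inside `low_level_action` would be replaced by a trained policy and the pursuers would move; the sketch only shows where each level plugs into the loop.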