Savvas Papaioannou, Panayiotis Kolios, Christos G. Panayiotou, Marios M. Polycarpou
Abstract: In the rapidly changing environments of disaster response, planning and decision-making for autonomous agents involve complex and interdependent choices. Although recent advancements have improved traditional artificial intelligence (AI) approaches, they often struggle in such settings, particularly when applied to agents operating outside their well-defined training parameters. To address these challenges, we propose an attention-based cognitive architecture inspired by Dual Process Theory (DPT). This framework integrates, in an online fashion, rapid yet heuristic (human-like) responses (System 1) with the slow but optimized planning capabilities of machine intelligence (System 2). We illustrate how a supervisory controller can dynamically determine in real-time the engagement of either system to optimize mission objectives by assessing their performance across a number of distinct attributes. Evaluated for trajectory planning in dynamic environments, our framework demonstrates that this synergistic integration effectively manages complex tasks by optimizing multiple mission objectives.
What problem does this paper attempt to address?
This paper addresses how autonomous agents (such as drones) can plan and make decisions in the dynamic, unpredictable environments of disaster response. Specifically, it targets the following challenges:
1. **Complex and interdependent choices**: In a rapidly changing disaster environment, planning and decision-making involve complex and interdependent choices.
2. **Operations beyond the scope of training parameters**: Traditional AI methods often struggle when applied to agents operating outside their well-defined training parameters.
To address these challenges, the authors propose an attention-based cognitive architecture inspired by Dual Process Theory (DPT). The framework integrates, online, fast but heuristic human-like responses (System 1) with the slow but optimized planning of machine intelligence (System 2). A supervisory controller assesses the performance of both systems across several attributes and decides in real time which one to engage in order to optimize the mission objectives.
### Specific problem description
The paper's problem statement focuses on guiding an Unmanned Aerial Vehicle (UAV) with shared autonomy to search for survivors in remote forest areas affected by wildfires. The drone must reach a predetermined target area while avoiding the fire lines along its path. The scenario assumes a Disaster Early Warning System (EWS) equipped with various sensors and data sources (such as weather stations and satellite images) that provides real-time alerts and predicts the spread of the fire lines. The team's task is therefore to design a drone trajectory that not only reaches the target area but also safely traverses the dynamic disaster environment.
The drone is designed to operate in two modes:
- **Semi-autonomous mode (System 1)**: It responds quickly but lacks optimization for task completion time, drone battery life, and energy consumption.
- **Autonomous mode (System 2)**: It uses an optimal controller to calculate control inputs to optimize task completion time or the drone's energy consumption, taking into account the drone's dynamics and fire line conditions.
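What "taking into account the drone's dynamics and fire line conditions" might look like in practice can be sketched in a few lines. The example below is only an illustration under stated assumptions, not the paper's controller: it assumes a planar double-integrator drone model, circular fire regions, and a sum-of-squared-accelerations energy proxy. The names `step`, `rollout`, `is_safe`, and `energy` are hypothetical.

```python
from typing import List, Tuple

State = Tuple[float, float, float, float]  # (px, py, vx, vy)
Control = Tuple[float, float]              # (ax, ay)
Circle = Tuple[float, float, float]        # (cx, cy, radius), assumed fire region shape


def step(x: State, u: Control, dt: float) -> State:
    """Advance an assumed double-integrator model by one time step."""
    px, py, vx, vy = x
    ax, ay = u
    return (
        px + vx * dt + 0.5 * ax * dt * dt,
        py + vy * dt + 0.5 * ay * dt * dt,
        vx + ax * dt,
        vy + ay * dt,
    )


def rollout(x0: State, controls: List[Control], dt: float) -> List[State]:
    """Predict the state sequence produced by a candidate control sequence."""
    trajectory = []
    x = x0
    for u in controls:
        x = step(x, u, dt)
        trajectory.append(x)
    return trajectory


def is_safe(trajectory: List[State], fires: List[Circle]) -> bool:
    """Return True if no predicted position lies inside any fire region."""
    for px, py, _, _ in trajectory:
        for cx, cy, r in fires:
            if (px - cx) ** 2 + (py - cy) ** 2 <= r * r:
                return False
    return True


def energy(controls: List[Control]) -> float:
    """Assumed energy proxy: sum of squared accelerations."""
    return sum(ax * ax + ay * ay for ax, ay in controls)


if __name__ == "__main__":
    plan = [(0.0, 0.0)] * 3
    traj = rollout((0.0, 0.0, 1.0, 0.0), plan, dt=1.0)
    print(is_safe(traj, [(2.0, 0.0, 0.5)]))  # fire on the path
    print(is_safe(traj, [(2.0, 5.0, 0.5)]))  # fire away from the path
```

An optimizing planner would search over control sequences for the cheapest one that passes `is_safe`; the heuristic mode would skip that search and accept a quick, possibly more costly, plan.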
### Problem formulation
To solve this problem, the paper proposes a cognitive controller, formulated as the following optimization problem:
\[
(P1) \quad \text{Cognitive Controller}
\]
\[
\min_{\{u_{t+\tau|t},\, x_{t+\tau|t},\, S_t\}_{\tau=1}^{T}} F_t(X_T, U_T, S_t),
\]
\[
\text{subject to: }
\begin{cases}
x_{t+\tau|t} = \Phi x_{t+\tau-1|t} + \Gamma u_{t+\tau|t}, & \tau \in \{1, \ldots, T\} \\
x_{t|t} = x_{t|t-1} \\
x^{p}_{t+\tau|t} \notin \Delta(C_i^{1:t}), & \forall i \\
\Psi_t = g(\Psi_{t-1}, A_t) \\
S_t = \arg\max_{i \in \{1,2\}} \Psi_t(i) \\
x_{t+\tau|t} \in X, \quad u_{t+\tau|t} \in U, \quad i \in \{1, \ldots, N_t\}
\end{cases}
\]
Here, $X_T=\{x_{t+\tau|t}\}_{\tau=1}^T$ and $U_T=\{u_{t+\tau|t}\}_{\tau=1}^T$ represent the state and control input sequences, respectively, and $S_t\in\{1,2\}$ represents the active system (i.e., System 1 or System 2) selected by the supervisory controller. The objective function $F_t(\cdot, S_t)$ depends on the active system $S_t$ and involves one of the following objectives:
- System response time
- Estimated task completion time
- Energy efficiency
The constraints enforce the drone's dynamics, the state and control input bounds, avoidance of the fire lines, and the system switching logic.
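The switching constraints $\Psi_t = g(\Psi_{t-1}, A_t)$ and $S_t=\arg\max_{i\in\{1,2\}}\Psi_t(i)$ can be illustrated with a minimal sketch. The summary does not specify the form of $g$ or of $A_t$, so the code below makes two assumptions: $A_t$ is taken to be one aggregated performance score per system, and $g$ is taken to be exponential smoothing with a hypothetical weight `alpha`. The function names are likewise illustrative.

```python
from typing import List


def update_attention(prev: List[float], scores: List[float], alpha: float = 0.5) -> List[float]:
    """Assumed form of g: blend the previous attention values with new scores."""
    return [(1.0 - alpha) * p + alpha * a for p, a in zip(prev, scores)]


def select_system(psi: List[float]) -> int:
    """S_t = argmax_i Psi_t(i), with systems numbered 1 and 2.

    On a tie, max() keeps the first index, so System 1 is chosen.
    """
    return max(range(len(psi)), key=lambda i: psi[i]) + 1


def run(scores_seq: List[List[float]], psi0: List[float], alpha: float = 0.5) -> List[int]:
    """Apply the update and selection at every time step; return the chosen systems."""
    psi = list(psi0)
    choices = []
    for scores in scores_seq:
        psi = update_attention(psi, scores, alpha)
        choices.append(select_system(psi))
    return choices


if __name__ == "__main__":
    # Step 1 favours System 2; step 2 strongly favours System 1.
    print(run([[0.2, 0.8], [1.0, 0.0]], psi0=[0.5, 0.5], alpha=0.5))
```

Because the attention values carry memory from step to step, a single noisy score does not necessarily flip the active system. How quickly the controller reacts depends on the assumed smoothing weight.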