IDIL: Imitation Learning of Intent-Driven Expert Behavior

Sangwon Seo, Vaibhav Unhelkar
2024-04-26
Abstract: When faced with accomplishing a task, human experts exhibit intentional behavior. Their unique intents shape their plans and decisions, resulting in experts demonstrating diverse behaviors to accomplish the same task. Due to the uncertainties encountered in the real world and their bounded rationality, experts sometimes adjust their intents, which in turn influences their behaviors during task execution. This paper introduces IDIL, a novel imitation learning algorithm to mimic these diverse intent-driven behaviors of experts. Iteratively, our approach estimates expert intent from heterogeneous demonstrations and then uses it to learn an intent-aware model of their behavior. Unlike contemporary approaches, IDIL is capable of addressing sequential tasks with high-dimensional state representations, while sidestepping the complexities and drawbacks associated with adversarial training (a mainstay of related techniques). Our empirical results suggest that the models generated by IDIL either match or surpass those produced by recent imitation learning benchmarks in metrics of task performance. Moreover, as it creates a generative model, IDIL demonstrates superior performance in intent inference metrics, crucial for human-agent interactions, and aptly captures a broad spectrum of expert behaviors.
Machine Learning, Artificial Intelligence, Robotics
What problem does this paper attempt to address?
The core problem this paper addresses is how to model expert behavior in imitation learning when that behavior is driven by intents that can change over time. Traditional imitation learning methods usually assume that the expert's behavior depends only on the observable task state, ignoring the expert's underlying intent. As a result, such models struggle to explain and predict the diversity of human behavior.

Specifically, the paper proposes IDIL (Imitation Learning of Intent-Driven Expert Behavior), a new imitation learning algorithm that aims to imitate the intent-driven behaviors exhibited by different experts when performing tasks. IDIL iteratively estimates expert intents and uses them to learn an intent-aware behavior model. Compared with existing methods, IDIL has the following advantages:

1. **Handling high-dimensional state spaces**: IDIL can handle sequential tasks with high-dimensional state representations without relying on complex and potentially unstable training methods such as adversarial training.
2. **More accurate intent inference**: Because IDIL creates a generative model, it performs well on intent inference metrics, which are crucial for human-agent interaction.
3. **Capturing a wide range of behavior patterns**: IDIL can capture a broader range of expert behaviors, providing a more comprehensive generative model of expert behavior.

To achieve this, IDIL learns intent-driven expert behavior iteratively through the following steps:

- **E-step (Expectation step)**: Infer the expert's intents and use them to augment the existing demonstration data.
- **M1-step**: Update the agent policy based on the augmented data, so that the agent better imitates the expert's behavior.
- **M2-step**: Update the agent's intent dynamics model to reflect how intent changes over time.

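
The alternation between the E-step, M1-step, and M2-step can be illustrated with a small tabular sketch. This is not the paper's implementation: the summary does not say how each step is optimized, so the sketch assumes discrete states, actions, and intents, uses Viterbi decoding as a stand-in for intent inference, and uses smoothed counting (a simple behavior-cloning-style update) as a stand-in for both M-steps. It also assumes the next intent depends only on the previous intent. All names (`fit`, `e_step`, `m1_step`, `m2_step`, `policy`, `intent_dyn`) are hypothetical.

```python
import math
import random


def normalize(weights):
    total = sum(weights)
    return [w / total for w in weights]


def init_model(n_intents, n_states, n_actions, seed=0):
    rng = random.Random(seed)
    # policy[x][s] is a distribution over actions given intent x and state s.
    # A small random perturbation breaks the symmetry between intents.
    policy = [[normalize([rng.random() + 0.5 for _ in range(n_actions)])
               for _ in range(n_states)] for _ in range(n_intents)]
    # intent_dyn[x_prev] is a distribution over the next intent (assumed
    # here to depend only on the previous intent).
    intent_dyn = [normalize([1.0] * n_intents) for _ in range(n_intents)]
    return policy, intent_dyn


def e_step(demo, policy, intent_dyn):
    """Assumed E-step: Viterbi decoding of the most likely intent sequence."""
    n_intents = len(policy)
    s0, a0 = demo[0]
    # Uniform prior over the initial intent.
    score = [math.log(policy[x][s0][a0] / n_intents) for x in range(n_intents)]
    back = []
    for s, a in demo[1:]:
        new_score, pointers = [], []
        for x in range(n_intents):
            best_prev = max(range(n_intents),
                            key=lambda p: score[p] + math.log(intent_dyn[p][x]))
            pointers.append(best_prev)
            new_score.append(score[best_prev]
                             + math.log(intent_dyn[best_prev][x])
                             + math.log(policy[x][s][a]))
        score = new_score
        back.append(pointers)
    # Backtrack from the best final intent.
    x = max(range(n_intents), key=lambda i: score[i])
    intents = [x]
    for pointers in reversed(back):
        x = pointers[x]
        intents.append(x)
    return intents[::-1]


def m1_step(demos, intents, n_intents, n_states, n_actions, smooth=0.1):
    """Stand-in M1-step: refit the intent-aware policy by smoothed counts."""
    counts = [[[smooth] * n_actions for _ in range(n_states)]
              for _ in range(n_intents)]
    for demo, xs in zip(demos, intents):
        for (s, a), x in zip(demo, xs):
            counts[x][s][a] += 1.0
    return [[normalize(row) for row in per_intent] for per_intent in counts]


def m2_step(intents, n_intents, smooth=0.1):
    """Stand-in M2-step: refit the intent dynamics by smoothed counts."""
    counts = [[smooth] * n_intents for _ in range(n_intents)]
    for xs in intents:
        for x_prev, x in zip(xs, xs[1:]):
            counts[x_prev][x] += 1.0
    return [normalize(row) for row in counts]


def fit(demos, n_intents, n_states, n_actions, n_iters=5, seed=0):
    policy, intent_dyn = init_model(n_intents, n_states, n_actions, seed)
    intents = []
    for _ in range(n_iters):
        # E-step: augment each demonstration with inferred intents.
        intents = [e_step(d, policy, intent_dyn) for d in demos]
        # M1-step and M2-step: update policy and intent dynamics.
        policy = m1_step(demos, intents, n_intents, n_states, n_actions)
        intent_dyn = m2_step(intents, n_intents)
    return policy, intent_dyn, intents


# Two toy demonstrations of (state, action) pairs that differ only in
# which action the expert prefers in state 0.
demos = [[(0, 0)] * 6, [(0, 1)] * 6]
policy, intent_dyn, intents = fit(demos, n_intents=2, n_states=1, n_actions=2)
```

In this toy setting the two demonstrations end up labeled with different intents, and each intent's policy concentrates on the action its demonstration prefers. A high-dimensional task would replace the count tables with learned function approximators, which this sketch does not attempt.
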
In addition, the paper discusses the relationship between IDIL and related work, and evaluates IDIL experimentally. The results suggest that IDIL either matches or exceeds recent imitation learning benchmarks on task performance metrics, and that it also performs strongly on intent inference metrics. In conclusion, by introducing the IDIL algorithm, the paper addresses the limitations of traditional imitation learning methods in modeling intent-driven expert behavior, offering a way to understand and simulate human behavior more accurately.