Jorge de Heuvel, Nathan Corral, Benedikt Kreis, Jacobus Conradi, Anne Driemel, Maren Bennewitz
Abstract: For the best human-robot interaction experience, the robot's navigation policy should take into account personal preferences of the user. In this paper, we present a learning framework complemented by a perception pipeline to train a depth vision-based, personalized navigation controller from user demonstrations. Our virtual reality interface enables the demonstration of robot navigation trajectories under motion of the user for dynamic interaction scenarios. The novel perception pipeline enrolls a variational autoencoder in combination with a motion predictor. It compresses the perceived depth images to a latent state representation to enable efficient reasoning of the learning agent about the robot's dynamic environment. In a detailed analysis and ablation study, we evaluate different configurations of the perception pipeline. To further quantify the navigation controller's quality of personalization, we develop and apply a novel metric to measure preference reflection based on the Fréchet Distance. We discuss the robot's navigation performance in various virtual scenes and demonstrate the first personalized robot navigation controller that solely relies on depth images. A supplemental video highlighting our approach is available online.
What problem does this paper attempt to address?
The paper addresses how a robot can navigate according to a user's personal preferences while the user is moving. Specifically, the authors propose a depth vision-based learning framework that trains personalized navigation controllers from user demonstrations recorded through a virtual reality (VR) interface. The problems and the proposed solutions are as follows:
### Problem Description
1. **Personalized Navigation Needs**: To achieve a natural and satisfactory human-robot interaction experience, the robot's navigation strategy should consider the user's personal preferences. Existing basic obstacle avoidance methods cannot meet individual preferences for proximity, trajectory shape, or navigation areas.
2. **Challenges in Dynamic Scenes**: In dynamic environments, users may move, so a method is needed to perceive and understand the user's movements in real-time, allowing the robot to navigate according to the user's preferences.
### Solutions
1. **Depth Vision-Based Perception Pipeline**: The authors introduce a lightweight, human-centered perception pipeline that combines a variational autoencoder (VAE) with a motion predictor. It compresses depth images into a low-dimensional latent representation, helping the learning agent reason efficiently about the robot's dynamic environment.
2. **Virtual Reality (VR) Demonstration Framework**: Through the VR interface, users can intuitively draw navigation trajectories, which are used to train personalized navigation controllers.
3. **Combination of Reinforcement Learning and Behavior Cloning**: The controller is trained with the Twin Delayed Deep Deterministic Policy Gradient (TD3) reinforcement learning architecture, augmented with a behavior cloning loss so that the learned policy reproduces the behavior patterns demonstrated by the user.
4. **Novel Preference Reflection Metric**: A new metric based on Fréchet distance is proposed and applied to quantify the personalization quality of the navigation controller.
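To make the idea behind the preference reflection metric more concrete, the following is a minimal sketch of the standard discrete Fréchet distance between two 2D trajectories. It is an illustration only: the summary does not specify how the paper builds its metric on top of the Fréchet distance, so the function name `discrete_frechet` and the choice of the discrete variant with Euclidean point distances are assumptions.

```python
import math


def discrete_frechet(p, q):
    """Discrete Fréchet distance between two polylines.

    p and q are non-empty lists of (x, y) points, e.g. a demonstrated
    trajectory and a trajectory driven by the controller (illustrative use).
    """
    if not p or not q:
        raise ValueError("both trajectories must be non-empty")

    n, m = len(p), len(q)
    # ca[i][j] holds the coupling distance for the prefixes p[:i+1], q[:j+1].
    ca = [[0.0] * m for _ in range(n)]

    for i in range(n):
        for j in range(m):
            d = math.dist(p[i], q[j])  # Euclidean distance between points
            if i == 0 and j == 0:
                ca[i][j] = d
            elif i == 0:
                ca[i][j] = max(ca[0][j - 1], d)
            elif j == 0:
                ca[i][j] = max(ca[i - 1][0], d)
            else:
                # Advance along p, along q, or along both; keep the cheapest.
                best_prev = min(ca[i - 1][j], ca[i][j - 1], ca[i - 1][j - 1])
                ca[i][j] = max(best_prev, d)

    return ca[n - 1][m - 1]
```

A smaller value means the two trajectories stay closer to each other along their whole course, which is why a Fréchet-type distance is a natural basis for comparing a controller's path with a user's demonstration.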
### Main Contributions
1. **Personalized Navigation Controller Relying Solely on Depth Vision**: The authors demonstrate the first personalized navigation controller that uses only depth images as perception input.
2. **Personalized Navigation in Dynamic Human-Robot Scenarios**: Recorded navigation preferences through the VR interface, applicable to scenarios where both robots and humans are moving.
3. **Novel Preference Reflection Metric**: Introduced a new metric to quantify the quality of navigation preference reflection.
4. **Detailed Analysis of Different Perception Configurations**: Conducted qualitative and quantitative analyses of different perception pipeline configurations, evaluating their performance in personalized navigation.
### Experimental Validation
1. **Perception Pipeline Configurations**: Evaluated the performance of different perception pipeline configurations: the standard human-aware configuration (VAE-HA), a human-unaware configuration (VAE-HU), a configuration trained without demonstration data (VAE-ND), and a configuration with human position prediction (LSTM-HP).
2. **Qualitative Analysis**: Showcased the navigation behaviors of different configurations in selected scenarios, analyzing their strengths and weaknesses.
3. **Quantitative Analysis**: Assessed the performance of different controllers in various scenarios and human behavior patterns using metrics such as success rate, collision rate, and timeout rate.
4. **Preference Reflection Metric**: Quantified the personalization quality of different controllers using the proposed Fréchet distance metric.
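The outcome metrics named above (success, collision, and timeout rate) can be computed from per-episode results along the following lines. This is a minimal sketch under stated assumptions: the outcome labels, the `OUTCOMES` tuple, and the function `outcome_rates` are hypothetical names chosen for illustration, and the sample data in the usage note is made up, not taken from the paper.

```python
from collections import Counter

# Hypothetical outcome labels; each evaluation episode ends in exactly one.
OUTCOMES = ("success", "collision", "timeout")


def outcome_rates(episodes):
    """Return the fraction of episodes that ended in each outcome."""
    if not episodes:
        raise ValueError("at least one episode is required")

    counts = Counter(episodes)
    unknown = set(counts) - set(OUTCOMES)
    if unknown:
        raise ValueError(f"unknown outcome labels: {sorted(unknown)}")

    total = len(episodes)
    # Outcomes that never occurred are reported with a rate of 0.0.
    return {name: counts[name] / total for name in OUTCOMES}
```

For example, `outcome_rates(["success", "success", "collision", "timeout"])` yields a success rate of 0.5 and collision and timeout rates of 0.25 each. Because every episode has exactly one outcome, the three rates always sum to one.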
In summary, the paper combines depth vision, virtual reality demonstrations, and reinforcement learning into a personalized robot navigation method that follows user preferences in dynamic scenes.