Abstract: First-person object-interaction tasks in high-fidelity, 3D, simulated environments such as the AI2Thor virtual home-environment pose significant sample-efficiency challenges for reinforcement learning (RL) agents learning from sparse task rewards. To alleviate these challenges, prior work has provided extensive supervision via a combination of reward-shaping, ground-truth object-information, and expert demonstrations. In this work, we show that one can learn object-interaction tasks from scratch without supervision by learning an attentive object-model as an auxiliary task during task learning with an object-centric relational RL agent. Our key insight is that learning an object-model that incorporates object-attention into forward prediction provides a dense learning signal for unsupervised representation learning of both objects and their relationships. This, in turn, enables faster policy learning for an object-centric relational RL agent. We demonstrate our agent by introducing a set of challenging object-interaction tasks in the AI2Thor environment where learning with our attentive object-model is key to strong performance. Specifically, we compare our agent and relational RL agents with alternative auxiliary tasks to a relational RL agent equipped with ground-truth object-information, and show that learning with our object-model best closes the performance gap in terms of both learning speed and maximum success rate. Additionally, we find that incorporating object-attention into an object-model's forward predictions is key to learning representations which capture object-category and object-state.

What problem does this paper attempt to address?

The paper addresses how an agent in a first-person, simulated 3D environment can learn to perform complex tasks involving multiple objects without ground-truth object information or expert demonstrations. When the agent receives feedback only from task-completion signals, learning object representations that support the object interactions required to complete the task is very challenging. The paper proposes to solve this by formulating the learning of an attentive object dynamics model as a classification problem, which enables rapid learning of object-interaction tasks.

The main contributions of the paper are:

1. A reinforcement-learning agent, LOAD, which shows how to learn sparse-reward object-interaction tasks using only first-person vision, without expert demonstrations, shaped rewards, or ground-truth object information.
2. A novel attentive object-model auxiliary task, which formulates the learning of an object model as a classification problem. Analysis shows that, for the high-fidelity 3D domain and the architecture used, it is crucial to learn an object representation that captures not only object categories but also object properties and object relationships.

The paper evaluates the proposed method on a series of challenging food-preparation tasks in the AI2Thor virtual home environment. The experimental results show that, among the methods compared, LOAD comes closest to the agent with ground-truth object information in terms of both learning speed and maximum success rate. In addition, a quantitative analysis of the object representations and the inter-object attention learned with each auxiliary task provides evidence that the attentive object-model best learns representations that match the ground-truth information.
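
The idea of treating object-model learning as a classification problem can be illustrated with a minimal sketch. This is not the paper's implementation. The function names, the additive `predict_next` stand-in for a learned network, the dot-product attention, and the choice of the other objects' next-step embeddings as the incorrect classes are all assumptions made for illustration. For each object, the sketch attends over the other objects, predicts a next-step embedding, and scores a cross-entropy loss for picking that object's true next-step embedding out of all candidates.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def attend(query, others):
    """Dot-product attention: weighted sum of the other objects' embeddings."""
    weights = softmax([dot(query, o) for o in others])
    return [sum(w * o[d] for w, o in zip(weights, others))
            for d in range(len(query))]


def predict_next(obj, context, action):
    """Hypothetical forward model: a plain sum stands in for a learned network."""
    return [o + c + a for o, c, a in zip(obj, context, action)]


def object_model_loss(objects_t, objects_t1, action):
    """Mean cross-entropy of classifying each object's true next-step
    embedding among all next-step embeddings (assumes >= 2 objects)."""
    losses = []
    for i, obj in enumerate(objects_t):
        others = [o for j, o in enumerate(objects_t) if j != i]
        context = attend(obj, others)              # object-attention step
        pred = predict_next(obj, context, action)  # forward prediction
        logits = [dot(pred, cand) for cand in objects_t1]
        probs = softmax(logits)
        losses.append(-math.log(probs[i]))         # index i is the correct class
    return sum(losses) / len(losses)


# Toy usage with made-up 2-D embeddings for three objects.
objects_t = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
objects_t1 = [[1.5, -0.5], [0.5, 0.5], [1.5, 0.5]]
action = [0.5, -0.5]
print(object_model_loss(objects_t, objects_t1, action))
```

In a trained agent the embeddings and the forward model would be learned, so this loss would yield a gradient at every time step regardless of task reward, which is the sense in which the abstract calls the signal dense.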

Reinforcement Learning for Sparse-Reward Object-Interaction Tasks in a First-person Simulated 3D Environment

Reinforcement Learning for Sparse-Reward Object-Interaction Tasks in First-person Simulated 3D Environments

Relay Hindsight Experience Replay: Self-guided continual reinforcement learning for sequential object manipulation tasks with sparse rewards

Safe Deep RL in 3D Environments using Human Feedback

Real-World Human-Robot Collaborative Reinforcement Learning

Part-Guided 3D RL for Sim2Real Articulated Object Manipulation

Learning Sparse Control Tasks from Pixels by Latent Nearest-Neighbor-Guided Explorations

Human-Level Reinforcement Learning through Theory-Based Modeling, Exploration, and Planning

ACTRCE: Augmenting Experience via Teacher's Advice For Multi-Goal Reinforcement Learning

Visual Reinforcement Learning with Self-Supervised 3D Representations

Dealing with Sparse Rewards in Reinforcement Learning

Task-Induced Representation Learning

ASHA: Assistive Teleoperation via Human-in-the-Loop Reinforcement Learning

Data-efficient Deep Reinforcement Learning Method Toward Scaling Continuous Robotic Task with Sparse Rewards

Deep Reinforcement Learning for 2D Physics-Based Object Manipulation in Clutter

Assessing Human Interaction in Virtual Reality With Continually Learning Prediction Agents Based on Reinforcement Learning Algorithms: A Pilot Study

Intrinsically Motivated Multi-Goal Reinforcement Learning Using Robotics Environment Integrated with OpenAI Gym

On the Efficacy of 3D Point Cloud Reinforcement Learning

Reconciling Reality through Simulation: A Real-to-Sim-to-Real Approach for Robust Manipulation

PEBBLE: Feedback-Efficient Interactive Reinforcement Learning via Relabeling Experience and Unsupervised Pre-training

Accelerated Robot Learning via Human Brain Signals