Abstract: When robots enter everyday human environments, they need to understand their tasks and how they should perform those tasks. Reward functions, which specify a robot's objective, are used to encode this. However, designing reward functions can be extremely challenging for complex tasks and environments. A promising approach is to learn reward functions from humans. Several recent robot learning works have embraced this approach and leveraged human demonstrations to learn reward functions. Known as inverse reinforcement learning, this approach relies on a fundamental assumption: humans can provide near-optimal demonstrations to the robot. Unfortunately, this is rarely the case: human demonstrations are often suboptimal for various reasons, e.g., the difficulty of teleoperation, the robot's many degrees of freedom, or humans' cognitive limitations. This thesis takes a step toward learning reward functions from human users by using other, more reliable data modalities. Specifically, we study how reward functions can be learned using comparative feedback, in which the human user compares multiple robot trajectories instead of (or in addition to) providing demonstrations. To this end, we first propose various forms of comparative feedback, e.g., pairwise comparisons, best-of-many choices, rankings, and scaled comparisons, and describe how a robot can use these forms of human feedback to infer a reward function, which may be parametric or non-parametric. Next, we propose active learning techniques that enable the robot to ask for the comparative feedback that maximizes the expected information gained from the user's response. Finally, we demonstrate the applicability of our methods in a wide variety of domains, ranging from autonomous driving simulations to home robotics, and from standard reinforcement learning benchmarks to lower-body exoskeletons.
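
To make the idea of learning from pairwise comparisons with information-driven queries concrete, here is a minimal sketch. It is not the thesis's implementation. It assumes a linear reward over hand-picked trajectory features, a logistic (Bradley-Terry style) model of the human's choice, and a posterior over reward weights represented by a small set of weighted samples. All function names, the feature vectors, and the candidate queries are hypothetical and chosen only for illustration.

```python
import math

def preference_prob(w, phi_a, phi_b):
    """P(human prefers A over B | w) under an assumed logistic choice model."""
    diff = sum(wi * (a - b) for wi, a, b in zip(w, phi_a, phi_b))
    return 1.0 / (1.0 + math.exp(-diff))

def binary_entropy(p):
    """Entropy in bits of a Bernoulli(p) answer."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

def information_gain(samples, weights, phi_a, phi_b):
    """Mutual information between the human's answer and the reward weights."""
    probs = [preference_prob(w, phi_a, phi_b) for w in samples]
    mean_p = sum(wt * p for wt, p in zip(weights, probs))
    expected_entropy = sum(wt * binary_entropy(p) for wt, p in zip(weights, probs))
    return binary_entropy(mean_p) - expected_entropy

def select_query(samples, weights, queries):
    """Pick the candidate pair with the highest expected information gain."""
    return max(queries, key=lambda q: information_gain(samples, weights, q[0], q[1]))

def update_weights(samples, weights, phi_a, phi_b, a_preferred):
    """Bayesian reweighting of the sampled reward weights after one answer."""
    new = []
    for w, wt in zip(samples, weights):
        p = preference_prob(w, phi_a, phi_b)
        new.append(wt * (p if a_preferred else 1.0 - p))
    total = sum(new)
    return [x / total for x in new]

# Hypothetical prior: four candidate reward weight vectors, equally likely.
samples = [(1.0, 0.0), (0.0, 1.0), (-1.0, 0.0), (0.0, -1.0)]
weights = [0.25] * 4

# Hypothetical candidate queries: pairs of trajectory feature vectors.
queries = [
    ((1.0, 0.0), (0.0, 0.0)),  # trajectories differ in the first feature
    ((1.0, 1.0), (1.0, 1.0)),  # identical trajectories, so nothing to learn
]

best = select_query(samples, weights, queries)
# Suppose the human answers that the first trajectory of the chosen pair is better.
posterior = update_weights(samples, weights, best[0], best[1], a_preferred=True)
```

In this toy setup the query with identical trajectories has zero information gain, so the first pair is selected, and the answer shifts posterior mass toward weight vectors that reward the first feature. Other feedback forms mentioned in the abstract (best-of-many choices, rankings, scaled comparisons) would need a different likelihood in place of `preference_prob`.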

A Generalized Acquisition Function for Preference-based Reward Learning

Batch Active Learning of Reward Functions from Human Preferences

Generalizing Across Multi-Objective Reward Functions in Deep Reinforcement Learning

Generative Adversarial Reward Learning for Generalized Behavior Tendency Inference

General Preference Modeling with Preference Representations for Aligning Language Models

Pragmatic Feature Preferences: Learning Reward-Relevant Preferences from Human Input

Reinforcement Learning from Diverse Human Preferences

Hybrid Reinforcement Learning Based on Human Preference and Advice for Efficient Robot Skill Learning

Simultaneous Reward Distillation and Preference Learning: Get You a Language Model Who Can Do Both

Meta-Reward-Net: Implicitly Differentiable Reward Learning for Preference-based Reinforcement Learning

Learning Preferences for Interactive Autonomy

Learning a Universal Human Prior for Dexterous Manipulation from Human Preference

Relative Behavioral Attributes: Filling the Gap between Symbolic Goal Specification and Reward Learning from Human Preferences

On the Limited Generalization Capability of the Implicit Reward Model Induced by Direct Preference Optimization

Boosting Reward Model with Preference-Conditional Multi-Aspect Synthetic Data Generation

Adaptive Preference Scaling for Reinforcement Learning with Human Feedback

Human-Guided Robot Behavior Learning: A GAN-Assisted Preference-Based Reinforcement Learning Approach

Towards Comprehensive Preference Data Collection for Reward Modeling

Efficient Language-instructed Skill Acquisition via Reward-Policy Co-Evolution

Reward Uncertainty for Exploration in Preference-based Reinforcement Learning

Few-shot In-Context Preference Learning Using Large Language Models