Abstract: Artificial intelligence (AI), and especially reinforcement learning (RL), has the potential to enable agents to learn and perform tasks autonomously with superhuman performance. However, we consider RL to be fundamentally a Human-in-the-Loop (HITL) paradigm, even when an agent eventually performs its task autonomously. In cases where the reward function is challenging or impossible to define, HITL approaches are considered particularly advantageous. The application of Reinforcement Learning from Human Feedback (RLHF) in systems such as ChatGPT demonstrates the effectiveness of optimizing for user experience and integrating user feedback into the training loop. In HITL RL, human input is integrated during the agent's learning process, allowing iterative updates and fine-tuning based on human feedback, thus enhancing the agent's performance. Since the human is an essential part of this process, we argue that human-centric approaches are the key to successful RL, a fact that has not been adequately considered in the existing literature. This paper aims to inform readers about current explainability methods in HITL RL. It also shows how the application of explainable AI (xAI) and specific improvements to existing explainability approaches can enable better human-agent interaction in HITL RL for all types of users, whether lay people, domain experts, or machine learning specialists. Accounting for the workflow in HITL RL and drawing on software and machine learning methodologies, this article identifies four phases of human involvement in creating HITL RL systems: (1) Agent Development, (2) Agent Learning, (3) Agent Evaluation, and (4) Agent Deployment. We highlight human involvement, explanation requirements, new challenges, and goals for each phase. We furthermore identify low-risk, high-return opportunities for explainability research in HITL RL and present long-term research goals to advance the field. Finally, we propose a vision of human-robot collaboration that allows both parties to reach their full potential and cooperate effectively.

Human-in-the-Loop Reinforcement Learning: A Survey and Position on Requirements, Challenges, and Opportunities
