Abstract: The aim of Reinforcement Learning (RL) in real-world applications is to create systems capable of making autonomous decisions by learning from their environment through trial and error. This paper emphasizes the importance of reward engineering and reward shaping in enhancing the efficiency and effectiveness of RL algorithms. Reward engineering involves designing reward functions that accurately reflect the desired outcomes, while reward shaping provides additional feedback to guide the learning process, accelerating convergence to optimal policies. Despite significant advancements in RL, several limitations persist. One key challenge is the sparse and delayed nature of rewards in many real-world scenarios, which can hinder learning progress. Additionally, the difficulty of accurately modeling real-world environments and the computational demands of RL algorithms remain substantial obstacles. At the same time, recent advancements in deep learning and neural networks have significantly improved the capability of RL systems to handle high-dimensional state and action spaces, enabling their application to complex tasks such as robotics, autonomous driving, and game playing. This paper provides a comprehensive review of the current state of RL, focusing on the methodologies and techniques used in reward engineering and reward shaping. It critically analyzes the limitations and recent advancements in the field, offering insights into future research directions and potential applications in various domains.

Reward Shaping via Diffusion Process in Reinforcement Learning

Diffusion Spectral Representation for Reinforcement Learning

Learning to Optimally Stop a Diffusion Process

Towards Controllable Diffusion Models via Reward-Guided Exploration

Multimodal Reward Shaping for Efficient Exploration in Reinforcement Learning

Extracting Reward Functions from Diffusion Models

Exploration by Maximizing Renyi Entropy for Reward-Free RL Framework

Diffusion Reward: Learning Rewards via Conditional Video Diffusion

Maximum Entropy Inverse Reinforcement Learning of Diffusion Models with Energy-Based Models

Policy Representation via Diffusion Probability Model for Reinforcement Learning

MaxInfoRL: Boosting exploration in reinforcement learning through information gain maximization

Predictable Reinforcement Learning Dynamics through Entropy Rate Minimization

Reward Uncertainty for Exploration in Preference-based Reinforcement Learning

A reinforcement learning diffusion decision model for value-based decisions

Comprehensive Overview of Reward Engineering and Shaping in Advancing Reinforcement Learning Applications

Reinforcement Learning for Jump-Diffusions, with Financial Applications

Intrinsic Rewards for Exploration Without Harm From Observational Noise: A Simulation Study Based on the Free Energy Principle

Principled Reward Shaping for Reinforcement Learning Via Lyapunov Stability Theory

Exploration Entropy for Reinforcement Learning

Off-Policy Maximum Entropy RL with Future State and Action Visitation Measures

Reasoning with Latent Diffusion in Offline Reinforcement Learning