Abstract: Reinforcement learning (RL) often encounters delayed and sparse feedback in real-world applications, sometimes with only episodic rewards. Previous approaches have made some progress in reward redistribution for credit assignment but still face challenges, including training difficulties caused by redundancy and ambiguous attributions that stem from overlooking the multifaceted nature of mission performance evaluation. Fortunately, large language models (LLMs) encompass rich decision-making knowledge and offer a plausible tool for reward redistribution. Even so, deploying an LLM in this setting is non-trivial because of the misalignment between linguistic knowledge and the required symbolic form, together with the inherent randomness and hallucinations of LLM inference. To tackle these issues, we introduce LaRe, a novel LLM-empowered, symbol-based decision-making framework, to improve credit assignment. Key to LaRe is the concept of the Latent Reward, which serves as a multi-dimensional performance evaluation, enabling more interpretable assessment of goal attainment from various perspectives and facilitating more effective reward redistribution. We show that code generated semantically by the LLM can bridge linguistic knowledge and symbolic latent rewards, because it is executable on symbolic objects. In addition, we design latent reward self-verification to increase the stability and reliability of LLM inference. Theoretically, eliminating reward-irrelevant redundancy in the latent reward improves RL performance through more accurate reward estimation. Extensive experiments show that LaRe (i) achieves temporal credit assignment superior to that of state-of-the-art (SOTA) methods, (ii) excels at allocating contributions among multiple agents, and (iii) outperforms policies trained with ground-truth rewards on certain tasks.
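The idea of redistributing an episodic return over latent rewards can be illustrated with a minimal sketch. It is not the implementation of LaRe: the function `latent_reward`, the toy state fields `pos` and `goal`, and the linear least-squares reward model are all assumptions made for illustration. In the framework described above, the latent reward function would be code generated by the LLM, and the form of the reward model is not specified in the abstract.

```python
import numpy as np

def latent_reward(state, action):
    """Hypothetical stand-in for LLM-generated code: maps a symbolic
    state-action pair to a small vector of performance factors."""
    distance_to_goal = abs(state["goal"] - state["pos"])
    energy_used = abs(action)
    return np.array([-distance_to_goal, -energy_used], dtype=float)

def fit_redistribution(episodes, returns):
    """Least-squares fit of weights w so that, for each episode,
    sum_t w . z_t approximates the episodic return (assumed linear model)."""
    # One row per episode: latent rewards summed over the trajectory.
    features = np.stack([
        sum(latent_reward(s, a) for s, a in episode) for episode in episodes
    ])
    targets = np.asarray(returns, dtype=float)
    w, *_ = np.linalg.lstsq(features, targets, rcond=None)
    return w

def redistribute(episode, w):
    """Per-step proxy rewards r_t = w . z_t for one trajectory."""
    return np.array([latent_reward(s, a) @ w for s, a in episode])

# Toy data: 8 episodes of 5 steps, observed only through episodic returns.
rng = np.random.default_rng(0)
true_w = np.array([1.0, 0.5])  # hidden weighting used to generate returns
episodes = [
    [({"pos": float(rng.integers(0, 10)), "goal": 10.0},
      float(rng.integers(-2, 3)))
     for _ in range(5)]
    for _ in range(8)
]
returns = [sum(latent_reward(s, a) @ true_w for s, a in ep) for ep in episodes]

w = fit_redistribution(episodes, returns)
proxy = redistribute(episodes[0], w)  # dense per-step rewards for episode 0
```

In this sketch the dense proxy rewards of each episode sum to its episodic return, so an RL algorithm could be trained on `proxy` in place of the single delayed signal.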

A Survey of Temporal Credit Assignment in Deep Reinforcement Learning

Towards Practical Credit Assignment for Deep Reinforcement Learning

Would I have gotten that reward? Long-term credit assignment by counterfactual contribution analysis

Selective Credit Assignment

An Information-Theoretic Perspective on Credit Assignment in Reinforcement Learning

Credit Assignment: Challenges and Opportunities in Developing Human-like AI Agents

Assessing the Zero-Shot Capabilities of LLMs for Action Evaluation in RL

Towards Causal Credit Assignment

Model-based Credit Assignment for Model-free Deep Reinforcement Learning

Deep Reinforcement Learning with Credit Assignment for Combinatorial Optimization

On Credit Assignment in Hierarchical Reinforcement Learning

Learning to assign credit in reinforcement learning by incorporating abstract relations

STCA: Spatio-Temporal Credit Assignment with Delayed Feedback in Deep Spiking Neural Networks

STAS: Spatial-Temporal Return Decomposition for Multi-agent Reinforcement Learning

STAS: Spatial-Temporal Return Decomposition for Solving Sparse Rewards Problems in Multi-agent Reinforcement Learning

Hindsight-DICE: Stable Credit Assignment for Deep Reinforcement Learning

Sequence Compression Speeds Up Credit Assignment in Reinforcement Learning

Reinforcement Learning in Credit Scoring and Underwriting

Multi-Level Credit Assignment for Cooperative Multi-Agent Reinforcement Learning

Latent Reward: LLM-Empowered Credit Assignment in Episodic Reinforcement Learning

Credit Assignment During Movement Reinforcement Learning