Abstract: The success of deep reinforcement learning (DRL) hinges on the availability of training data, which is typically obtained via a large number of environment interactions. In many real-world scenarios, gathering these data carries costs and risks. The field of offline reinforcement learning addresses these issues by outsourcing the collection of data to a domain expert or a carefully monitored program and subsequently searching for a batch-constrained optimal policy. With the emergence of data markets, an alternative to constructing a dataset in-house is to purchase external data. However, while state-of-the-art offline reinforcement learning approaches have shown considerable promise, they currently rely on carefully constructed datasets that are well aligned with the intended target domains. This raises questions regarding the transferability and robustness of an offline reinforcement learning agent trained on externally acquired data. In this paper, we empirically evaluate the ability of current state-of-the-art offline reinforcement learning approaches to cope with source-target domain mismatch in two MuJoCo environments, finding that these algorithms underperform in the target domain. To address this, we propose data valuation for offline reinforcement learning (DVORL), which allows us to identify relevant and high-quality transitions, improving the performance and transferability of policies learned by offline reinforcement learning algorithms. The results show that our method outperforms offline reinforcement learning baselines on two MuJoCo environments.
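The idea of keeping only relevant, high-quality transitions can be illustrated with a minimal sketch. It is not the DVORL algorithm itself: it assumes each transition already has a value score in [0, 1] from some data value estimator, and the names `select_transitions`, `scores`, and `threshold` are hypothetical choices made for illustration only.

```python
from typing import List, Sequence, Tuple

# A transition is (state, action, reward, next_state).
Transition = Tuple[Sequence[float], Sequence[float], float, Sequence[float]]


def select_transitions(
    transitions: List[Transition],
    scores: Sequence[float],
    threshold: float = 0.5,
) -> List[Transition]:
    """Keep transitions whose (assumed, precomputed) value score meets the threshold.

    The scores are an assumption of this sketch; how they are estimated
    is outside its scope.
    """
    if len(transitions) != len(scores):
        raise ValueError("transitions and scores must have the same length")
    return [t for t, s in zip(transitions, scores) if s >= threshold]


# Toy source-domain dataset with three transitions and made-up scores.
dataset: List[Transition] = [
    ([0.0, 0.1], [1.0], 0.5, [0.1, 0.2]),
    ([0.3, 0.4], [0.0], -1.0, [0.9, 0.9]),
    ([0.2, 0.2], [0.5], 0.8, [0.3, 0.3]),
]
scores = [0.9, 0.2, 0.7]

# Only the retained transitions would be passed on to an offline RL algorithm.
kept = select_transitions(dataset, scores, threshold=0.5)
print(len(kept))
```

A hard threshold is the simplest selection rule; sampling transitions in proportion to their scores would be another option under the same assumptions.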

Data time travel and consistent market making: taming reinforcement learning in multi-agent systems with anonymous data

Deep Reinforcement Trading with Predictable Returns

Reinforcement Learning in High-frequency Market Making

Multi-agent reinforcement learning in a realistic limit order book market simulation

Reinforcement Learning in Agent-Based Market Simulation: Unveiling Realistic Stylized Facts and Behavior

Reinforcement Learning in Non-Markov Market-Making

Agent Performing Autonomous Stock Trading under Good and Bad Situations

A Novel Anti-Risk Method for Portfolio Trading Using Deep Reinforcement Learning

Data Valuation for Offline Reinforcement Learning

Stock market microstructure inference via multi-agent reinforcement learning

Efficient Online Reinforcement Learning with Offline Data

Extending Deep Reinforcement Learning Frameworks in Cryptocurrency Market Making

Deep Reinforcement Learning For Trading - A Critical Survey

Deep reinforcement learning on a multi-asset environment for trading

Learning to simulate realistic limit order book markets from data as a World Agent

Deep differentiable reinforcement learning and optimal trading

Deep Reinforcement Learning in Quantitative Algorithmic Trading: A Review

PerSim: Data-Efficient Offline Reinforcement Learning with Heterogeneous Agents via Personalized Simulators

Putting Data at the Centre of Offline Multi-Agent Reinforcement Learning

Performance of Deep Reinforcement Learning for High Frequency Market Making on Actual Tick Data

Reinforcement Learning for Market Making in a Multi-agent Dealer Market