Differentially Private Deep Model-Based Reinforcement Learning

Alexandre Rio,Merwan Barlier,Igor Colin,Albert Thomas
2024-10-09
Abstract:We address private deep offline reinforcement learning (RL), where the goal is to train a policy on standard control tasks that is differentially private (DP) with respect to individual trajectories in the dataset. To achieve this, we introduce PriMORL, a model-based RL algorithm with formal differential privacy guarantees. PriMORL first learns an ensemble of trajectory-level DP models of the environment from offline data. It then optimizes a policy on the penalized private model, without any further interaction with the system or access to the dataset. In addition to offering strong theoretical foundations, we demonstrate empirically that PriMORL enables the training of private RL agents on offline continuous control tasks with deep function approximations, whereas current methods are limited to simpler tabular and linear Markov Decision Processes (MDPs). We furthermore outline the trade-offs involved in achieving privacy in this setting.
Machine Learning,Artificial Intelligence,Cryptography and Security
What problem does this paper attempt to address?