Abstract: Reinforcement learning (RL) is a powerful and flexible paradigm for searching for solutions in high-dimensional action spaces. However, bridging the gap between playing computer games with thousands of simulated episodes and solving real scientific problems in complex environments (up to actual laboratory experiments) requires better sample efficiency to make the most of expensive information. Drug discovery is a major commercial application of RL, motivated by the vast size of chemical space and the need to perform multiparameter optimization (MPO) across different properties. In silico methods, such as virtual library screening (VS) and de novo molecular generation with RL, show great promise in accelerating this search. However, incorporating increasingly complex computational models into these workflows demands greater sample efficiency. Here, we introduce an active learning (AL) system linked with an RL model (RL-AL) for molecular design, which aims to improve the sample efficiency of the optimization process. We identify and characterize unique challenges in combining RL and AL, investigate the interplay between the two systems, and develop a novel AL approach to solve the MPO problem. Our approach greatly expedites the search for novel solutions relative to baseline RL for simple ligand- and structure-based oracle functions, with a 5- to 66-fold increase in hits generated for a fixed oracle budget and a 4- to 64-fold reduction in computational time to find a specific number of hits. Furthermore, compounds discovered through RL-AL display substantial enrichment of a multiparameter scoring objective, indicating superior efficacy in curating high-scoring compounds, without a reduction in output diversity.
This significant acceleration improves the feasibility of oracle functions that have largely been overlooked in RL due to high computational costs, such as free energy perturbation methods, and is in principle applicable to any RL domain.

Searching for High-Value Molecules Using Reinforcement Learning and Transformers

Evaluation of Reinforcement Learning in Transformer-based Molecular Design

Scalable Fragment-Based 3D Molecular Design with Reinforcement Learning

Sample Efficient Reinforcement Learning with Active Learning for Molecular Design

Enhancing Molecular Design through Graph-based Topological Reinforcement Learning

Molecule generation using transformers and policy gradient reinforcement learning

Molecular Design in Synthetically Accessible Chemical Space via Deep Reinforcement Learning

Learning To Navigate The Synthetically Accessible Chemical Space Using Reinforcement Learning

Improving Targeted Molecule Generation through Language Model Fine-Tuning Via Reinforcement Learning

Exploring Local Chemical Space in De Novo Molecular Generation Using Multi-Agent Deep Reinforcement Learning

Reinforcement Learning for Traversing Chemical Structure Space: Optimizing Transition States and Minimum Energy Paths of Molecules

FREED++: Improving RL Agents for Fragment-Based Molecule Generation by Thorough Reproduction

Optimization of binding affinities in chemical space with generative pre-trained transformer and deep reinforcement learning

Deep reinforcement learning in chemistry: A review

Reinforced Molecular Optimization with Neighborhood-Controlled Grammars

ChemGymRL: An Interactive Framework for Reinforcement Learning for Digital Chemistry

Hit and Lead Discovery with Explorative RL and Fragment-based Molecule Generation

ACEGEN: Reinforcement learning of generative chemical agents for drug discovery

De Novo Drug Design Using Transformer-Based Machine Translation and Reinforcement Learning of an Adaptive Monte Carlo Tree Search

De novo design of protein target specific scaffold-based Inhibitors via Reinforcement Learning