Improving Targeted Molecule Generation through Language Model Fine-Tuning Via Reinforcement Learning

Salma J. Ahmed, Mustafa A. Elattar
2024-05-11
Abstract: Developing new drugs is laborious and costly, demanding extensive time investment. In this study, we introduce an innovative de-novo drug design strategy, which harnesses the capabilities of language models to devise targeted drugs for specific proteins. Employing a Reinforcement Learning (RL) framework utilizing Proximal Policy Optimization (PPO), we refine the model to acquire a policy for generating drugs tailored to protein targets. Our method integrates a composite reward function, combining considerations of drug-target interaction and molecular validity. Following RL fine-tuning, our approach demonstrates promising outcomes, yielding notable improvements in molecular validity, interaction efficacy, and critical chemical properties, achieving 65.37 for Quantitative Estimation of Drug-likeness (QED), 321.55 for Molecular Weight (MW), and 4.47 for Octanol-Water Partition Coefficient (logP), respectively. Furthermore, out of the generated drugs, only 0.041% do not exhibit novelty.
Biomolecules, Machine Learning
What problem does this paper attempt to address?
This paper addresses how to improve targeted molecule generation in drug design by fine-tuning a language model with reinforcement learning (RL). The authors propose a de-novo drug design strategy in which a language model generates drug molecules for specific protein targets. They fine-tune the model with the Proximal Policy Optimization (PPO) algorithm, using a composite reward function that accounts for both drug-target interaction and molecular validity. After RL fine-tuning, the model generates molecules tailored to a given protein with improved molecular validity, interaction efficacy, and key chemical properties. The reported results are 65.37 for drug-likeness (QED), 321.55 for molecular weight (MW), and 4.47 for the octanol-water partition coefficient (logP). Only 0.041% of the generated drugs are not novel, so about 99.96% are novel. The paper also reviews related work such as ReLeaSE and Reinvent, and discusses how ensemble learning and reward mechanisms can improve targeted drug generation. The experimental section describes the full pipeline, from dataset selection and model construction through RL fine-tuning and reward function optimization, and demonstrates improvements in chemical properties and molecular novelty. In summary, the paper aims to improve the efficiency and success rate of new drug design with deep learning, and RL in particular, in order to reduce the time and cost of traditional drug discovery.
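To make the composite reward and the novelty figure concrete, here is a minimal sketch in Python. It is not the authors' implementation: the weighting scheme, the penalty for invalid molecules, and the names `composite_reward`, `novelty_rate`, `toy_is_valid`, and `toy_interaction` are all assumptions made for illustration. In a real pipeline the validity check would typically come from a cheminformatics toolkit and the interaction score from a trained drug-target interaction predictor; the toy stand-ins below only keep the example self-contained.

```python
from typing import Callable, Iterable, Set


def composite_reward(
    smiles: str,
    is_valid: Callable[[str], bool],
    interaction_score: Callable[[str], float],
    validity_weight: float = 0.5,   # assumed weight, not taken from the paper
    invalid_penalty: float = -1.0,  # assumed penalty, not taken from the paper
) -> float:
    """Combine molecular validity and predicted interaction into one scalar."""
    if not is_valid(smiles):
        # An invalid molecule cannot be scored for interaction at all.
        return invalid_penalty
    # Valid molecules earn the validity term plus a weighted interaction term.
    return validity_weight + (1.0 - validity_weight) * interaction_score(smiles)


def novelty_rate(generated: Iterable[str], training_set: Set[str]) -> float:
    """Fraction of distinct generated molecules absent from the training set."""
    unique = set(generated)
    if not unique:
        return 0.0
    return sum(1 for s in unique if s not in training_set) / len(unique)


def toy_is_valid(smiles: str) -> bool:
    """Toy stand-in: non-empty string with balanced parentheses."""
    depth = 0
    for ch in smiles:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return False
    return bool(smiles) and depth == 0


def toy_interaction(smiles: str) -> float:
    """Toy stand-in: a fixed score in [0, 1] instead of a learned predictor."""
    return 0.8


if __name__ == "__main__":
    for candidate in ["CCO", "C(C"]:
        print(candidate, composite_reward(candidate, toy_is_valid, toy_interaction))
    print("novelty:", novelty_rate(["CCO", "CCN", "CCO"], {"CCN"}))
```

During PPO fine-tuning, a scalar of this kind would be assigned to each generated sequence as its reward. The early return for invalid molecules reflects one possible design choice: the policy is steered toward syntactically valid outputs before interaction quality can contribute.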