MolScore: A scoring and evaluation framework for de novo drug design

Morgan Thomas, Noel M. O'Boyle, Andreas Bender, Chris de Graaf
DOI: https://doi.org/10.26434/chemrxiv-2023-c4867-v2
2024-03-05
Abstract: MolScore is an open-source Python framework for scoring and evaluating molecules in the context of goal-directed generative models as used in de novo drug design. MolScore includes many scoring functions relevant to de novo drug design, such as molecular similarity, docking software, predictive models, and synthesizability, as well as commonly used performance metrics for evaluating generative models based on the chemistry they generate. Integration into an existing generative model framework is simple, requiring just three lines of code, and graphical user interfaces are provided for configuring objectives and for monitoring the de novo molecules generated. As a real-world demonstration, we use MolScore to design selective 5-HT2a ligands using 266 pre-trained off-target predictive models, as well as docking into two co-crystal structures. MolScore can also be used for generative model evaluation, as we demonstrate by analysing and selecting fine-tuning epochs of an RNN-based generative model. Moreover, the use of configuration files allows objectives to be shared within the community for the purposes of reproducibility, comparison, and benchmarking, making it easier to propose drug-discovery-relevant objective functions as benchmark tasks. The code is freely available and hosted on GitHub, https://github.com/MorganCThomas/MolScore.
Chemistry
What problem does this paper attempt to address?
The paper addresses how molecules are scored and how generative models are evaluated in de novo drug design. Specifically:

1. **Provide an open-source framework**: MolScore is an open-source Python framework for scoring and evaluating molecules produced by goal-directed generative models. It includes many relevant scoring functions, such as molecular similarity, docking software, predictive models, and synthesizability.
2. **Integrate multiple functionalities**: MolScore bundles commonly used performance metrics for evaluating generative models and can be integrated into an existing generative model framework with just three lines of code.
3. **User interface**: Graphical user interfaces (GUIs) are provided for configuring objectives and for monitoring the molecules generated, so users can set scoring criteria and inspect results conveniently.
4. **Practical application example**: The paper demonstrates MolScore by designing selective 5-HT2a ligands using 266 pre-trained off-target predictive models and docking into two co-crystal structures. It also uses MolScore to analyze and select fine-tuning epochs of a generative model based on a recurrent neural network (RNN).
5. **Standardization and reproducibility**: Because objectives are defined in configuration files, researchers can share them within the community, which supports reproducibility, comparison, and benchmarking.

Overall, MolScore targets the scoring and evaluation steps of generative models in drug design, providing a configurable tool that helps researchers optimize and evaluate generated molecules.
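
To make the "configuration file plus three lines of code" pattern concrete, the following is a minimal sketch of how a config-driven scorer could slot into a generative loop. It is not MolScore's actual API: the `ToyScorer` class, its `score` and `dump` methods, the config keys, and the two toy scoring functions are all hypothetical stand-ins chosen for illustration. They operate on raw SMILES strings so the example runs with the standard library alone, whereas a real objective would use functions such as similarity, docking, or predictive models.

```python
import json
from typing import Callable, Dict, List

# Toy scoring functions on SMILES strings (illustrative placeholders only,
# standing in for real objectives such as similarity or docking).
def aromatic_fraction(smiles: str) -> float:
    """Fraction of characters that are lowercase aromatic atom symbols."""
    if not smiles:
        return 0.0
    return sum(ch in "cnos" for ch in smiles) / len(smiles)

def short_string(smiles: str, max_len: int = 20) -> float:
    """1.0 if the SMILES string is non-empty and at most max_len characters."""
    return 1.0 if 0 < len(smiles) <= max_len else 0.0

# Hypothetical registry mapping config names to scoring functions.
REGISTRY: Dict[str, Callable[..., float]] = {
    "aromatic_fraction": aromatic_fraction,
    "short_string": short_string,
}

class ToyScorer:
    """Hypothetical config-driven scorer (an assumption, not the MolScore class)."""

    def __init__(self, model_name: str, task_config: str):
        self.model_name = model_name
        self.config = json.loads(task_config)  # shareable objective definition
        self.history: List[dict] = []

    def score(self, smiles_batch: List[str]) -> List[float]:
        """Return the weighted arithmetic mean of all configured scores."""
        terms = self.config["scoring_functions"]
        total_weight = sum(t["weight"] for t in terms)
        scores = []
        for smi in smiles_batch:
            weighted = sum(
                t["weight"] * REGISTRY[t["name"]](smi, **t.get("parameters", {}))
                for t in terms
            )
            value = weighted / total_weight
            scores.append(value)
            self.history.append({"smiles": smi, "score": value})
        return scores

    def dump(self) -> str:
        """Serialize everything scored so far for later evaluation."""
        return json.dumps(self.history)

# A shareable objective, expressed as JSON (the keys are assumptions).
CONFIG = json.dumps({
    "task": "toy_objective",
    "scoring_functions": [
        {"name": "aromatic_fraction", "weight": 1.0},
        {"name": "short_string", "weight": 1.0, "parameters": {"max_len": 20}},
    ],
})

# The three integration lines inside a generative model's training loop:
scorer = ToyScorer(model_name="demo", task_config=CONFIG)
scores = scorer.score(["c1ccccc1", "CCO"])  # batch sampled from the model
records = scorer.dump()
```

Keeping the objective in a JSON document rather than in code is what would let two groups run the same benchmark task: only the config needs to be exchanged, while the generative model calls the scorer in the same way regardless of which scoring functions the config selects.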