Practically significant method comparison protocols for machine learning in small molecule drug discovery.

Cas Wognum, Jeremy R. Ash, Raquel Rodríguez-Pérez, Matteo Aldeghi, Alan C. Cheng, Djork-Arné Clevert, Ola Engkvist, Cheng Fang, Daniel J. Price, Jacqueline M. Hughes-Oliver, W. Patrick Walters
DOI: https://doi.org/10.26434/chemrxiv-2024-6dbwv
2024-11-04
Abstract: Machine Learning (ML) methods that relate molecular structure to properties are frequently proposed as in-silico surrogates for expensive or time-consuming experiments. In small molecule drug discovery, such methods inform high-stakes decisions like compound synthesis and in-vivo studies. This application lies at the intersection of multiple scientific disciplines. When comparing new ML methods to baseline or state-of-the-art approaches, statistically rigorous method comparison protocols and domain-appropriate performance metrics are essential to ensure replicability and ultimately the adoption of ML in small molecule drug discovery. This paper proposes a set of guidelines to incentivize rigorous and domain-appropriate techniques for method comparison tailored to small molecule property modeling. These guidelines, accompanied by annotated examples and open-source software tools, lay a foundation for robust ML benchmarking and thus the development of more impactful methods.
Chemistry