Retrieval Augmented Docking using Hierarchical Navigable Small Worlds

Michael Keiser, Brendan Hall
DOI: https://doi.org/10.26434/chemrxiv-2024-qsdd1
2024-04-24
Abstract: Make-on-demand chemical libraries have drastically increased the reach of molecular docking, with the enumerated ready-to-dock ZINC library approaching 5 billion molecules. While ever-growing libraries result in better-scoring molecules, the computational resources required to dock all of ZINC make this endeavor infeasible for most. Here, we organize and traverse chemical space with hierarchical navigable small world graphs, a method we term retrieval augmented docking (RAD). RAD recovers most virtual actives despite docking only a fraction of the library. Furthermore, RAD is protein-agnostic, supporting screens against many targets without additional computational overhead. In depth, we assess RAD on published large-scale docking campaigns against D4 and AmpC spanning 99.5 million and 138 million molecules, respectively. RAD recovers 95% of DOCK virtual actives for both targets after evaluating only 10% of the libraries. In breadth, RAD shows widespread applicability against 43 DUDE-Z proteins, evaluating 50.3 million associations. On average, RAD recovers 87% of virtual actives while docking 10% of the library without sacrificing chemical diversity.
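The idea of traversing a neighbor graph while scoring only part of a library can be illustrated with a minimal Python sketch. This is not the authors' implementation: it uses a single flat k-nearest-neighbor graph as a simplified stand-in for a hierarchical navigable small world graph, one-dimensional random numbers in place of molecular descriptors, and a hypothetical `toy_score` function in place of a docking program (lower is better, as with docking scores). All function names and parameters here are assumptions made for illustration.

```python
import heapq
import random


def toy_score(x):
    # Hypothetical stand-in for a docking score: lower is better.
    return (x - 0.3) ** 2


def make_toy_library(n, k, seed=0):
    # Random 1-D "descriptors" plus a brute-force k-nearest-neighbor graph.
    # A real HNSW index is multi-layered and built approximately; this is flat.
    rng = random.Random(seed)
    points = [rng.random() for _ in range(n)]
    graph = {}
    for i, p in enumerate(points):
        nearest = sorted((abs(p - q), j) for j, q in enumerate(points) if j != i)
        graph[i] = [j for _, j in nearest[:k]]
    return points, graph


def score_guided_traversal(graph, score_fn, seeds, budget):
    # Best-first traversal: always expand the best-scoring node seen so far,
    # scoring its unscored neighbors, until the scoring budget is spent.
    scored = {}
    frontier = []
    for s in seeds:
        if s not in scored and len(scored) < budget:
            scored[s] = score_fn(s)
            heapq.heappush(frontier, (scored[s], s))
    while frontier and len(scored) < budget:
        _, node = heapq.heappop(frontier)
        for nb in graph[node]:
            if len(scored) >= budget:
                break
            if nb not in scored:
                scored[nb] = score_fn(nb)
                heapq.heappush(frontier, (scored[nb], nb))
    return scored


def recall_of_top(points, scored, top_n):
    # Fraction of the true top_n best-scoring items that the traversal scored.
    ranked = sorted(range(len(points)), key=lambda i: toy_score(points[i]))
    best = set(ranked[:top_n])
    return len(best & set(scored)) / top_n
```

Calling `make_toy_library(300, 8)`, then `score_guided_traversal` with a budget of 30 (10% of the toy library), and finally `recall_of_top` gives a rough feel for how recovery of top-scoring items relates to the fraction scored. The numbers from this toy say nothing about the recovery rates reported in the abstract.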
Chemistry