PharmacoNet: Accelerating Large-Scale Virtual Screening by Deep Pharmacophore Modeling

Seonghwan Seo, Woo Youn Kim
2023-12-18
Abstract: As the size of accessible compound libraries expands to over 10 billion, the need for more efficient structure-based virtual screening methods is emerging. Different pre-screening methods have been developed for rapid screening, but there is still a lack of structure-based methods applicable to various proteins that perform protein-ligand binding conformation prediction and scoring in an extremely short time. Here, we describe for the first time a deep-learning framework for structure-based pharmacophore modeling to address this challenge. We frame pharmacophore modeling as an instance segmentation problem to determine each protein hotspot and the location of corresponding pharmacophores, and protein-ligand binding pose prediction as a graph-matching problem. PharmacoNet is significantly faster than state-of-the-art structure-based approaches, yet reasonably accurate with a simple scoring function. Furthermore, we show the promising result that PharmacoNet effectively retains hit candidates even under high pre-screening filtration rates. Overall, our study uncovers the hitherto untapped potential of a pharmacophore modeling approach in deep learning-based drug discovery.
Biomolecules, Machine Learning
What problem does this paper attempt to address?
The paper proposes PharmacoNet, a deep learning framework for accelerating large-scale virtual screening (VS). It targets four issues:

1. **Improving Virtual Screening Efficiency**: Accessible compound libraries have grown to more than 10 billion molecules, and traditional structure-based virtual screening methods are too slow to process libraries of that size. More efficient screening methods are therefore needed.
2. **Application of Structure-Based Methods**: Existing rapid pre-screening methods often do not generalize across different proteins, and they struggle to complete protein-ligand binding conformation prediction and scoring in a very short time. PharmacoNet uses deep learning to address this gap.
3. **Binding Conformation Prediction and Scoring**: Traditional molecular docking generates a large number of candidate binding poses from initial ligand conformations and scores them with complex physics-based energy functions, which is computationally expensive. PharmacoNet replaces this with a simpler, pharmacophore-based rapid evaluation.
4. **Maintaining Hit Candidates**: At high filtration rates, traditional pre-screening methods may discard potential hit candidates. PharmacoNet is reported to retain these candidates effectively even at high filtration rates.

To achieve these goals, PharmacoNet adopts the following strategies:

- **Pharmacophore Modeling**: Pharmacophore modeling is framed as an instance segmentation problem that determines each protein hotspot and the location of its corresponding pharmacophores. This helps in quickly identifying pharmacophore features related to protein function.
- **Binding Pose Prediction**: Protein-ligand binding pose prediction is framed as a graph-matching problem, which avoids complex atomic-level calculations.
- **Scoring Function**: A scoring function based on pharmacophore distance probability evaluates how well the ligand matches the pharmacophore model.

Experimental results show that PharmacoNet is significantly faster than traditional molecular docking methods while maintaining reasonable accuracy. PharmacoNet also retains hit candidates effectively in large-scale virtual screening, even at high filtration rates.
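
The graph-matching and distance-based scoring ideas summarized above can be illustrated with a small toy sketch. This is not PharmacoNet's implementation: the feature types, coordinates, Gaussian agreement term, and the names `MODEL`, `LIBRARY`, `match_score`, and `prescreen` are all assumptions made for illustration, and the brute-force search is only workable for a handful of features.

```python
from itertools import combinations, permutations
from math import dist, exp

# Hypothetical pharmacophore model: (expected ligand feature type, xyz in angstroms).
MODEL = [
    ("donor", (0.0, 0.0, 0.0)),
    ("acceptor", (3.0, 0.0, 0.0)),
    ("aromatic", (0.0, 4.0, 0.0)),
]

# Hypothetical ligand library: one conformer per ligand, as typed feature points.
LIBRARY = {
    "good": [("donor", (1.0, 1.0, 1.0)), ("acceptor", (4.0, 1.0, 1.0)),
             ("aromatic", (1.0, 5.0, 1.0))],
    "poor": [("donor", (0.0, 0.0, 0.0)), ("acceptor", (8.0, 0.0, 0.0))],
    "mismatch": [("hydrophobic", (0.0, 0.0, 0.0)), ("hydrophobic", (3.0, 0.0, 0.0))],
}


def pair_score(d_model, d_ligand, sigma=1.0):
    """Gaussian agreement between a model distance and a ligand distance (assumed form)."""
    return exp(-((d_model - d_ligand) ** 2) / (2 * sigma ** 2))


def match_score(model, ligand):
    """Best summed distance agreement over all type-compatible partial assignments."""
    best = 0.0
    max_size = min(len(model), len(ligand))
    for size in range(2, max_size + 1):
        for model_idx in combinations(range(len(model)), size):
            for lig_idx in permutations(range(len(ligand)), size):
                pairs = list(zip(model_idx, lig_idx))
                # Skip assignments that pair features of different types.
                if any(model[m][0] != ligand[l][0] for m, l in pairs):
                    continue
                # Compare every pairwise distance in the model with its ligand counterpart.
                score = sum(
                    pair_score(dist(model[m1][1], model[m2][1]),
                               dist(ligand[l1][1], ligand[l2][1]))
                    for (m1, l1), (m2, l2) in combinations(pairs, 2)
                )
                best = max(best, score)
    return best


def prescreen(model, library, keep_fraction):
    """Rank ligands by match score and keep only the top fraction of the library."""
    ranked = sorted(library, key=lambda name: match_score(model, library[name]),
                    reverse=True)
    n_keep = max(1, int(len(ranked) * keep_fraction))
    return ranked[:n_keep]


print(prescreen(MODEL, LIBRARY, keep_fraction=0.34))  # → ['good']
```

Because only inter-feature distances are compared, the score is unchanged when a ligand is translated or rotated, which is why the shifted `good` ligand matches the model perfectly. A practical system would replace the exhaustive search with an efficient matching algorithm and would consider multiple conformers per ligand.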