How good are current pocket based 3D generative models? : The benchmark set and evaluation on protein pocket based 3D molecular generative models

Ting Ran, Haoyang Liu, Yifei Qin, Zhangming Niu, Mingyuan Xu, Jiaqiang Wu, Xianglu Xiao, Jinping Lei, Hongming Chen
DOI: https://doi.org/10.26434/chemrxiv-2024-2qgpb
2024-08-12
Abstract: The development of three-dimensional (3D) molecular generative models based on protein pockets has recently attracted a lot of attention. This type of model aims to generate the molecular graph and the 3D binding conformation simultaneously, under the constraint of the protein binding pocket. Various pocket-based generative models have been proposed; however, there is currently a lack of systematic and objective evaluation metrics for these models. To address this issue, a comprehensive benchmark dataset, named POKMOL-3D, is proposed to evaluate protein pocket-based 3D molecular generative models. It includes 32 protein targets together with their known active compounds as a test set, so that the versatility of generative models can be evaluated in a way that mimics the real-world scenario. Additionally, a series of 2D and 3D evaluation metrics was integrated to assess the quality of generated molecular structures and their binding conformations. It is expected that this work can enhance our comprehension of the effectiveness and weaknesses of current 3D generative models, and stimulate discussion of the challenges and useful guidance for developing the next wave of molecular generative models.
Chemistry
What problem does this paper attempt to address?
This paper focuses on assessing and comparing the quality of current 3D molecule generation models based on protein pockets. Specifically, the research team created a comprehensive benchmark dataset (named POKMOL-3D) to evaluate how well these models generate 3D molecules under protein pocket constraints. The core issues addressed by the paper include:

1. **Lack of unified evaluation standards**: Although various 3D molecule generation models based on protein pockets have been proposed, there is currently no systematic and objective evaluation system for assessing their performance.
2. **Model quality evaluation**: The paper proposes a series of model-level metrics, including sampling speed and target failure rate, to assess the overall performance of the models.
3. **Molecular structure quality**: Multiple 2D and 3D evaluation metrics are used to assess the quality of the generated molecular structures.
4. **Active molecule recovery capability**: The similarity between the generated molecules and known active molecules is evaluated to measure a model's ability to generate potentially active compounds.
5. **Target binding-related scoring**: The ability of the generated molecules to bind to specific protein pockets is assessed, both by in situ scoring without further pose optimization and by rescoring after redocking.

In summary, this paper aims to provide a systematic evaluation framework for 3D molecule generation models based on protein pockets by constructing a comprehensive benchmark dataset and a series of evaluation metrics, thereby promoting further development in this field.
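To make the model-level and recovery metrics more concrete, the following is a minimal sketch of how a target failure rate and an active molecule recovery rate might be computed. It is not the paper's implementation. It assumes that molecules are represented as sets of "on" fingerprint bits, that similarity is measured with the Tanimoto coefficient, that an active counts as recovered when at least one generated molecule reaches a hypothetical threshold of 0.6, and that a target counts as failed when the model produces no molecules for it. The function names and the threshold are illustrative only.

```python
from typing import Dict, Sequence, Set


def tanimoto(a: Set[int], b: Set[int]) -> float:
    """Tanimoto similarity between two sets of 'on' fingerprint bits."""
    if not a and not b:
        return 0.0  # convention chosen here for two empty fingerprints
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def recovery_rate(
    generated: Sequence[Set[int]],
    actives: Sequence[Set[int]],
    threshold: float = 0.6,  # hypothetical cutoff, not taken from the paper
) -> float:
    """Fraction of known actives matched by at least one generated molecule."""
    if not actives:
        return 0.0
    hits = sum(
        1
        for active in actives
        if any(tanimoto(gen, active) >= threshold for gen in generated)
    )
    return hits / len(actives)


def target_failure_rate(molecules_per_target: Dict[str, int]) -> float:
    """Fraction of targets for which no molecule was generated (assumed definition)."""
    if not molecules_per_target:
        return 0.0
    failed = sum(1 for count in molecules_per_target.values() if count == 0)
    return failed / len(molecules_per_target)
```

In practice the fingerprint bit sets would come from a cheminformatics toolkit, and the similarity measure and threshold would follow whatever definition the benchmark itself specifies.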