Multi-Conformation Docking
Tags: Uni-Dock, molecular docking, docking
zhengh@dp.tech
Published: 2023-12-05
Recommended image: unidock-tools:v1.1.0_ipykernel
Recommended machine: c12_m92_1 * NVIDIA V100
Uni-Dock-TestData(v6)

Multi-Conformation Docking: Enhanced Conformational Sampling and Multi-Stage Docking for Improved Docking Accuracy


Author: Hang Zheng
Create Time: 2023-12-06
License: CC BY-NC-SA 4.0


Introduction

Workflow

Traditional molecular docking methods typically involve positioning the ligand molecule within the binding pocket of the target protein and then sampling its rotations, translations, and torsions of rotatable bonds to identify the optimal binding pose. However, due to computational resource constraints and the vast search space resulting from 3D continuity, these methods often assess only a subset of possible conformational combinations. Consequently, this can lead to suboptimal docking results in some protein-ligand complex systems.

Multi-Conformation Docking (mcdock) addresses this limitation by moving the conformational search forward into the molecule preparation phase: ligand conformations are enumerated explicitly before docking, so the search space is covered more thoroughly, with the aim of making docking results more reliable. It consists of three steps: an optional Conformation Generation step, Rigid Docking, and Local Refinement.

Advantages of MultiConfDock

  1. Comprehensive Conformation Generation: confgen can rapidly generate a diverse array of low-energy conformations, ensuring that the search space encompasses as many ligand conformations as possible, increasing the likelihood of identifying suitable binding poses.

  2. Efficient Rigid Docking: The method is capable of swiftly evaluating a vast number of ligand conformations for their relative positions and orientations within the protein's binding pocket, ensuring a thorough coverage of the search space for each ligand conformation.

  3. Refined Local Refinement: MultiConfDock allows for minor local movements of the ligand to fine-tune the docking pose, ensuring that each binding pose is at a locally optimized structure.

Together, these steps aim to keep the workflow efficient while improving the chance of predicting the correct protein-ligand binding pose.


Step-by-Step Implementation of Multi-Conformation Docking

First, let's check the performance of Uni-Dock on the redocking task for 6X8D from the PoseBusters benchmark.

[Three hidden cells. Output of the last cell:]
PDB Code: 6X8D
Min RMSD (Top 1): 3.329471638113171
Min RMSD (Top 3): 3.329471638113171
Min RMSD (Top 5): 3.329471638113171
Min RMSD (Top 10): 3.329471638113171
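
The values above are root-mean-square deviations between the crystal ligand pose and the docked poses, and "Top N" takes the minimum over the N best-ranked poses. The notebook's own `calc_rmsd` lives in a hidden cell, so the following is only a minimal sketch under stated assumptions: both poses are given as lists of (x, y, z) tuples in the same atom order, no alignment is applied (in-place RMSD, as is usual for redocking), and molecular symmetry is ignored. The names `inplace_rmsd` and `min_rmsd_top_n` are hypothetical.

```python
import math


def inplace_rmsd(ref, pose):
    """RMSD between two poses with identical atom ordering, without alignment."""
    if len(ref) != len(pose) or not ref:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    # Sum of squared coordinate differences over all atoms and axes
    sq = sum((a - b) ** 2 for p, q in zip(ref, pose) for a, b in zip(p, q))
    return math.sqrt(sq / len(ref))


def min_rmsd_top_n(rmsds, n):
    """Smallest RMSD among the first n entries of a score-ranked RMSD list."""
    return min(rmsds[:n])


# Toy coordinates (not taken from 6X8D): every atom is shifted by 1 along x
crystal = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
shifted = [(1.0, 0.0, 0.0), (2.5, 0.0, 0.0)]
print(inplace_rmsd(crystal, shifted))

# Toy ranked RMSD list: the best-scored pose is not the closest to the crystal
ranked = [3.3, 1.4, 2.0]
print(min_rmsd_top_n(ranked, 1), min_rmsd_top_n(ranked, 3))
```

A production implementation would typically also handle symmetry-equivalent atoms, which this sketch does not attempt.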

Step 0. Conformation Generation

This is the optional zeroth step in the MultiConfDock process. In this phase, MultiConfDock generates multiple conformations of the ligand using a conformation generation tool called confgen, which is built on CDPKit and can efficiently generate a large number of ligand conformations. If you already have conformations of the ligand, you can skip this step.

[4]
import os, subprocess

# prepare command
cmd = ["confgen"]
cmd += ["-i", f"{pdbid}_ligand.smi"]
cmd += ["-o", f"{pdbid}_ligand_confgen.sdf"]
cmd += ["-C", "LARGE_SET_DIVERSE"]
cmd += ["-v", "ERROR"]
cmd += ["-T", "60"] # Time limit in seconds
cmd += ["-n", "200"] # max number of conformers
cmd += ["-r", "0.3"] # min RMSD between conformers

# run confgen
status = subprocess.run(cmd,
                        stdout=subprocess.PIPE, stderr=subprocess.PIPE)

# count the SDF records ("$$$$" terminates each molecule)
with open(f"{pdbid}_ligand_confgen.sdf", "r") as f:
    ntotal = len([l for l in f.read().split("$$$$\n") if l])
print(f"{ntotal} conformations are generated by ConfGen for {pdbid} ligand.")
5 conformations are generated by ConfGen for 6X8D ligand.
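
The `-n` (maximum number of conformers) and `-r` (minimum RMSD between conformers) options above control how diverse the output set is. To illustrate the idea only, here is a minimal sketch of a greedy RMSD-threshold filter. It is not confgen's or CDPKit's actual algorithm: it assumes conformers arrive sorted from most to least favorable, compares raw coordinates without superposition, and ignores symmetry. The function names are hypothetical.

```python
import math


def rmsd(a, b):
    """Plain coordinate RMSD between two conformers with the same atom order."""
    sq = sum((x - y) ** 2 for p, q in zip(a, b) for x, y in zip(p, q))
    return math.sqrt(sq / len(a))


def filter_diverse(conformers, min_rmsd, max_confs):
    """Greedily keep conformers that are at least min_rmsd away from all kept ones."""
    kept = []
    for conf in conformers:
        if len(kept) >= max_confs:
            break
        if all(rmsd(conf, k) >= min_rmsd for k in kept):
            kept.append(conf)
    return kept


# Three toy two-atom "conformers"; the second is a near-duplicate of the first
confs = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (1.0, 0.1, 0.0)],
    [(0.0, 0.0, 0.0), (0.0, 1.0, 0.0)],
]
selected = filter_diverse(confs, min_rmsd=0.3, max_confs=200)
print(len(selected))
```

With a 0.3 threshold the near-duplicate is dropped, which mirrors how a larger `-r` value yields fewer, more distinct conformers.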

Step 1. Rigid Docking

The first step in the process is RigidDock. In this phase, MultiConfDock performs a rigid docking of each ligand conformation against the target protein. This means that the ligand and the protein are treated as rigid bodies, and only their relative positions and orientations change. This step is computationally efficient and allows MultiConfDock to quickly evaluate a large number of ligand conformations.

[Three hidden cells.]
[8]
import os, subprocess, json

with open("docking_grid.json", "r") as f:
    box = json.load(f)

# prepare command
cmd = ["unidock"]
cmd += ["--receptor", f"{pdbid}_receptor.pdbqt"]
cmd += ["--ligand_index", "ligand_index.txt"]
cmd += ["--center_x", str(box["center_x"])]
cmd += ["--center_y", str(box["center_y"])]
cmd += ["--center_z", str(box["center_z"])]
cmd += ["--size_x", str(box["size_x"])]
cmd += ["--size_y", str(box["size_y"])]
cmd += ["--size_z", str(box["size_z"])]
cmd += ["--scoring", "vina"]
cmd += ["--dir", f"{pdbid}_rigiddock_output"]
cmd += ["--exhaustiveness", "128"]
cmd += ["--max_step", "20"]
cmd += ["--num_modes", "3"]
cmd += ["--verbosity", "2"]
cmd += ["--refine_step", "5"]
cmd += ["--keep_nonpolar_H"]

print(" ".join(cmd))

# run rigiddock
os.makedirs(f"{pdbid}_rigiddock_output", exist_ok=True)
status = subprocess.run(cmd,
                        stdout=subprocess.PIPE, stderr=subprocess.PIPE)
# print(status.stderr)

# read score and rank
read_score_and_rank(
    f"{pdbid}_rigiddock_output",
    f"{pdbid}_localrefine_input",
    pdbid,
    f"{pdbid}_rigid_docking",
    True,
    100
)
unidock --receptor 6X8D_receptor.pdbqt --ligand_index ligand_index.txt --center_x 22.7196 --center_y -0.9656300000000002 --center_z 70.91402999999998 --size_x 13.529800000000002 --size_y 14.329699999999999 --size_z 12.241900000000001 --scoring vina --dir 6X8D_rigiddock_output --exhaustiveness 128 --max_step 20 --num_modes 3 --verbosity 2 --refine_step 5 --keep_nonpolar_H
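
The docking box is read from `docking_grid.json`, but the cell that creates this file is not shown. One common way to define such a box is to wrap the reference ligand's coordinates with some padding; the sketch below does that and produces the same keys (`center_x`, `size_x`, and so on) that the command above consumes. This is an assumption about how a grid could be derived, not the notebook's actual procedure, and the 5.0 padding value and the coordinates are made up for illustration.

```python
import json


def box_from_coords(coords, padding=5.0):
    """Axis-aligned box around the given atoms, enlarged by padding on every side."""
    box = {}
    for axis, values in zip("xyz", zip(*coords)):
        lo, hi = min(values), max(values)
        box[f"center_{axis}"] = (lo + hi) / 2
        box[f"size_{axis}"] = (hi - lo) + 2 * padding
    return box


# Toy ligand coordinates (hypothetical, not the 6X8D ligand)
ligand_coords = [(20.0, -2.0, 69.0), (24.0, 1.0, 72.0), (22.0, 0.0, 71.0)]
example_box = box_from_coords(ligand_coords, padding=5.0)
print(json.dumps(example_box, indent=2))
```

A tighter box reduces the rigid-body search space, while too little padding can cut off valid poses, so the padding is a trade-off to tune per target.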

Step 2. Local Refinement

After the RigidDock phase, MultiConfDock proceeds to the LocalRefine step. In this phase, the top-scoring poses from the RigidDock step are selected and locally refined: the ligand is allowed small, local movements to fine-tune the docking pose. This step is more computationally intensive, but it is applied only to a subset of the initial conformations, which keeps the overall process efficient.
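
The selection of top-scoring poses can be sketched as a simple sort-and-truncate. The notebook's `read_score_and_rank` helper is defined in a hidden cell, so its behavior is assumed here rather than confirmed; the function name `select_top_n` is hypothetical and the scores are made-up illustrative numbers. The sketch relies on the Vina convention that a lower (more negative) score is better.

```python
def select_top_n(poses, n):
    """Return the n poses with the lowest (most favorable) docking score."""
    return sorted(poses, key=lambda pose: pose["score"])[:n]


# Illustrative rigid-docking results for several conformers (invented values)
rigid_results = [
    {"name": "conf_0_pose_1", "score": -7.2},
    {"name": "conf_1_pose_1", "score": -8.9},
    {"name": "conf_2_pose_1", "score": -6.5},
    {"name": "conf_1_pose_2", "score": -8.1},
]

top_poses = select_top_n(rigid_results, 2)
for pose in top_poses:
    print(pose["name"], pose["score"])
```

Only the selected poses would be passed on to the more expensive refinement stage.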

[One hidden cell.]
[10]
import os, subprocess

# prepare command
cmd = ["unidock"]
cmd += ["--receptor", f"{pdbid}_receptor.pdbqt"]
cmd += ["--ligand_index", "ligand_index.txt"]
cmd += ["--center_x", str(box["center_x"])]
cmd += ["--center_y", str(box["center_y"])]
cmd += ["--center_z", str(box["center_z"])]
cmd += ["--size_x", str(box["size_x"])]
cmd += ["--size_y", str(box["size_y"])]
cmd += ["--size_z", str(box["size_z"])]
cmd += ["--scoring", "vina"]
cmd += ["--dir", f"{pdbid}_localrefine_output"]
cmd += ["--exhaustiveness", "512"]
cmd += ["--max_step", "40"]
cmd += ["--num_modes", "1"]
cmd += ["--verbosity", "2"]
cmd += ["--refine_step", "5"]
cmd += ["--keep_nonpolar_H"]
cmd += ["--local_only"]

print(" ".join(cmd))

# run localrefine
os.makedirs(f"{pdbid}_localrefine_output", exist_ok=True)
status = subprocess.run(cmd,
                        stdout=subprocess.PIPE, stderr=subprocess.PIPE)
print(status.stderr)

# read score and rank
read_score_and_rank(
    f"{pdbid}_localrefine_output",
    f"{pdbid}_mcdock_result",
    pdbid,
    f"{pdbid}_mcdock",
    False
)
unidock --receptor 6X8D_receptor.pdbqt --ligand_index ligand_index.txt --center_x 22.7196 --center_y -0.9656300000000002 --center_z 70.91402999999998 --size_x 13.529800000000002 --size_y 14.329699999999999 --size_z 12.241900000000001 --scoring vina --dir 6X8D_localrefine_output --exhaustiveness 512 --max_step 40 --num_modes 1 --verbosity 2 --refine_step 5 --keep_nonpolar_H --local_only
b''
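
The cells above capture stdout and stderr but do not check the exit status, so a failed run could go unnoticed. A minimal sketch of a stricter wrapper is shown below; `run_checked` is a hypothetical helper, and it is demonstrated with the Python interpreter itself so that it runs without Uni-Dock installed.

```python
import subprocess
import sys


def run_checked(cmd):
    """Run a command and return its decoded stdout; raise with stderr on failure."""
    result = subprocess.run(cmd, capture_output=True, text=True)
    if result.returncode != 0:
        raise RuntimeError(
            f"{cmd[0]} exited with code {result.returncode}:\n{result.stderr}"
        )
    return result.stdout


# Stand-in command; in the notebook this would be the unidock argument list
out = run_checked([sys.executable, "-c", "print('ok')"])
print(out.strip())
```

Passing `text=True` returns strings instead of bytes, which avoids output such as `b''` when printing stderr.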

Evaluation: Calculate the RMSD between the crystal structure and docked structure from mcdock

[12]
rmsd = calc_rmsd(f"{pdbid}_ligand_ori.sdf", f"{pdbid}_mcdock_result/{pdbid}_ligand_{pdbid}_mcdock.sdf")
print(f"PDB Code: {pdbid}")
for topn in [1,3,5,10]:
    print(f"Min RMSD (Top {topn}): {min(rmsd[:topn])}")
PDB Code: 6X8D
Min RMSD (Top 1): 1.3844682036796658
Min RMSD (Top 3): 1.3844682036796658
Min RMSD (Top 5): 1.3844682036796658
Min RMSD (Top 10): 1.3844682036796658
[15:05:47] ERROR: Problems encountered parsing data fields
[15:05:47] ERROR: moving to the beginning of the next molecule

By combining these steps, MultiConfDock improved the redocking result for 6X8D: the top-1 RMSD to the crystal pose dropped from about 3.33 Å with plain Uni-Dock to about 1.38 Å with mcdock. This makes it a useful tool for researchers in the field of drug design.


mcdock: An Automated Workflow for Multi-Conformation Docking

mcdock is now open source

https://github.com/dptech-corp/Uni-Dock/tree/mcdock/unidock_tools/MultiConfDock


Installation

mcdock is now available using Docker:

docker pull dptechnology/unidock_tools:latest

After the image is pulled, you can run a Docker container using the following command:

docker run --gpus 0 -dit --name mcdock dptechnology/unidock_tools:latest
docker attach mcdock

Usage

mcdock is controlled via several command-line parameters:

unidocktools mcdock --help

Here's a brief overview:

Required Arguments

  • -r, --receptor: Path to the receptor file in PDBQT format.
  • -l, --ligands: Path to the ligand file in SDF format. For multiple files, separate them by commas.
  • -i, --ligand_index: A text file listing the paths of ligand files in SDF format.
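
When many ligand files are involved, an index file is more convenient than a long comma-separated list. The sketch below writes such a file. The exact layout Uni-Dock expects is not stated here, so one path per line is an assumption, and the helper name `write_ligand_index` and the ligand file names are hypothetical.

```python
import tempfile
from pathlib import Path


def write_ligand_index(ligand_paths, index_path):
    """Write one ligand path per line (assumed layout) and return the index path."""
    index_path = Path(index_path)
    index_path.write_text("\n".join(str(p) for p in ligand_paths) + "\n")
    return index_path


# Temporary directory and placeholder file names, for illustration only
workdir = Path(tempfile.mkdtemp())
ligands = [workdir / f"ligand_{i}.sdf" for i in range(3)]
index_file = write_ligand_index(ligands, workdir / "ligand_index.txt")
print(index_file.read_text())
```

The resulting path would then be passed via `--ligand_index` instead of `--ligands`.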

ConfGen Arguments

  • -g, --gen_conf: Whether to generate conformers for the ligands (default: False).
  • -n, --max_num_confs_per_ligand: Maximum number of conformers to generate for each ligand (default: 1000).
  • -m, --min_rmsd: Minimum RMSD for output conformer selection (default: 0.5000, must be >= 0, 0 disables RMSD checking).

Docking Box Parameters

  • -cx, --center_x: X-coordinate of the docking box center.
  • -cy, --center_y: Y-coordinate of the docking box center.
  • -cz, --center_z: Z-coordinate of the docking box center.
  • -sx, --size_x: Width of the docking box in the X direction (default: 22.5).
  • -sy, --size_y: Width of the docking box in the Y direction (default: 22.5).
  • -sz, --size_z: Width of the docking box in the Z direction (default: 22.5).

Directory

  • -wd, --workdir: Working directory (default: 'MultiConfDock').
  • -sd, --savedir: Save directory (default: 'MultiConfDock-Result').
  • -bs, --batch_size: Batch size for mcdock (default: 20).

Rigid Docking Parameters

  • -sf_rd, --scoring_function_rigid_docking: Scoring function used in rigid docking (default: 'vina').
  • -ex_rd, --exhaustiveness_rigid_docking: Exhaustiveness used in rigid docking (default: 128).
  • -ms_rd, --max_step_rigid_docking: Maximum number of steps used in rigid docking (default: 20).
  • -nm_rd, --num_modes_rigid_docking: Number of modes used in rigid docking (default: 3).
  • -rs_rd, --refine_step_rigid_docking: Refine step used in rigid docking (default: 3).
  • -topn_rd, --topn_rigid_docking: Top N results used in rigid docking (default: 100).

Local Refine Parameters

  • -sf_lr, --scoring_function_local_refine: Scoring function used in local refine (default: 'vina').
  • -ex_lr, --exhaustiveness_local_refine: Exhaustiveness used in local refine (default: 32).
  • -ms_lr, --max_step_local_refine: Maximum number of steps used in local refine (default: 40).
  • -nm_lr, --num_modes_local_refine: Number of modes used in local refine (default: 1).
  • -rs_lr, --refine_step_local_refine: Refine step used in local refine (default: 5).
  • -topn_lr, --topn_local_refine: Top N results used in local refine (default: 1).

These parameters allow you to control the behavior of mcdock and customize it to suit your specific needs.


Example

[14]
import os, json, subprocess


with open("docking_grid.json", "r") as f:
    box = json.load(f)
cmd = ["unidocktools", "mcdock"]
cmd += ["--receptor", f"{pdbid}_receptor.pdbqt"]
cmd += ["--ligands", f"{pdbid}_ligand_ori.sdf"]
cmd += ["--center_x", str(box['center_x']), "--center_y", str(box['center_y']), "--center_z", str(box['center_z'])]
cmd += ["--size_x", str(box['size_x']), "--size_y", str(box['size_y']), "--size_z", str(box['size_z'])]
cmd += ["--gen_conf"]
cmd += ["--max_num_confs_per_ligand", "200"]
cmd += ["--min_rmsd", "0.3"]
cmd += ["--workdir", "MCDOCK_work"]
cmd += ["--savedir", "MCDOCK_result"]
cmd += ["--exhaustiveness_rigid_docking", "128"]
cmd += ["--max_step_rigid_docking", "20"]
cmd += ["--topn_rigid_docking", "100"]
cmd += ["--num_modes_rigid_docking", "3"]
cmd += ["--exhaustiveness_local_refine", "512"]
cmd += ["--max_step_local_refine", "40"]
cmd += ["--topn_local_refine", "10"]
cmd += ["--num_modes_local_refine", "1"]

print(" ".join(cmd))
status = subprocess.run(cmd,
                        stdout=subprocess.PIPE, stderr=subprocess.PIPE)
status.returncode
unidocktools mcdock --receptor 6X8D_receptor.pdbqt --ligands 6X8D_ligand_ori.sdf --center_x 22.7196 --center_y -0.9656300000000002 --center_z 70.91402999999998 --size_x 13.529800000000002 --size_y 14.329699999999999 --size_z 12.241900000000001 --gen_conf --max_num_confs_per_ligand 200 --min_rmsd 0.3 --workdir MCDOCK_work --savedir MCDOCK_result --exhaustiveness_rigid_docking 128 --max_step_rigid_docking 20 --topn_rigid_docking 100 --num_modes_rigid_docking 3 --exhaustiveness_local_refine 512 --max_step_local_refine 40 --topn_local_refine 10 --num_modes_local_refine 1
0
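
For repeated runs it can be convenient to assemble the argument list from a dictionary rather than line by line. The following is a minimal sketch of such a builder; `build_mcdock_cmd` is a hypothetical helper, it only uses flag names that appear in the example above, and the box values are rounded from the printed command for illustration. It builds the list without executing anything.

```python
def build_mcdock_cmd(receptor, ligands, box, options=None):
    """Assemble an mcdock argument list; True values become bare switches."""
    cmd = ["unidocktools", "mcdock", "--receptor", receptor]
    # Multiple ligand files are joined with commas, as described under Usage
    cmd += ["--ligands", ",".join(ligands)]
    for key in ("center_x", "center_y", "center_z", "size_x", "size_y", "size_z"):
        cmd += [f"--{key}", str(box[key])]
    for flag, value in (options or {}).items():
        if value is True:
            cmd.append(f"--{flag}")
        elif value is not False and value is not None:
            cmd += [f"--{flag}", str(value)]
    return cmd


demo_box = {"center_x": 22.7, "center_y": -1.0, "center_z": 70.9,
            "size_x": 13.5, "size_y": 14.3, "size_z": 12.2}
demo_cmd = build_mcdock_cmd(
    "6X8D_receptor.pdbqt",
    ["6X8D_ligand_ori.sdf"],
    demo_box,
    {"gen_conf": True, "max_num_confs_per_ligand": 200, "topn_local_refine": 10},
)
print(" ".join(demo_cmd))
```

The returned list can be handed to `subprocess.run` exactly like `cmd` in the example cell.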