Molecular docking with Uni-Mol bias and Uni-Dock
Tags: RDKit, molecular docking, Deep Learning, Uni-Mol, Uni-Dock
Author: yuanfb@dp.tech
Published: 2023-07-31
Recommended image: unimol-qsar:0703
Recommended machine: c16_m62_1 * NVIDIA T4

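🎯 This tutorial is a quick start to using Uni-Dock and Uni-Mol for molecular docking.

  • The tutorial package contains all input and output datasets and the Uni-Mol model used here.

  • The notebook includes detailed explanations to make getting started easier.

In the Bohrium Notebook interface, click the blue "Connect" button at the top of the page, select the unimol-qsar:0703 image together with any GPU node, and wait a moment until the notebook is ready to run.

Goal:

This notebook explains how to apply Uni-Mol to molecular docking, and how to use the bias feature of Uni-Dock to steer docking toward the results you want.

Background:

  • **Prerequisites:**
    Basic knowledge of molecular docking and Python

Contents:

  1. Introduction: an overview of Uni-Mol and Uni-Dock
  2. Generating a ligand pose by Uni-Mol inference, and generating poses with Uni-Dock alone
  3. Combining the Uni-Mol and Uni-Dock results
  4. Conclusion and outlook: a summary of the main points of this notebook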

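Uni-Mol

Uni-Mol is a universal molecular representation learning framework based on three-dimensional molecular structure, released by DP Technology in May 2022. Uni-Mol takes 3D molecular structures as model input and is pretrained on about 200 million small-molecule conformations and 3 million protein surface cavity structures, using two self-supervised tasks: atom type recovery and atom coordinate recovery.

Thanks to representation learning that starts from 3D information and an effective pretraining scheme, Uni-Mol is reported to surpass the previous state of the art (SOTA) on almost all downstream tasks involving drug molecules and protein pockets. It can also directly handle 3D conformation generation tasks such as molecular conformation generation and protein-ligand binding pose prediction, where it outperforms existing solutions. The paper was accepted at ICLR 2023, a top machine learning conference.

Uni-Dock

Uni-Dock is a high-performance molecular docking engine released by DP Technology in 2022. It builds on AutoDock Vina and adds GPU-based high-performance computing optimizations.
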

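Introduction

Molecular docking is a research method in computational biology and computational chemistry that simulates and predicts interactions between molecules. It is mainly used to study how drug molecules bind to target proteins, in order to find molecules with therapeutic potential.

Molecular docking typically proceeds through the following stages:

  1. Preparation:
     a. Target protein preparation: select and obtain the 3D structure of the target protein, usually from a protein database such as the PDB (Protein Data Bank). The protein needs preprocessing, including removing water molecules, adding missing atoms or side chains, and optimizing the hydrogen-bond network.
     b. Ligand preparation: obtain the 3D structures of the candidate drug molecules and preprocess them, including optimizing the geometry, computing the charge distribution, and enumerating possible conformers.
     c. Binding pocket definition: determine the binding region on the surface of the target protein, either from a known ligand binding site or by predicting potential pockets computationally.

  2. Docking:
     a. Search algorithm: a search algorithm explores the possible binding poses of the ligand inside the pocket. The choice of algorithm and its parameters strongly affect the docking results. Common search algorithms include Monte Carlo, simulated annealing, and genetic algorithms.
     b. Scoring function: each generated pose is evaluated to estimate its binding energy and affinity. Scoring functions usually account for van der Waals interactions, electrostatic interactions, hydrogen bonds, and similar terms. A suitable scoring function improves the accuracy and reliability of the docking results.

  3. Result analysis:
     a. Pose filtering: rank the generated poses by their scores and keep the candidate structures with the highest predicted affinity.
     b. Validation: check the reliability of the docking results against experimental data or known ligand structures. Metrics such as RMSD (Root Mean Square Deviation) measure how well a docked pose agrees with the experimental structure.
     c. Bioactivity prediction: analyze the interactions between the drug molecule and the target protein in the docked pose to predict the molecule's biological activity and mechanism of action.

  4. Further optimization (optional):
     a. Drug molecule optimization: chemically modify the drug molecule based on the docking results to improve its affinity and selectivity.
     b. Protein optimization: apply directed evolution or site-directed mutagenesis to the target protein to improve its binding specificity for the drug molecule.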
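The RMSD check mentioned in the validation step can be sketched in a few lines. This is a minimal illustration, not the metric implementation used by Uni-Mol or Uni-Dock: the function name `rmsd` and the toy coordinates are hypothetical, and the sketch assumes both poses list the same atoms in the same order, with no alignment and no symmetry correction.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally ordered coordinate lists."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared_sum = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared_sum += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared_sum / len(coords_a))


# Hypothetical three-atom example: the docked pose is the reference
# shifted by exactly 1 unit along z, so every atom deviates by 1.
reference = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0)]
docked = [(0.0, 0.0, 1.0), (1.5, 0.0, 1.0), (1.5, 1.5, 1.0)]
print(rmsd(reference, docked))
```

For real ligands, only heavy atoms are usually compared, and a symmetry-aware atom matching is preferable whenever the molecule contains equivalent atoms.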

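Uni-Mol can generate a suitable ligand pose directly inside the protein pocket

First, download the data, model, and software needed for the calculation

To give a hands-on feel for applying Uni-Mol to molecular docking, we use the CASF-2016 dataset, so the first step is to download it.

Paper: https://pubs.acs.org/doi/10.1021/acs.jcim.8b00545
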
[ ]
%%bash
# Select image: unimol-qsar:unimolxxlatest; choose a GPU machine type
# Fetch the data, code, and model weights
# rm -rf CASF-2016.tar.gz
# rm -rf CASF-2016
# rm -rf binding_pose_220908.pt
# rm -rf docking.tar.gz
# rm -rf docking
wget https://github.com/dptech-corp/Uni-Dock/releases/download/1.1.0/unidock
chmod +x unidock  # the binary is invoked later as ./unidock
wget -nv https://bohrium-example.oss-cn-zhangjiakou.aliyuncs.com/unimol-docking/CASF-2016.tar.gz
# wget -nv https://bohrium-example.oss-cn-zhangjiakou.aliyuncs.com/unimol-docking/binding_pose_220908.pt
# Model checkpoint used by the inference step (./checkpoint_best_20230710.pt)
wget -nv https://bohrium-example.oss-cn-zhangjiakou.aliyuncs.com/unimol-models/checkpoint_best_20230710.pt
wget -nv https://bohrium-example.oss-cn-zhangjiakou.aliyuncs.com/unimol-docking/docking.tar.gz
tar --no-same-owner -xzf "CASF-2016.tar.gz"
tar --no-same-owner -xzf "docking.tar.gz"
cd docking
python setup.py install
Next, load the required libraries

[1]
# Load the required libraries
# !pip install pandas==1.3.1
import copy
import json
import os
import pickle
import re
import subprocess

import lmdb
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import py3Dmol
from biopandas.pdb import PandasPdb
from IPython.display import display
from ipywidgets import widgets, Layout, HBox, VBox
from rdkit import Chem
from rdkit.Chem import AllChem
from rdkit.Chem import rdMolTransforms
from rdkit.Chem.rdMolAlign import AlignMolConformers
from sklearn.cluster import KMeans
from tqdm import tqdm
from unimol_docking.utils.docking_utils import (
    docking_data_pre,
    ensemble_iterations,
    # write_lmdb,  # redefined in this notebook
    generate_docking_input,
)
[22]
!jupyter labextension install jupyterlab_3dmol
(Deprecated) Installing extensions with the jupyter labextension install command is now deprecated and will be removed in a future major version of JupyterLab.

Users should manage prebuilt extensions with package managers like pip and conda, and extension authors are encouraged to distribute their extensions as prebuilt packages 
Building jupyterlab assets (production, minimized)
The extension "jupyterlab_tensorboard" is outdated.

The extension "jupyterlab-jupytext" is outdated.

[2]


def parser(pdb_id, smiles, seed=42):
    """Build one pickled LMDB record (pocket atoms plus ligand conformers)."""
    pmol, pocket_residues = load_from_CASF(pdb_id)
    # pmol, pocket_residues = load_from_pocket_json(pdb_id)
    pname = pdb_id
    pro_atom = pmol.df["ATOM"]
    pro_hetatm = pmol.df["HETATM"]

    # Residue key = chain id + residue number, e.g. "A101"
    pro_atom["ID"] = pro_atom["chain_id"].astype(str) + pro_atom[
        "residue_number"
    ].astype(str)
    pro_hetatm["ID"] = pro_hetatm["chain_id"].astype(str) + pro_hetatm[
        "residue_number"
    ].astype(str)

    # Keep only the atoms that belong to the pocket residues
    pocket = pd.concat(
        [
            pro_atom[pro_atom["ID"].isin(pocket_residues)],
            pro_hetatm[pro_hetatm["ID"].isin(pocket_residues)],
        ],
        axis=0,
        ignore_index=True,
    )

    pocket["normalize_atom"] = pocket["atom_name"].map(normalize_atoms)
    pocket = pocket[pocket["normalize_atom"] != ""]
    patoms = pocket["atom_name"].apply(normalize_atoms).values.tolist()
    pcoords = [pocket[["x_coord", "y_coord", "z_coord"]].values]
    # 0 = backbone atom, 1 = side-chain atom
    side = [0 if a in main_atoms else 1 for a in patoms]
    residues = (
        pocket["chain_id"].astype(str) + pocket["residue_number"].astype(str)
    ).values.tolist()

    # generate ligand conformation
    M, N = 100, 10
    mol = Chem.MolFromSmiles(smiles)
    mol = Chem.AddHs(mol)
    AllChem.EmbedMolecule(mol, randomSeed=seed)
    latoms = [atom.GetSymbol() for atom in mol.GetAtoms()]
    holo_coordinates = [mol.GetConformer().GetPositions().astype(np.float32)]
    holo_mol = mol
    coordinate_list = clustering_coords(mol, M=M, N=N, seed=seed, removeHs=False)
    mol_list = [mol] * N

    return pickle.dumps(
        {
            "atoms": latoms,
            "coordinates": coordinate_list,
            "mol_list": mol_list,
            "pocket_atoms": patoms,
            "pocket_coordinates": pcoords,
            "side": side,
            "residue": residues,
            "holo_coordinates": holo_coordinates,
            "holo_mol": holo_mol,
            "holo_pocket_coordinates": pcoords,
            "smi": smiles,
            "pocket": pname,
        },
        protocol=-1,
    )


def write_lmdb(data_path, pdb_id, smiles_list, seed=42):
    """Write one LMDB record per SMILES string to <data_path>/<pdb_id>.lmdb."""
    # os.makedirs(data_path, exist_ok=True)
    outputfilename = os.path.join(data_path, pdb_id + ".lmdb")
    try:
        os.remove(outputfilename)
    except OSError:
        pass
    env_new = lmdb.open(
        outputfilename,
        subdir=False,
        readonly=False,
        lock=False,
        readahead=False,
        meminit=False,
        max_readers=1,
        map_size=int(10e9),
    )
    for i, smiles in enumerate(smiles_list):
        inner_output = parser(pdb_id, smiles, seed=seed)
        txn_write = env_new.begin(write=True)
        txn_write.put(f"{i}".encode("ascii"), inner_output)
        txn_write.commit()
    env_new.close()


def load_from_CASF(pdb_id):
    """Load the protein structure and pocket residue list for a CASF-2016 entry."""
    CASF_PATH = '/data/CASF-2016'
    print(f"CASF_PATH:{CASF_PATH}, pdd_id:{pdb_id}")
    pdb_path = os.path.join(CASF_PATH, "casf2016", pdb_id + "_protein.pdb")
    print(f"pdb_path:{pdb_path}")
    pmol = PandasPdb().read_pdb(pdb_path)
    with open(os.path.join(CASF_PATH, "casf2016.pocket.json")) as f:
        pocket_residues = json.load(f)[pdb_id]
    return pmol, pocket_residues


# def load_from_pocket_json(pdb_id):
#     work_path = '/data/yfb/13_ifd_score/ifd_0601_clean/'
#     # print(f"CASF_PATH:{CASF_PATH}, pdd_id:{pdb_id}")
#     pdb_path = os.path.join(work_path, pdb_id, "template_protein.pdb")
#     print(f"pdb_path:{pdb_path}")
#     pmol = PandasPdb().read_pdb(pdb_path)
#     with open(os.path.join(work_path, pdb_id, "pocket_residues_info.json"), 'r') as f:
#         pocket_json = json.load(f)
#     pocket_residues = [ ii[1] for ii in pocket_json["template_pocket_residues"] ]
#     return pmol, pocket_residues


def normalize_atoms(atom):
    # Strip digits from the atom name, e.g. "HD21" -> "HD"
    return re.sub(r"\d+", "", atom)


main_atoms = ["N", "CA", "C", "O", "H"]


def single_conf_gen(tgt_mol, num_confs=1000, seed=42, removeHs=True):
    """Embed num_confs conformers with RDKit and relax each with MMFF."""
    mol = copy.deepcopy(tgt_mol)
    mol = Chem.AddHs(mol)
    allconformers = AllChem.EmbedMultipleConfs(
        mol, numConfs=num_confs, randomSeed=seed, clearConfs=True
    )
    sz = len(allconformers)
    for i in range(sz):
        try:
            AllChem.MMFFOptimizeMolecule(mol, confId=i)
        except Exception:
            continue
    if removeHs:
        mol = Chem.RemoveHs(mol)
    return mol


def clustering_coords(mol, M=1000, N=100, seed=42, removeHs=True):
    """Generate M conformers and return one representative for each of N clusters."""
    rdkit_coords_list = []
    rdkit_mol = single_conf_gen(mol, num_confs=M, seed=seed, removeHs=removeHs)
    noHsIds = [
        rdkit_mol.GetAtoms()[i].GetIdx()
        for i in range(len(rdkit_mol.GetAtoms()))
        if rdkit_mol.GetAtoms()[i].GetAtomicNum() != 1
    ]
    ### exclude hydrogens for aligning
    AlignMolConformers(rdkit_mol, atomIds=noHsIds)
    sz = len(rdkit_mol.GetConformers())
    for i in range(sz):
        _coords = rdkit_mol.GetConformers()[i].GetPositions().astype(np.float32)
        rdkit_coords_list.append(_coords)

    ### exclude hydrogens for clustering
    rdkit_coords_flatten = np.array(rdkit_coords_list)[:, noHsIds].reshape(sz, -1)
    ids = (
        KMeans(n_clusters=N, random_state=seed)
        .fit_predict(rdkit_coords_flatten)
        .tolist()
    )
    # Take the first conformer assigned to each cluster as its representative
    coords_list = [rdkit_coords_list[ids.index(i)] for i in range(N)]
    print(f"len(coords_list), {len(coords_list)}")
    return coords_list
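To make the pocket preprocessing in `parser` easier to follow, here is a small standalone sketch of the same three steps on plain tuples instead of a biopandas DataFrame: stripping digits from atom names, flagging backbone versus side-chain atoms, and building the chain-plus-residue-number key. The helper `pocket_features` and the sample records are hypothetical and exist only for illustration.

```python
import re

# Same backbone atom list as main_atoms in the notebook
MAIN_ATOMS = ["N", "CA", "C", "O", "H"]


def normalize_atoms(atom):
    """Strip digits from a PDB atom name, e.g. 'HD21' -> 'HD'."""
    return re.sub(r"\d+", "", atom)


def pocket_features(records):
    """records: iterable of (chain_id, residue_number, atom_name) tuples."""
    atoms, side, residues = [], [], []
    for chain_id, residue_number, atom_name in records:
        name = normalize_atoms(atom_name)
        if name == "":
            # Names made only of digits are dropped, as in the notebook
            continue
        atoms.append(name)
        side.append(0 if name in MAIN_ATOMS else 1)  # 0 = backbone, 1 = side chain
        residues.append(f"{chain_id}{residue_number}")
    return atoms, side, residues


# Hypothetical pocket atoms
records = [("A", 101, "CA"), ("A", 101, "CB"), ("A", 101, "HD21"), ("A", 102, "N")]
atoms, side, residues = pocket_features(records)
print(atoms)     # ['CA', 'CB', 'HD', 'N']
print(side)      # [0, 1, 1, 0]
print(residues)  # ['A101', 'A101', 'A101', 'A102']
```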
[19]
# Docking function: pass in a pdb_id and the corresponding SMILES string
def call_unimol_docking(pdb_id, smiles):
    # NOTE: data_path is a notebook-level global ("./CASF-2016") set in the target-selection cell
    # data_path = "/data/yfb/13_ifd_score/ifd_0601_clean/"
    # data_path = "./CASF-2016"
    seed = 42
    results_path = "./CASF-2016/unimolresults"
    # results_path = data_path + 'unimolresults/'
    os.makedirs(results_path, exist_ok=True)
    smiles_list = smiles.split(",")
    print("generate predict data...")
    write_lmdb(data_path, pdb_id, smiles_list, seed=seed)
    print("model inference...")
    cmd = ' '.join(['export MKL_SERVICE_FORCE_INTEL=1',
                    '&&',
                    'python ./docking/unimol_docking/infer.py',
                    '--user-dir ./docking/unimol_docking {}'.format(data_path),
                    '--valid-subset {} --results-path {}'.format(pdb_id, results_path),
                    '--num-workers 8 --ddp-backend=c10d --batch-size 8 --task docking_pose',
                    # '--loss docking_pose --arch docking_pose --path ./yfb/unimol-models/checkpoint_best.pt',
                    '--loss docking_pose --arch docking_pose --path ./checkpoint_best_20230710.pt',
                    '--fp16 --fp16-init-scale 4 --fp16-scale-window 256',
                    '--dist-threshold 8.0 --recycling 3 --log-interval 50 --log-format simple',
                    ])
    os.system(cmd)
    print("docking ...")
    predict_file = os.path.join(results_path, pdb_id + ".out.pkl")
    reference_file = os.path.join(data_path, pdb_id + ".lmdb")
    generate_docking_input(
        predict_file, reference_file, tta_times=10, output_dir=results_path
    )
    print("visualization ...")
    # Only the first SMILES is converted to coordinates and visualized
    for i, smiles in enumerate(smiles_list[:1]):
        print("Docking {}".format(smiles))
        input_path = os.path.join(results_path, "{}.{}.pkl".format(pdb_id, i))
        ligand_path = os.path.join(results_path, "docking.{}.{}.sdf".format(pdb_id, i))
        cmd = "export MKL_THREADING_LAYER=GNU"
        cmd += "&& python ./docking/unimol_docking/utils/coordinate_model.py --input {} --output-ligand {}".format(
            input_path, ligand_path
        )
        subprocess.call(cmd, shell=True)

    # pdb_path = os.path.join(data_path, pdb_id, 'template_protein.pdb')
    pdb_path = os.path.join(data_path, 'casf2016', pdb_id+'_protein.pdb')
    unimol_ligand_path = os.path.join(data_path, "unimolresults", "docking.{}.{}.sdf".format(pdb_id, 0))
    # gt_ligand_path = os.path.join(data_path, pdb_id, "target_align_pocket/", "target_ligand_align_pocket.sdf")
    gt_ligand_path = os.path.join(data_path, 'casf2016', pdb_id+'_ligand.sdf')

    # View 1: protein surface with the Uni-Mol ligand (green)
    view = py3Dmol.view()
    view.removeAllModels()
    view.addModel(open(pdb_path, 'r').read(), format='pdb')
    view.setStyle({'cartoon': {'arrows': True, 'tubes': False, 'style': 'oval', 'color': 'white'}})
    view.addSurface(py3Dmol.VDW, {'opacity': 0.5, 'color': 'white'})

    view.addModel(open(unimol_ligand_path, 'r').read(), format='sdf')
    ref_m = view.getModel()
    ref_m.setStyle({}, {'stick': {'colorscheme': 'greenCarbon', 'radius': 0.2}})

    view.zoomTo(viewer=(100, 0))
    view.show()

    # View 2: Uni-Mol ligand (green) against the ground-truth ligand (red)
    view.removeAllModels()
    view.addModel(open(unimol_ligand_path, 'r').read(), format='sdf')
    ref_m = view.getModel()
    ref_m.setStyle({}, {'stick': {'colorscheme': 'greenCarbon', 'radius': 0.2}})

    view.addModel(open(gt_ligand_path, 'r').read(), format='sdf')
    ref_m = view.getModel()
    ref_m.setStyle({}, {'stick': {'colorscheme': 'redCarbon', 'radius': 0.2}})

    view.zoomTo(viewer=(100, 0))
    view.show()
[4]
data_path = "./CASF-2016"
# File names start with the 4-character PDB id, e.g. "3bgz_protein.pdb"
casf_collect = os.listdir(os.path.join(data_path, "casf2016"))
casf_collect = list(set([item[:4] for item in casf_collect]))

pdb_selector = widgets.SelectMultiple(
    options=casf_collect,
    # value=[np.random.choice(casf_collect)],
    value=['3bgz'],
    rows=8,
    description='Select target:',
    disabled=False
)
pdb_label = widgets.Label(value='Selected target:')
smiles_label = widgets.Label(value='Default docking molecule: ')
smiles_text = widgets.Text(
    description='Other molecule: ',
    layout=Layout(width='36%', height='30px'),
)
submiter = widgets.Button(
    description='Select',
    button_style='success',
    layout=Layout(width='36%', height='30px'),
)


def btn_click(sender):
    # Show the chosen target and the SMILES of its native ligand
    pdb_id = pdb_selector.value[0]
    pdb_label.value = 'Selected target: ' + pdb_selector.value[0]
    supp = Chem.SDMolSupplier(os.path.join(data_path, "casf2016", pdb_id + "_ligand.sdf"))
    mol = [mol for mol in supp if mol][0]
    ori_smiles = Chem.MolToSmiles(mol)
    smiles_label.value = 'Default docking molecule: ' + ori_smiles


submiter.on_click(btn_click)
display(HBox([pdb_selector,
              VBox([pdb_label,
                    smiles_label])]
             )
        )
display(submiter)
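
CASF-2016 contains a few hundred protein-ligand systems. Pick one you are interested in for the calculations below.

The selection box above can be used to choose the 3bgz system, which is the default.

1. Click the green selection button.
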
[18]
### docking ####
### To dock a custom molecule, pass its SMILES string instead
pdb_id = '3bgz'
pdb_id = pdb_selector.value[0]
print(f" pdb_id:{pdb_id}")
smiles = smiles_label.value.split(': ')[-1]
print(f" smiles:{smiles}")
call_unimol_docking(pdb_id, smiles)
 pdb_id:3bgz
 smiles:O=C([O-])c1cccc2c(-c3ccccc3)c(-c3ccccc3)[nH]c12
generate predict data...
CASF_PATH:/data/CASF-2016, pdd_id:3bgz
pdb_path:/data/CASF-2016/casf2016/3bgz_protein.pdb
len(coords_list), 10
model inference...
2023-07-31 02:02:40 | INFO | unimol_docking.inference | loading model(s) from ./checkpoint_best_20230710.pt
2023-07-31 02:02:41 | INFO | unimol_docking.tasks.docking_pose | ligand dictionary: 30 types
2023-07-31 02:02:41 | INFO | unimol_docking.tasks.docking_pose | pocket dictionary: 9 types
2023-07-31 02:02:44 | INFO | unicore.tasks.unicore_task | get EpochBatchIterator for epoch 1
2023-07-31 02:02:46 | INFO | unimol_docking.inference | Done inference! 
docking ...
visualization ...
Docking O=C([O-])c1cccc2c(-c3ccccc3)c(-c3ccccc3)[nH]c12
3bgz-O=C([O-])c1cccc2c(-c3ccccc3)c(-c3ccccc3)[nH]c12-RMSD:41.4941-0.6726-0.2449
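
[image] The figure shows the ligand pose generated by Uni-Mol (green) together with the true ligand structure (red).

[image] Inside the protein pocket, the positions of the ligand carbons and of some rings are predicted fairly well, but the benzene rings are strangely distorted.

The file /data/CASF-2016/unimolresults/docking.3bgz.0.sdf is the ligand file we generated. Next, we run docking on top of this file to see whether the result improves. First, confirm that the ligand file really exists:
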
[8]
!ls /data/CASF-2016/unimolresults/docking.3bgz.0.sdf
/data/CASF-2016/unimolresults/docking.3bgz.0.sdf
[9]
#%%

import glob

Vset = -0.3  # value written to the Vset column of every bias site
r = 1.0      # value written to the r column of every bias site

prefix = "/data/CASF-2016/unimolresults/"
# sdf_list = glob.glob(f"{prefix}/docking.????.0.sdf")
sdf_list = ["/data/CASF-2016/unimolresults/docking.3bgz.0.sdf"]

#%%
# sdf = 'docking.1a4w-1t4u.0.sdf'
# name = sdf
for sdf in sdf_list:
    # output_path = './'
    print(f"sdf:{sdf}")
    outfilename = sdf + '.bpf'
    with open(sdf, "r") as f, open(outfilename, "w") as f1:
        f1.write("x y z Vset r type atom\n")
        # Skip the three SDF header lines and the counts line
        for _ in range(4):
            f.readline()
        line = f.readline()
        # Atom lines are longer than 30 characters; the first bond line is shorter and ends the loop
        while len(line) > 30:
            # Columns 0-30 hold x, y, z; columns 31-33 hold the element symbol
            f1.write("{} {} {} map {}\n".format(line[:31], str(Vset), str(r), line[31:34]))
            line = f.readline()
sdf:/data/CASF-2016/unimolresults/docking.3bgz.0.sdf
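The cell above detects the end of the atom block by line length. As a hedged alternative, the sketch below reads the atom count from the V2000 counts line instead, which avoids relying on line lengths. The function `mol_block_to_bpf` and the two-atom sample block are hypothetical; the column layout (`x y z Vset r type atom`, with `map` as the type) simply mirrors the file this notebook writes, and whitespace-separated columns are an assumption.

```python
def mol_block_to_bpf(mol_block, vset=-0.3, r=1.0):
    """Convert a V2000 mol block (as text) into bias-file text with one site per atom."""
    lines = mol_block.splitlines()
    # V2000 counts line (4th line): the first three columns hold the atom count
    n_atoms = int(lines[3][:3])
    out = ["x y z Vset r type atom"]
    for line in lines[4:4 + n_atoms]:
        # Fixed-width V2000 atom line: x, y, z in 10-character fields, symbol in columns 31-33
        x, y, z = float(line[0:10]), float(line[10:20]), float(line[20:30])
        symbol = line[31:34].strip()
        out.append(f"{x:.4f} {y:.4f} {z:.4f} {vset} {r} map {symbol}")
    return "\n".join(out) + "\n"


# Hypothetical two-atom mol block, for illustration only
SAMPLE = "\n".join([
    "example",
    "  hypothetical",
    "",
    "  2  1  0  0  0  0  0  0  0  0999 V2000",
    "  -16.5314   34.6628    2.8863 O   0  0  0  0  0  0  0  0  0  0  0  0",
    "  -16.4491   35.6169    2.4126 C   0  0  0  0  0  0  0  0  0  0  0  0",
    "  1  2  2  0",
    "M  END",
])
print(mol_block_to_bpf(SAMPLE))
```

Because the bias value and radius are plain arguments, the same helper can be rerun with a stronger or weaker `vset` to see how tightly docking follows the Uni-Mol pose.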

From the structure file /data/CASF-2016/unimolresults/docking.3bgz.0.sdf we have produced a .bpf bias potential file that Uni-Dock can read. Let's see how docking performs with this file.

[10]
cat /data/CASF-2016/unimolresults/docking.3bgz.0.sdf.bpf
x y z Vset r type atom
  -16.5314   34.6628    2.8863  -0.3 1.0 map O  
  -16.4491   35.6169    2.4126  -0.3 1.0 map C  
  -15.4169   35.4360    2.1430  -0.3 1.0 map O  
  -17.6787   36.4336    1.9018  -0.3 1.0 map C  
  -18.0797   37.5863    2.7068  -0.3 1.0 map C  
  -19.0807   38.3882    2.1391  -0.3 1.0 map C  
  -19.3606   38.4736    0.7471  -0.3 1.0 map C  
  -18.7343   37.3734   -0.0271  -0.3 1.0 map C  
  -19.2200   36.9148   -1.2768  -0.3 1.0 map C  
  -19.7396   37.6546   -2.4207  -0.3 1.0 map C  
  -19.2855   38.8043   -2.9196  -0.3 1.0 map C  
  -19.9556   39.5026   -3.9141  -0.3 1.0 map C  
  -21.0085   38.5427   -4.6085  -0.3 1.0 map C  
  -21.6424   38.4421   -3.4992  -0.3 1.0 map C  
  -21.0220   37.6831   -2.5066  -0.3 1.0 map C  
  -18.5236   35.6083   -1.4332  -0.3 1.0 map C  
  -18.1784   34.8441   -2.7558  -0.3 1.0 map C  
  -18.7585   33.7792   -3.0343  -0.3 1.0 map C  
  -19.0192   32.8389   -4.0223  -0.3 1.0 map C  
  -17.9312   33.0985   -5.0214  -0.3 1.0 map C  
  -16.9884   34.2934   -4.8915  -0.3 1.0 map C  
  -17.2103   35.0000   -3.6364  -0.3 1.0 map C  
  -18.5501   35.1683   -0.1151  -0.3 1.0 map N  
  -18.1485   36.3725    0.6703  -0.3 1.0 map C  
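Since every bias site sits on an atom of the Uni-Mol pose, the centroid and extent of the sites give a quick sanity check that the docking box covers them. The sketch below is only an illustration: `bpf_centroid` and the synthetic two-row sample are hypothetical, and the parser assumes the whitespace-separated layout printed above.

```python
def bpf_centroid(bpf_text):
    """Return (center, extent) of the bias sites listed in bias-file text."""
    rows = [ln.split() for ln in bpf_text.splitlines()[1:] if ln.strip()]
    xyz = [(float(row[0]), float(row[1]), float(row[2])) for row in rows]
    if not xyz:
        raise ValueError("no bias sites found")
    n = len(xyz)
    center = tuple(sum(p[i] for p in xyz) / n for i in range(3))
    extent = tuple(max(p[i] for p in xyz) - min(p[i] for p in xyz) for i in range(3))
    return center, extent


# Synthetic sample (not taken from the notebook) to show the call
SAMPLE_BPF = """x y z Vset r type atom
0.0 0.0 0.0 -0.3 1.0 map C
2.0 4.0 6.0 -0.3 1.0 map O
"""
center, extent = bpf_centroid(SAMPLE_BPF)
print(center, extent)
```

In the notebook, the text of docking.3bgz.0.sdf.bpf can be passed to the same function. The box should be centered near the returned center and be comfortably larger than the extent in every direction.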
[11]
!apt-get update
!apt-get install openbabel -y
[12]
!conda create -n mgltools mgltools -c bioconda -y
[13]
! obabel -isdf /data/CASF-2016/casf2016/3bgz_ligand.sdf -O 3bgz_ligand.pdbqt
1 molecule converted

With the required dependencies installed, convert the protein to the PDBQT format used for docking.

[14]
# ! conda run -n mgltools pythonsh $MGLUTIL/prepare_receptor4.py -r target_protein_align_pocket.pdb -o target_protein_align_pocket.pdbqt
# ! export MGLUTIL=/opt/conda/envs/mgltools/MGLToolsPckgs/AutoDockTools/Utilities24/
! conda run -n mgltools pythonsh /opt/conda/envs/mgltools/MGLToolsPckgs/AutoDockTools/Utilities24/prepare_receptor4.py -r /data/CASF-2016/casf2016/3bgz_protein.pdb -o 3bgz_protein.pdbqt
adding gasteiger charges to peptide
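The Uni-Dock command that follows reads the search box from ./3bgz_grid.txt, but no cell in this notebook creates that file. The sketch below shows one way to produce it, assuming Uni-Dock accepts the `key = value` configuration format of AutoDock Vina. The box values are the ones reported in the Uni-Dock log of this run (grid center -18, 38, 0 and grid size 20, 20, 20); the helper name `vina_box_config` is hypothetical.

```python
def vina_box_config(center, size):
    """Format a docking box as Vina-style 'key = value' configuration text."""
    axes = ("x", "y", "z")
    lines = [f"center_{axis} = {value}" for axis, value in zip(axes, center)]
    lines += [f"size_{axis} = {value}" for axis, value in zip(axes, size)]
    return "\n".join(lines) + "\n"


# Box taken from the "Grid center" / "Grid size" lines of the Uni-Dock log
config_text = vina_box_config(center=(-18, 38, 0), size=(20, 20, 20))
print(config_text)

# In the notebook, write the text to the path used by --config:
# with open("3bgz_grid.txt", "w") as fh:
#     fh.write(config_text)
```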

[15]
! ./unidock --receptor ./3bgz_protein.pdbqt --gpu_batch ./CASF-2016/casf2016/3bgz_ligand.sdf --dir ./ --exhaustiveness 32 --config ./3bgz_grid.txt --bias /data/CASF-2016/unimolresults/docking.3bgz.0.sdf.bpf --num_modes 500 --energy_range 20 --min_rmsd 2.0 | tee sample_trival_bias_rmsd2.log
# ! ./unidock --receptor ./3bgz_protein.pdbqt --gpu_batch ./CASF-2016/casf2016/3bgz_ligand.sdf --dir ./ --exhaustiveness 32 --config ./3bgz_grid.txt --num_modes 500 --energy_range 20 --min_rmsd 2.0 | tee sample_trival_rmsd2.log
Uni-Dock v0.1.0

If you used Uni-Dock in your work, please cite:               
 
Yu, Y., Cai, C., Wang, J., Bo, Z., Zhu, Z., & Zheng, H. (2023). 
Uni-Dock: GPU-Accelerated Docking Enables Ultralarge Virtual Screening. 
Journal of Chemical Theory and Computation.                    
https://doi.org/10.1021/acs.jctc.2c01145                       

Tang, S., Chen, R., Lin, M., Lin, Q., Zhu, Y., Ding, J., ... & Wu, J. (2022). 
Accelerating autodock vina with gpus. Molecules, 27(9), 3041. 
DOI 10.3390/molecules27093041                                 

J. Eberhardt, D. Santos-Martins, A. F. Tillack, and S. Forli  
AutoDock Vina 1.2.0: New Docking Methods, Expanded Force      
Field, and Python Bindings, J. Chem. Inf. Model. (2021)       
DOI 10.1021/acs.jcim.1c00203                                  

O. Trott, A. J. Olson,                                        
AutoDock Vina: improving the speed and accuracy of docking    
with a new scoring function, efficient optimization and        
multithreading, J. Comp. Chem. (2010)                         
DOI 10.1002/jcc.21334                                         

Please refer to https://github.com/dptech-corp/Uni-Dock/ for  
bug reporting, license agreements, and more information.      

Scoring function : vina
Rigid receptor: ./3bgz_protein.pdbqt
Grid center: X -18 Y 38 Z 0
Grid size  : X 20 Y 20 Z 20
Grid space : 0.375
Exhaustiveness: 384
CPU: 0
Verbosity: 1

Computing Vina grid ... entering done
done.
exiting done
Total ligands: 1
No fragment info, using rigid dockingAvaliable Memory = 14908MiB   Total Memory = 15109MiB

Batch 1 size: 1
Performing docking (random seed: 1700959471) ... Kernel running time: 2
entering done
exiting done

mode |   affinity | dist from best mode
     | (kcal/mol) | rmsd l.b.| rmsd u.b.
-----+------------+----------+----------
   1       -16.34          0          0
   2       -13.95      2.963      6.039
   3       -13.58      1.252      4.349
   4       -12.51      3.548       6.87
   5       -12.44      3.401      5.788
   6        -12.4      2.338      5.473
   7       -10.84      3.322       5.04
   8       -10.83      2.219      4.946
   9       -10.35      4.661      8.085
  10       -10.08       3.05      5.692
  11       -6.566      4.367      8.141
  12        -6.41      6.541      10.06
  13       -6.094      7.473      10.12
  14       -5.742      6.364      9.853
  15       -5.393      7.327      10.85
  16         -5.2      6.196       9.69
  17       -5.179      5.454      8.769
  18       -4.593       8.25      11.39
  19       -4.134      6.313      9.227
  20       -3.973      7.675      9.852
  21       -3.089      7.789      10.01
  22      -0.5928      7.998      11.15
  23        314.2       9.07      11.18
Batch 1 running time: 3696ms
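
The result table printed above can also be read programmatically. Below is a minimal sketch, assuming the log written by `tee` (here `sample_trival_bias_rmsd2.log`) keeps the same four-column layout. `parse_modes` and `favourable` are hypothetical helper names for illustration, not part of Uni-Dock. Vina-style affinities are in kcal/mol and lower is better, so filtering on negative values drops rows such as mode 23 (314.2 kcal/mol), whose positive score suggests a strained or clashing pose.

```python
import re

# One table row: mode index, affinity, rmsd l.b., rmsd u.b. (plain decimals only)
ROW = re.compile(
    r"^\s*(\d+)\s+(-?\d+(?:\.\d+)?)\s+(-?\d+(?:\.\d+)?)\s+(-?\d+(?:\.\d+)?)\s*$"
)

def parse_modes(log_text):
    """Return (mode, affinity, rmsd_lb, rmsd_ub) tuples found in the log text."""
    modes = []
    for line in log_text.splitlines():
        match = ROW.match(line)
        if match:
            modes.append((
                int(match.group(1)),
                float(match.group(2)),
                float(match.group(3)),
                float(match.group(4)),
            ))
    return modes

def favourable(modes, max_affinity=0.0):
    """Keep only modes whose affinity is below max_affinity (kcal/mol)."""
    return [row for row in modes if row[1] < max_affinity]

# A few rows copied from the output above, used here instead of the log file
sample = """\
mode |   affinity | dist from best mode
     | (kcal/mol) | rmsd l.b.| rmsd u.b.
-----+------------+----------+----------
   1       -16.34          0          0
   2       -13.95      2.963      6.039
  23        314.2       9.07      11.18
Batch 1 running time: 3696ms
"""

modes = parse_modes(sample)
kept = favourable(modes)
print(len(modes), "modes parsed,", len(kept), "with negative affinity")
```

To apply this to the actual run, pass `open('sample_trival_bias_rmsd2.log').read()` instead of `sample`.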

Next, view the new docking result, i.e. the pose Uni-Dock produces once the bias has been applied.

[21]
# Input files: receptor, Uni-Dock output pose, and ground-truth ligand
pdb_path = os.path.join(data_path, 'casf2016', pdb_id+'_protein.pdb')
ligand_path = "./3bgz_ligand_out.sdf"
gt_ligand_path = os.path.join(data_path,'casf2016',pdb_id+'_ligand.sdf')

view = py3Dmol.view()
view.removeAllModels()

# First scene: receptor (cartoon + surface) with the Uni-Dock pose in green
view.addModel(open(pdb_path,'r').read(),format='pdb')
view.setStyle({'cartoon': {'arrows':True, 'tubes':False, 'style':'oval', 'color':'white'}})
view.addSurface(py3Dmol.VDW,{'opacity':0.5,'color':'white'})

view.addModel(open(ligand_path,'r').read(),format='sdf')
ref_m = view.getModel()
ref_m.setStyle({},{'stick':{'colorscheme':'greenCarbon','radius':0.2}})

view.zoomTo(viewer=(100,0))
view.show()

# Second scene: ligands only, overlaid for comparison
view.removeAllModels()

# Uni-Dock pose (green)
view.addModel(open(ligand_path,'r').read(),format='sdf')
ref_m = view.getModel()
ref_m.setStyle({},{'stick':{'colorscheme':'greenCarbon','radius':0.2}})

# Ground-truth ligand (red)
view.addModel(open(gt_ligand_path,'r').read(),format='sdf')
ref_m = view.getModel()
ref_m.setStyle({},{'stick':{'colorscheme':'redCarbon','radius':0.2}})

# Uni-Mol pose (purple)
unimol_ligand_path = os.path.join(data_path, "unimolresults","docking.{}.{}.sdf".format(pdb_id,0))
view.addModel(open(unimol_ligand_path,'r').read(),format='sdf')
ref_m = view.getModel()
ref_m.setStyle({},{'stick':{'colorscheme':'purpleCarbon','radius':0.2}})

view.zoomTo(viewer=(100,0))
view.show()

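In the figure:

  1. The purple molecule is the pose generated directly by Uni-Mol.
  2. The green molecule is the pose generated by Uni-Dock.
  3. The red molecule is the ground-truth ligand.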
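
The visual comparison can be backed by a number. Below is a minimal sketch of an in-place heavy-atom RMSD (no superposition) between two poses, under stated assumptions: both files are V2000 SDF, the first record is the pose of interest, and both list the same atoms in the same order. `read_heavy_atom_coords` and `rmsd` are hypothetical helpers written for this illustration. They ignore molecular symmetry, so a symmetry-aware tool is the safer choice for real evaluation. The demo builds two tiny synthetic mol blocks rather than reading the tutorial files.

```python
import math

def read_heavy_atom_coords(sdf_text):
    """Return [(x, y, z), ...] for non-hydrogen atoms of the first V2000 record."""
    lines = sdf_text.splitlines()
    n_atoms = int(lines[3][0:3])  # counts line: first three columns hold the atom count
    coords = []
    for line in lines[4:4 + n_atoms]:
        symbol = line[31:34].strip()  # element symbol sits in columns 32-34
        if symbol != "H":
            coords.append((float(line[0:10]), float(line[10:20]), float(line[20:30])))
    return coords

def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally ordered coordinate lists."""
    if not coords_a or len(coords_a) != len(coords_b):
        raise ValueError("poses must contain the same, non-zero number of atoms")
    total = sum(
        (a - b) ** 2
        for point_a, point_b in zip(coords_a, coords_b)
        for a, b in zip(point_a, point_b)
    )
    return math.sqrt(total / len(coords_a))

def _molblock(atoms):
    """Build a synthetic V2000 record (demo data only, not a real ligand)."""
    header = ["demo", "  synthetic", "",
              f"{len(atoms):3d}  0  0  0  0  0  0  0  0  0999 V2000"]
    body = [f"{x:10.4f}{y:10.4f}{z:10.4f} {sym:<3} 0  0  0  0  0  0  0  0  0  0  0  0"
            for x, y, z, sym in atoms]
    return "\n".join(header + body + ["M  END", "$$$$"])

# Second pose is the first one shifted by 2 along z
reference = _molblock([(0.0, 0.0, 0.0, "C"), (1.5, 0.0, 0.0, "O"), (0.0, 1.0, 0.0, "H")])
docked = _molblock([(0.0, 0.0, 2.0, "C"), (1.5, 0.0, 2.0, "O"), (0.0, 1.0, 2.0, "H")])

ref_xyz = read_heavy_atom_coords(reference)
dock_xyz = read_heavy_atom_coords(docked)
value = rmsd(ref_xyz, dock_xyz)
print(f"heavy-atom RMSD: {value:.3f}")
```

With the tutorial files, the same two functions could be applied to the text of `ligand_path` and `gt_ligand_path`, provided the atom-order assumption holds for them.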
Comment
  ## Introduction **Molecular docking** (Qua...

songk@dp.tech

2023-09-08
The QSAR text copied into this passage was not completely removed.
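Comment
 ### docking #### ###...

Weiliang Luo

2023-07-31
From this point on, an error is raised saying that 3bgz_protein.pdb cannot be found.

九歌

2023-09-02
After running this part, an error is raised saying that the molecule has no atoms.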

唐浩程

11-13 09:42
ValueError: molecule has no atoms

zhangpl@stu.pku.edu.cn

01-30 08:35
submiter = widgets.Button(     description='选择',     button_style='success',     layout=Layout(width='36%', height='30px'),

zhangpl@stu.pku.edu.cn

01-30 08:36
After this code block there is a "选择" (Select) button. Click it, then run the next code block, and see whether that resolves the error.