Bohrium
robot
新建

空间站广场

论文
Notebooks
比赛
课程
Apps
我的主页
我的Notebooks
我的论文库
我的足迹

我的工作空间

任务
节点
文件
数据集
镜像
项目
数据库
公开
化學信息學 | Torsion Tree與統一的Docking Pipeline流程介紹
化学信息学
RDKit
Uni-Dock
watvina
Virtual Screening
化学信息学RDKitUni-DockwatvinaVirtual Screening
mick3101
发布于 2023-08-01
推荐镜像 :drug-design:2023_08_01_2
推荐机型 :c16_m32_cpu
赞 2
Cheminformatics | Torsion Tree & Docking Protocol
Goals
Torsion Tree Introduction
Torsion Tree Algorithm
Imports
Atom Types
Rotatable Bonds
Atom Info
general cases
covalent docking treatment
template docking treatment
Docking Protocol
Imports
未完待續.....後續一般包含聚類與相互作用分析(IFP)等功能,完善虛篩流程。

Cheminformatics | Torsion Tree & Docking Protocol

日期:2023-07-31

共享协议:本作品采用知识共享署名-非商业性使用-相同方式共享 4.0 国际许可协议进行许可。

🏃🏻 快速开始
您可以直接在 Bohrium Notebook 上执行此文档。首先,请点击位于界面顶部的 开始连接 按钮,然后选择 drug-design:2023_08_01_2镜像,如docking_engine使用Uni-Dock則需使用GPU節點。

📖 作者
林泓叡 📨

📖 備註
本 Notebook 將持續優化,當中部分代碼尚未開源,將來開源後調用方式會稍有不同,但保持大致框架。另外功能部分也將持續提升,部份內部接口也將做調整。

代码
文本

Goals

代码
文本

本Notebook主要介紹分子對接中Torsion tree的一種算法實現,以及針對應用場景的前後處理算法與流程,最後完善為一整套整合的docking pipeline

代码
文本

Torsion Tree Introduction

代码
文本

在CADD領域中,分子對接是一項廣泛被使用的用來確定分子與靶點結合構像(binding pose)的方法。除去商業軟件不談,在眾多開源的docking軟件中,AutoDock系列工具屬於被使用最多。Scripps為AutoDock系列軟件定義了一種小分子構像採樣專用的topology的描述形式,稱為torsion tree,在AutoDock4, ADFR, AutoDock-GPU, Vina等系列工具中都被使用,以及後來的gnina, watvina, Uni-Dock都沿用了這樣一種小分子topology的表達形式。

一個小分子torsion tree的範例如下圖所示:

image.png

代码
文本

在docking採樣的算法中,我們關心的是小分子的平移、轉動,與各個可旋轉的二面角。因此torsion tree的設定中,便以此種方式,將小分子根據可旋轉鍵切分成若干個fragments,並根據小分子整體的平移,旋轉,與二面角的採樣來決定小分子的整體構型。上圖左是一個範例的小分子,右圖是他對應的torsion tree結構,可以看出,這個分子根據可旋轉鍵,被切分成了10個fragments,並且在我們的算法中,將最大的一個fragment稱為root,root是整個torsion tree的起點,往下根據可旋轉鍵連接,建立了各個分支,遍歷每個fragments直到構建完整個torsion tree。 而在constraint docking與covalent docking的場景中,root具有特殊的意義,根據規定,root fragment應該是被固定不動的部分,並且不參與對接流程的採樣。

建立於torsion tree之上,docking的採樣算法思路即是考慮了分子的平移、旋轉、各個可旋轉鍵等自由度,根據打分函數的指引對這些自由度進行採樣,由全局搜索加上一個局部結構優化所組成,(不同軟件具體算法會有所不同),如下圖所示:

(取自ADFR文獻: https://doi.org/10.1371/journal.pcbi.1004586)

image.png

代码
文本

長久以來,有許多開源的小分子docking inputs準備工具,可以用來生成torsion tree (PDBQT文件),例如scripps自己開發的MGLTools, openbabel, meeko等,但這些公開的算法都牽涉到很大一部份猜鍵級的過程,一般這些算法根據小分子3D結構的座標以及構型輔以一些規則,來判斷鍵級。而這過程是很容易出錯的,對於較為複雜的官能團,這樣的判斷往往容易與真實的鍵級有出入。因此本notebook介紹一種完全依賴來自小分子sdf文件紀錄的鍵級信息(透過rdkit與networkx實現)來完成torsion tree,也就是PDBQT與unidock-style sdf的輸出的算法實現。

代码
文本

Torsion Tree Algorithm

代码
文本

Imports

代码
文本
[1]
import os
import re
from copy import deepcopy

import numpy as np
import networkx as nx

from rdkit import Chem
from rdkit.Chem import ChemicalForceFields, GetMolFrags, FragmentOnBonds
from rdkit.Chem.rdMolAlign import AlignMol
from rdkit.Chem import rdFMCS
from rdkit.Chem.rdPartialCharges import ComputeGasteigerCharges
代码
文本

首先定義AutoDock torsion tree裡需要使用的小分子atom types(根據不同軟件的打分函數而有所不同),以及rotatable bonds匹配規則。我們這裡根據非常簡單的SMARTS規則來定義,(便於擴展至複雜官能團),而不是10多年前傳統的手動羅列各種官能團建立字典的形式:

代码
文本

Atom Types

代码
文本
[2]
ATOM_TYPE_DEFINITION_LIST = [{'smarts': '[#1]', 'atype': 'H', 'comment': 'invisible'},
{'smarts': '[#1][#7,#8,#9,#15,#16]', 'atype': 'HD', 'comment': None},
{'smarts': '[#5]', 'atype': 'B', 'comment': None},
{'smarts': '[C]', 'atype': 'C', 'comment': None},
{'smarts': '[c]', 'atype': 'A', 'comment': None},
{'smarts': '[#7]', 'atype': 'NA', 'comment': None},
{'smarts': '[#8]', 'atype': 'OA', 'comment': None},
{'smarts': '[#9]', 'atype': 'F', 'comment': None},
{'smarts': '[#12]', 'atype': 'Mg', 'comment': None},
{'smarts': '[#14]', 'atype': 'Si', 'comment': None},
{'smarts': '[#15]', 'atype': 'P', 'comment': None},
{'smarts': '[#16]', 'atype': 'S', 'comment': None},
{'smarts': '[#17]', 'atype': 'Cl', 'comment': None},
{'smarts': '[#20]', 'atype': 'Ca', 'comment': None},
{'smarts': '[#25]', 'atype': 'Mn', 'comment': None},
{'smarts': '[#26]', 'atype': 'Fe', 'comment': None},
{'smarts': '[#30]', 'atype': 'Zn', 'comment': None},
{'smarts': '[#35]', 'atype': 'Br', 'comment': None},
{'smarts': '[#53]', 'atype': 'I', 'comment': None},
{'smarts': '[#7X3v3][a]', 'atype': 'N', 'comment': 'pyrrole, aniline'},
{'smarts': '[#7X3v3][#6X3v4]', "atype": 'N', 'comment': 'amide'},
{'smarts': '[#7+1]', 'atype': 'N', 'comment': 'ammonium, pyridinium'},
{'smarts': '[SX2]', 'atype': 'SA', 'comment': 'sulfur acceptor'}]

class AtomType(object):
def __init__(self):
self.atom_type_definition_list = ATOM_TYPE_DEFINITION_LIST

def assign_atom_types(self, mol):
for atom_type_dict in self.atom_type_definition_list:
smarts = atom_type_dict['smarts']
atom_type = atom_type_dict['atype']

pattern_mol = Chem.MolFromSmarts(smarts)
pattern_matches = mol.GetSubstructMatches(pattern_mol)
for pattern_match in pattern_matches:
atom = mol.GetAtomWithIdx(pattern_match[0])
atom.SetProp('atom_type', atom_type)
代码
文本

Rotatable Bonds

代码
文本

旋轉鍵部份,SMARTS pattern採用Lipinski規則:

其中也保留了後續對於共軛,與大環採樣實現的參數。

代码
文本
[3]
class RotatableBond(object):
def __init__(self,
min_macrocycle_size=7,
max_macrocycle_size=33,
double_bond_penalty=50,
max_breaks=4):

self.rotatable_bond_smarts = '[!$(*#*)&!D1]-&!@[!$(*#*)&!D1]'
self.conjugate_bond_smarts = '*=*[*]=,#,:[*]'
self.rotatable_bond_pattern = Chem.MolFromSmarts(self.rotatable_bond_smarts)
self.conjugate_bond_pattern = Chem.MolFromSmarts(self.conjugate_bond_smarts)

self.min_macrocycle_size = min_macrocycle_size
self.max_macrocycle_size = max_macrocycle_size
self.double_bond_penalty = double_bond_penalty
self.max_breaks = max_breaks

def identify_rotatable_bonds(self, mol):
default_rotatable_bond_info_list = list(mol.GetSubstructMatches(self.rotatable_bond_pattern))
return default_rotatable_bond_info_list
代码
文本

而每個已定義的atom types打分函數則來自於各自軟件的規範,例如AutoDock4會將力場的參數值寫進 AD4.1_bound.dat 文件中:

代码
文本
[4]
# $Id: AD4.1_bound.dat,v 1.1 2016/02/06 00:10:27 pradeep Exp $
#
# AutoDock
#
# Copyright (C) 1989-2007, Garrett M. Morris, David S. Goodsell, Ruth Huey, Arthur J. Olson,
# All Rights Reserved.
#
# AutoDock is a Trade Mark of The Scripps Research Institute.
#
# This program is free software; you can redistribute it and/or
# modify it under the terms of the GNU General Public License
# as published by the Free Software Foundation; either version 2
# of the License, or (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software
# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.

# AutoDock Linear Free Energy Model Coefficients and Energetic Parameters
# Version 4.1 Bound
# $Revision: 1.4 $

# FE_unbound_model is used to specify how the internal energy of the
# ligand should be treated when estimating the free energy of binding,
# and can be set to one of the following strings:
# unbound_same_as_bound, extended, or compact
# unbound_same_as_bound -- this assumes the internal energy of the ligand is the
# same before and after binding.
# extended -- this assumes the internal energy of the ligand is that of an
# extended conformation when unbound.
# compact -- this assumes the internal energy of the ligand is that of a
# compact conformation when unbound.
#FE_unbound_model unbound_same_as_bound

# AutoDock 4 free energy coefficients with respect to original (AD2) energetic parameters
# This model assumes that the bound and unbound conformations are the same.
# See Table 3 in Huey,Morris,Olson&Goodsell (2007) J Comput Chem 28: 1145-1152.
#
# Free Energy Coefficient
# ------
FE_coeff_vdW 0.1662
FE_coeff_hbond 0.1209
FE_coeff_estat 0.1406
FE_coeff_desolv 0.1322
FE_coeff_tors 0.2983

# AutoDock 4 Energy Parameters

# - Atomic solvation volumes and parameters
# - Unweighted vdW and Unweighted H-bond Well Depths
#
# - Atom Types
# - Rii = sum of vdW radii of two like atoms (in Angstrom)
# - epsii = vdW well depth (in Kcal/mol)
# - vol = atomic solvation volume (in Angstrom^3)
# - solpar = atomic solvation parameter
# - Rij_hb = H-bond radius of the heteroatom in contact with a hydrogen (in Angstrom)
# - epsij_hb = well depth of H-bond (in Kcal/mol)
# - hbond = integer indicating type of H-bonding atom (0=no H-bond)
# - rec_index = initialised to -1, but later on holds count of how many of this atom type are in receptor
# - map_index = initialised to -1, but later on holds the index of the AutoGrid map
# - bond_index = used in AutoDock to detect bonds; see "mdist.h", enum {C,N,O,H,XX,P,S}
#
# - To obtain the Rij value for non H-bonding atoms, calculate the
# arithmetic mean of the Rii values for the two atom types.
# Rij = (Rii + Rjj) / 2
#
# - To obtain the epsij value for non H-bonding atoms, calculate the
# geometric mean of the epsii values for the two atom types.
# epsij = sqrt( epsii * epsjj )
#
# - Note that the Rij_hb value is non-zero for heteroatoms only, and zero for H atoms;
# to obtain the length of an H-bond, look up Rij_hb for the heteroatom only;
# this is combined with the Rii value for H in the receptor, in AutoGrid.
# For example, the Rij_hb for OA-HD H-bonds will be (1.9 + 1.0) Angstrom,
# and the weighted epsij_hb will be 5.0 kcal/mol * FE_coeff_hbond.
#
# Atom Rii Rij_hb rec_index
# Type epsii solpar epsij_hb map_index
# vol hbond bond_index
# -- ---- ----- ------- -------- --- --- - -- -- --
atom_par H 2.00 0.020 0.0000 0.00051 0.0 0.0 0 -1 -1 3 # Non H-bonding Hydrogen
atom_par HD 2.00 0.020 0.0000 0.00051 0.0 0.0 2 -1 -1 3 # Donor 1 H-bond Hydrogen
atom_par HS 2.00 0.020 0.0000 0.00051 0.0 0.0 1 -1 -1 3 # Donor S Spherical Hydrogen
atom_par C 4.00 0.150 33.5103 -0.00143 0.0 0.0 0 -1 -1 0 # Non H-bonding Aliphatic Carbon
atom_par A 4.00 0.150 33.5103 -0.00052 0.0 0.0 0 -1 -1 0 # Non H-bonding Aromatic Carbon
atom_par N 3.50 0.160 22.4493 -0.00162 0.0 0.0 0 -1 -1 1 # Non H-bonding Nitrogen
atom_par NA 3.50 0.160 22.4493 -0.00162 1.9 5.0 4 -1 -1 1 # Acceptor 1 H-bond Nitrogen
atom_par NS 3.50 0.160 22.4493 -0.00162 1.9 5.0 3 -1 -1 1 # Acceptor S Spherical Nitrogen
atom_par OA 3.20 0.200 17.1573 -0.00251 1.9 5.0 5 -1 -1 2 # Acceptor 2 H-bonds Oxygen
atom_par OS 3.20 0.200 17.1573 -0.00251 1.9 5.0 3 -1 -1 2 # Acceptor S Spherical Oxygen
atom_par F 3.09 0.080 15.4480 -0.00110 0.0 0.0 0 -1 -1 4 # Non H-bonding Fluorine
atom_par Mg 1.30 0.875 1.5600 -0.00110 0.0 0.0 0 -1 -1 4 # Non H-bonding Magnesium
atom_par MG 1.30 0.875 1.5600 -0.00110 0.0 0.0 0 -1 -1 4 # Non H-bonding Magnesium
atom_par P 4.20 0.200 38.7924 -0.00110 0.0 0.0 0 -1 -1 5 # Non H-bonding Phosphorus
atom_par SA 4.00 0.200 33.5103 -0.00214 2.5 1.0 5 -1 -1 6 # Acceptor 2 H-bonds Sulphur
atom_par S 4.00 0.200 33.5103 -0.00214 0.0 0.0 0 -1 -1 6 # Non H-bonding Sulphur
atom_par Se 4.00 0.200 33.5103 -0.00214 2.5 1.0 5 -1 -1 6 # Acceptor 2 H-bonds Sulphur
atom_par Cl 4.09 0.276 35.8235 -0.00110 0.0 0.0 0 -1 -1 4 # Non H-bonding Chlorine
atom_par CL 4.09 0.276 35.8235 -0.00110 0.0 0.0 0 -1 -1 4 # Non H-bonding Chlorine
atom_par Ca 1.98 0.550 2.7700 -0.00110 0.0 0.0 0 -1 -1 4 # Non H-bonding Calcium
atom_par CA 1.98 0.550 2.7700 -0.00110 0.0 0.0 0 -1 -1 4 # Non H-bonding Calcium
atom_par Mn 1.30 0.875 2.1400 -0.00110 0.0 0.0 0 -1 -1 4 # Non H-bonding Manganese
atom_par MN 1.30 0.875 2.1400 -0.00110 0.0 0.0 0 -1 -1 4 # Non H-bonding Manganese
atom_par Fe 1.30 0.010 1.8400 -0.00110 0.0 0.0 0 -1 -1 4 # Non H-bonding Iron
atom_par FE 1.30 0.010 1.8400 -0.00110 0.0 0.0 0 -1 -1 4 # Non H-bonding Iron
atom_par Zn 1.48 0.550 1.7000 -0.00110 0.0 0.0 0 -1 -1 4 # Non H-bonding Zinc
atom_par ZN 1.48 0.550 1.7000 -0.00110 0.0 0.0 0 -1 -1 4 # Non H-bonding Zinc
atom_par Br 4.33 0.389 42.5661 -0.00110 0.0 0.0 0 -1 -1 4 # Non H-bonding Bromine
atom_par BR 4.33 0.389 42.5661 -0.00110 0.0 0.0 0 -1 -1 4 # Non H-bonding Bromine
atom_par I 4.72 0.550 55.0585 -0.00110 0.0 0.0 0 -1 -1 4 # Non H-bonding Iodine
atom_par Z 4.00 0.150 33.5103 -0.00143 0.0 0.0 0 -1 -1 0 # Non H-bonding covalent map
atom_par G 4.00 0.150 33.5103 -0.00143 0.0 0.0 0 -1 -1 0 # Ring closure Glue Aliphatic Carbon # SF
atom_par GA 4.00 0.150 33.5103 -0.00052 0.0 0.0 0 -1 -1 0 # Ring closure Glue Aromatic Carbon # SF
atom_par J 4.00 0.150 33.5103 -0.00143 0.0 0.0 0 -1 -1 0 # Ring closure Glue Aliphatic Carbon # SF
atom_par Q 4.00 0.150 33.5103 -0.00143 0.0 0.0 0 -1 -1 0 # Ring closure Glue Aliphatic Carbon # SF
atom_par TZ 1.00 0.000 0.0000 0.00000 0.0 0.0 0 -1 -1 0 # Tetrahedral Zinc Pseudo Atom
  Cell In[4], line 46
    FE_coeff_vdW    0.1662
                    ^
SyntaxError: invalid syntax
代码
文本

而vina系列則將打分函數直接寫進了內部程序,執行對接前自動計算docking grids的值。

代码
文本

Atom Info

代码
文本

這裡我們為了能夠在torsion tree上建立每個原子的所需信息,在這裡先設定每個atom的property,以及針對covalent docking與template docking的特殊處理:

代码
文本

general cases

代码
文本
[4]
def assign_atom_properties(mol):
atom_positions = mol.GetConformer().GetPositions()
num_atoms = mol.GetNumAtoms()

internal_atom_idx = 0
for atom_idx in range(num_atoms):
atom = mol.GetAtomWithIdx(atom_idx)
atom.SetIntProp('sdf_atom_idx', atom_idx+1)
if not atom.HasProp('atom_name'):
atom_element = atom.GetSymbol()
atom_name = atom_element + str(internal_atom_idx+1)
atom.SetProp('atom_name', atom_name)
atom.SetProp('residue_name', 'MOL')
atom.SetIntProp('residue_idx', 1)
atom.SetProp('chain_idx', 'A')
internal_atom_idx += 1

atom.SetDoubleProp('charge', atom.GetDoubleProp('_GasteigerCharge'))
atom.SetDoubleProp('x', atom_positions[atom_idx, 0])
atom.SetDoubleProp('y', atom_positions[atom_idx, 1])
atom.SetDoubleProp('z', atom_positions[atom_idx, 2])
代码
文本

covalent docking treatment

代码
文本

此部份有非常多非常規的rdkit操作,主要是用於構建covalent ligand,並盡可能在各種情況下保證算法的安全性。

算法包含構建不包含共價彈頭的sub mol(比rdkit自帶的操作更加安全的作法),以及設定covalent ligand的原子信息與座標。

代码
文本
[5]
def get_mol_without_indices(mol_input,
remove_indices=[],
keep_properties=[],
keep_mol_properties=[]):

mol_property_dict = {}
for mol_property_name in keep_mol_properties:
mol_property_dict[mol_property_name] = mol_input.GetProp(mol_property_name)

atom_list, bond_list, idx_map = [], [], {} # idx_map: {old: new}

for atom in mol_input.GetAtoms():

props = {}
for property_name in keep_properties:
if property_name in atom.GetPropsAsDict():
props[property_name] = atom.GetPropsAsDict()[property_name]

symbol = atom.GetSymbol()

if symbol.startswith('*'):
atom_symbol = '*'
props['molAtomMapNumber'] = atom.GetAtomMapNum()

elif symbol.startswith('R'):
atom_symbol = '*'
if len(symbol) > 1:
atom_map_num = int(symbol[1:])
else:
atom_map_num = atom.GetAtomMapNum()
props['molAtomMapNumber'] = atom_map_num
props['dummyLabel'] = 'R' + str(atom_map_num)
props['_MolFileRLabel'] = str(atom_map_num)

else:
atom_symbol = symbol

atom_list.append(
(
atom_symbol,
atom.GetChiralTag(),
atom.GetFormalCharge(),
atom.GetNumExplicitHs(),
props
)
)

for bond in mol_input.GetBonds():
bond_list.append(
(
bond.GetBeginAtomIdx(),
bond.GetEndAtomIdx(),
bond.GetBondType()
)
)

mol = Chem.RWMol(Chem.Mol())

new_idx = 0
for atom_index, atom_info in enumerate(atom_list):
if atom_index not in remove_indices:
atom = Chem.Atom(atom_info[0])
atom.SetChiralTag(atom_info[1])
atom.SetFormalCharge(atom_info[2])
atom.SetNumExplicitHs(atom_info[3])

for property_name in atom_info[4]:
if isinstance(atom_info[4][property_name], str):
atom.SetProp(property_name, atom_info[4][property_name])
elif isinstance(atom_info[4][property_name], int):
atom.SetIntProp(property_name, atom_info[4][property_name])

mol.AddAtom(atom)
idx_map[atom_index] = new_idx
new_idx += 1

for bond_info in bond_list:
if (
bond_info[0] not in remove_indices
and bond_info[1] not in remove_indices
):
mol.AddBond(
idx_map[bond_info[0]],
idx_map[bond_info[1]],
bond_info[2]
)

else:
one_in = False
if (
(bond_info[0] in remove_indices)
and (bond_info[1] not in remove_indices)
):
keep_index = bond_info[1]
remove_index = bond_info[0]
one_in = True
elif (
(bond_info[1] in remove_indices)
and (bond_info[0] not in remove_indices)
):
keep_index = bond_info[0]
remove_index = bond_info[1]
one_in = True
if one_in:
if atom_list[keep_index][0] in ['N', 'P']:
old_num_explicit_Hs = mol.GetAtomWithIdx(
idx_map[keep_index]
).GetNumExplicitHs()

mol.GetAtomWithIdx(idx_map[keep_index]).SetNumExplicitHs(
old_num_explicit_Hs + 1
)

mol = Chem.Mol(mol)

for mol_property_name in mol_property_dict:
mol.SetProp(mol_property_name, mol_property_dict[mol_property_name])

Chem.GetSymmSSSR(mol)
mol.UpdatePropertyCache(strict=False)
return mol
代码
文本
[6]
def prepare_covalent_ligand_mol(mol):
covalent_atom_idx_string = mol.GetProp('covalent_atom_indices')
covalent_atom_idx_string_list = covalent_atom_idx_string.split(',')
covalent_atom_idx_list = [int(covalent_atom_idx_string) for covalent_atom_idx_string in covalent_atom_idx_string_list]

covalent_atom_name_string = mol.GetProp('covalent_atom_names')
covalent_atom_name_list = covalent_atom_name_string.split(',')

covalent_residue_name_string = mol.GetProp('covalent_residue_names')
covalent_residue_name_list = covalent_residue_name_string.split(',')

covalent_residue_idx_string = mol.GetProp('covalent_residue_indices')
covalent_residue_idx_string_list = covalent_residue_idx_string.split(',')
covalent_residue_idx_list = [int(covalent_residue_idx_string) for covalent_residue_idx_string in covalent_residue_idx_string_list]

covalent_chain_idx_string = mol.GetProp('covalent_chain_indices')
covalent_chain_idx_list = covalent_chain_idx_string.split(',')

covalent_anchor_atom_info = (covalent_chain_idx_list[0], covalent_residue_name_list[0], covalent_residue_idx_list[0], covalent_atom_name_list[0])

removed_atom_idx_list = []

num_atoms = mol.GetNumAtoms()
internal_atom_idx = 0
atom_coords_dict = {}
conformer = mol.GetConformer()
for atom_idx in range(num_atoms):
atom = mol.GetAtomWithIdx(atom_idx)
if atom.GetAtomicNum() == 0:
atom.SetProp('atom_name', 'None')
removed_atom_idx_list.append(atom_idx)
else:
if atom_idx in covalent_atom_idx_list:
atom_name = covalent_atom_name_list[covalent_atom_idx_list.index(atom_idx)]
residue_name = covalent_residue_name_list[covalent_atom_idx_list.index(atom_idx)]
residue_idx = covalent_residue_idx_list[covalent_atom_idx_list.index(atom_idx)]
chain_idx = covalent_chain_idx_list[covalent_atom_idx_list.index(atom_idx)]
atom.SetProp('atom_name', atom_name)
atom.SetProp('residue_name', residue_name)
atom.SetIntProp('residue_idx', residue_idx)
atom.SetProp('chain_idx', chain_idx)
else:
atom_element = atom.GetSymbol()
atom_name = atom_element + str(internal_atom_idx+1)
atom.SetProp('atom_name', atom_name)
atom.SetProp('residue_name', 'MOL')
atom.SetIntProp('residue_idx', 1)
atom.SetProp('chain_idx', 'A')
internal_atom_idx += 1

atom_coords_point_3D = deepcopy(conformer.GetAtomPosition(atom_idx))
atom_coords_dict[atom_name] = atom_coords_point_3D

covalent_mol = get_mol_without_indices(mol, remove_indices=removed_atom_idx_list, keep_properties=['atom_name', 'residue_name', 'residue_idx', 'chain_idx'])
num_covalent_atoms = covalent_mol.GetNumAtoms()
covalent_conformer = Chem.Conformer(num_covalent_atoms)

for atom_idx in range(num_covalent_atoms):
atom = covalent_mol.GetAtomWithIdx(atom_idx)
atom_name = atom.GetProp('atom_name')
atom_coords_point_3D = atom_coords_dict[atom_name]
covalent_conformer.SetAtomPosition(atom_idx, atom_coords_point_3D)

_ = covalent_mol.AddConformer(covalent_conformer)
_ = Chem.SanitizeMol(covalent_mol)

return covalent_mol, covalent_anchor_atom_info
代码
文本

template docking treatment

代码
文本

core constraint docking部份,操作主要包含: 尋找參考分子與query小分子的最大公有子結構 (MCS),將query小分子構像根據core atom mapping,align之後進行簡單的經驗力場優化。

代码
文本
[7]
def get_template_docking_atom_mapping(reference_mol, query_mol):
mcs = rdFMCS.FindMCS([query_mol, reference_mol],
ringMatchesRingOnly=True,
completeRingsOnly=True,
atomCompare=rdFMCS.AtomCompare.CompareAnyHeavyAtom,
bondCompare=rdFMCS.BondCompare.CompareOrderExact,
timeout=60)

mcs_string = mcs.smartsString.replace('#0', '*')
generic_core_smarts_string = re.sub('\[\*.*?\]', '[*]', mcs_string)

generic_core_mol = Chem.MolFromSmarts(generic_core_smarts_string)

reference_atom_mapping = reference_mol.GetSubstructMatches(generic_core_mol)[0]
query_atom_mapping = query_mol.GetSubstructMatches(generic_core_mol)[0]

core_atom_mapping_dict = {reference_atom_idx: query_atom_idx for reference_atom_idx, query_atom_idx in zip(reference_atom_mapping, query_atom_mapping)}

return core_atom_mapping_dict
代码
文本
[8]
def get_core_alignment_for_template_docking(reference_mol, query_mol):
core_atom_mapping_dict = get_template_docking_atom_mapping(reference_mol, query_mol, atom_mapping_scheme=0)

core_atom_mapping_dict = {query_atom_idx: reference_atom_idx for reference_atom_idx, query_atom_idx in core_atom_mapping_dict.items()}
# the initial position of query_mol is random, so align to the reference_mol firstly
_ = AlignMol(query_mol, reference_mol, atomMap=list(core_atom_mapping_dict.items()))

# assign template positions from reference mol to query mol
core_fixed_query_conformer = Chem.Conformer(query_mol.GetNumAtoms())
reference_conformer = reference_mol.GetConformer()
query_conformer = query_mol.GetConformer()

for query_atom_idx in range(query_mol.GetNumAtoms()):
if query_atom_idx in core_atom_mapping_dict:
reference_atom_idx = core_atom_mapping_dict[query_atom_idx]
atom_position = reference_conformer.GetAtomPosition(reference_atom_idx)
core_fixed_query_conformer.SetAtomPosition(query_atom_idx, atom_position)
else:
atom_position = query_conformer.GetAtomPosition(query_atom_idx)
core_fixed_query_conformer.SetAtomPosition(query_atom_idx, atom_position)

query_mol.RemoveAllConformers()
query_mol.AddConformer(core_fixed_query_conformer)

# optimize conformer using chemical forcefield
ff_property = ChemicalForceFields.MMFFGetMoleculeProperties(query_mol, 'MMFF94s')
ff = ChemicalForceFields.MMFFGetMoleculeForceField(query_mol, ff_property, confId=0)

for query_atom_idx in core_atom_mapping_dict.keys():
reference_atom_idx = core_atom_mapping_dict[query_atom_idx]
core_atom_position = reference_conformer.GetAtomPosition(reference_atom_idx)
virtual_site_atom_idx = ff.AddExtraPoint(core_atom_position.x, core_atom_position.y, core_atom_position.z, fixed=True) - 1
ff.AddDistanceConstraint(virtual_site_atom_idx, query_atom_idx, 0, 0, 100.0)

ff.Initialize()

max_minimize_iteration = 5
for _ in range(max_minimize_iteration):
minimize_seed = ff.Minimize(energyTol=1e-4, forceTol=1e-3)
if minimize_seed == 0:
break

query_mol.SetProp('aligned_conformer_energy', str(ff.CalcEnergy()))

core_atom_idx_list = list(core_atom_mapping_dict.keys())

return core_atom_idx_list
代码
文本

介紹完了前置準備,接下來可以進入torsion tree的構建算法,這裡採用networkx的圖結構實現:

請注意,以下代碼僅僅展示一個實現範例,無法運行,具體調用請見下文。

代码
文本
[9]
def build_molecular_graph(self):
mol = Chem.SDMolSupplier(self.ligand_sdf_file_name, removeHs=False)[0]

if self.covalent_ligand:
mol, covalent_anchor_atom_info = utils.prepare_covalent_ligand_mol(mol)
else:
covalent_anchor_atom_info = None

if self.template_docking:
core_atom_idx_list = utils.get_core_alignment_for_template_docking(self.reference_mol, mol)
for core_atom_idx in core_atom_idx_list:
core_atom = mol.GetAtomWithIdx(core_atom_idx)
for neighbor_atom in core_atom.GetNeighbors():
if neighbor_atom.GetIdx() not in core_atom_idx_list:
core_atom_idx_list.remove(core_atom_idx)
break

else:
core_atom_idx_list = None

self.atom_typer.assign_atom_types(mol)
ComputeGasteigerCharges(mol)
utils.assign_atom_properties(mol)
rotatable_bond_info_list = self.rotatable_bond_finder.identify_rotatable_bonds(mol)

## Freeze bonds in covalent part to be unrotatable for covalent ligand case
##############################################################################
# if self.covalent_ligand:
# covalent_atom_idx_list = []
# num_atoms = mol.GetNumAtoms()
# for atom_idx in range(num_atoms):
# atom = mol.GetAtomWithIdx(atom_idx)
# if atom.GetProp('residue_name') != 'MOL':
# covalent_atom_idx_list.append(atom_idx)
#
# filtered_rotatable_bond_info_list = []
# for rotatable_bond_info in rotatable_bond_info_list:
# if rotatable_bond_info[0] in covalent_atom_idx_list and rotatable_bond_info[1] in covalent_atom_idx_list:
# continue
# else:
# filtered_rotatable_bond_info_list.append(rotatable_bond_info)
#
# rotatable_bond_info_list = filtered_rotatable_bond_info_list
##############################################################################

## Freeze bonds in core part to be unrotatable for template docking case
###############################################################################
if self.template_docking:
filtered_rotatable_bond_info_list = []
for rotatable_bond_info in rotatable_bond_info_list:
rotatable_begin_atom_idx = rotatable_bond_info[0]
rotatable_end_atom_idx = rotatable_bond_info[1]
if rotatable_begin_atom_idx in core_atom_idx_list and rotatable_end_atom_idx in core_atom_idx_list:
continue
else:
filtered_rotatable_bond_info_list.append(rotatable_bond_info)

rotatable_bond_info_list = filtered_rotatable_bond_info_list
###############################################################################

bond_list = list(mol.GetBonds())
rotatable_bond_idx_list = []
for bond in bond_list:
bond_info = (bond.GetBeginAtomIdx(), bond.GetEndAtomIdx())
bond_info_reversed = (bond.GetEndAtomIdx(), bond.GetBeginAtomIdx())
if bond_info in rotatable_bond_info_list or bond_info_reversed in rotatable_bond_info_list:
rotatable_bond_idx_list.append(bond.GetIdx())

splitted_mol = FragmentOnBonds(mol, rotatable_bond_idx_list, addDummies=False)
splitted_mol_list = list(GetMolFrags(splitted_mol, asMols=True, sanitizeFrags=False))

num_fragments = len(splitted_mol_list)

## Find fragment as the root node
##############################################################################
num_fragment_atoms_list = [None] * num_fragments
for fragment_idx in range(num_fragments):
fragment = splitted_mol_list[fragment_idx]
num_atoms = fragment.GetNumAtoms()
num_fragment_atoms_list[fragment_idx] = num_atoms

root_fragment_idx = None
if self.covalent_ligand:
for fragment_idx in range(num_fragments):
fragment = splitted_mol_list[fragment_idx]
for atom in fragment.GetAtoms():
atom_info = (atom.GetProp('chain_idx'), atom.GetProp('residue_name'), atom.GetIntProp('residue_idx'), atom.GetProp('atom_name'))
if atom_info == covalent_anchor_atom_info:
root_fragment_idx = fragment_idx
break

if root_fragment_idx is not None:
break

if root_fragment_idx is None:
raise ValueError('Bugs in root finding code for covalent docking!')

elif self.template_docking:
for fragment_idx in range(num_fragments):
fragment = splitted_mol_list[fragment_idx]
for atom in fragment.GetAtoms():
internal_atom_idx = int(re.split(r'(\d+)', atom.GetProp('atom_name'))[1]) - 1
if internal_atom_idx in core_atom_idx_list:
root_fragment_idx = fragment_idx
break

if root_fragment_idx is not None:
break

if root_fragment_idx is None:
raise ValueError('Bugs in root finding code for template docking!')

else:
root_fragment_idx = np.argmax(num_fragment_atoms_list)
##############################################################################

## Build torsion tree
### Add atom info into nodes
##############################################################################
torsion_tree = nx.Graph()
node_idx = 0
root_fragment = splitted_mol_list[root_fragment_idx]
num_root_atoms = root_fragment.GetNumAtoms()
atom_info_list = [None] * num_root_atoms

for root_atom_idx in range(num_root_atoms):
root_atom = root_fragment.GetAtomWithIdx(root_atom_idx)
atom_info_dict = {}
atom_info_dict['sdf_atom_idx'] = root_atom.GetIntProp('sdf_atom_idx')
atom_info_dict['atom_name'] = root_atom.GetProp('atom_name')
atom_info_dict['residue_name'] = root_atom.GetProp('residue_name')
atom_info_dict['chain_idx'] = root_atom.GetProp('chain_idx')
atom_info_dict['residue_idx'] = root_atom.GetIntProp('residue_idx')
atom_info_dict['x'] = root_atom.GetDoubleProp('x')
atom_info_dict['y'] = root_atom.GetDoubleProp('y')
atom_info_dict['z'] = root_atom.GetDoubleProp('z')
atom_info_dict['charge'] = root_atom.GetDoubleProp('charge')
atom_info_dict['atom_type'] = root_atom.GetProp('atom_type')

atom_info_list[root_atom_idx] = atom_info_dict

torsion_tree.add_node(node_idx, atom_info_list=atom_info_list)
node_idx += 1

for fragment_idx in range(num_fragments):
if fragment_idx == root_fragment_idx:
continue
else:
fragment = splitted_mol_list[fragment_idx]
num_fragment_atoms = fragment.GetNumAtoms()
atom_info_list = [None] * num_fragment_atoms

for atom_idx in range(num_fragment_atoms):
atom = fragment.GetAtomWithIdx(atom_idx)
atom_info_dict = {}
atom_info_dict['sdf_atom_idx'] = atom.GetIntProp('sdf_atom_idx')
atom_info_dict['atom_name'] = atom.GetProp('atom_name')
atom_info_dict['residue_name'] = atom.GetProp('residue_name')
atom_info_dict['chain_idx'] = atom.GetProp('chain_idx')
atom_info_dict['residue_idx'] = atom.GetIntProp('residue_idx')
atom_info_dict['x'] = atom.GetDoubleProp('x')
atom_info_dict['y'] = atom.GetDoubleProp('y')
atom_info_dict['z'] = atom.GetDoubleProp('z')
atom_info_dict['charge'] = atom.GetDoubleProp('charge')
atom_info_dict['atom_type'] = atom.GetProp('atom_type')

atom_info_list[atom_idx] = atom_info_dict

torsion_tree.add_node(node_idx, atom_info_list=atom_info_list)
node_idx += 1

##############################################################################

### Add edge info
##############################################################################
num_rotatable_bonds = len(rotatable_bond_info_list)
for edge_idx in range(num_rotatable_bonds):
rotatable_bond_info = rotatable_bond_info_list[edge_idx]
begin_atom_idx = rotatable_bond_info[0]
end_atom_idx = rotatable_bond_info[1]

begin_atom = mol.GetAtomWithIdx(begin_atom_idx)
begin_sdf_atom_idx = begin_atom.GetIntProp('sdf_atom_idx')
begin_atom_name = begin_atom.GetProp('atom_name')

end_atom = mol.GetAtomWithIdx(end_atom_idx)
end_sdf_atom_idx = end_atom.GetIntProp('sdf_atom_idx')
end_atom_name = end_atom.GetProp('atom_name')

begin_node_idx = None
end_node_idx = None
for node_idx in range(num_fragments):
atom_info_list = torsion_tree.nodes[node_idx]['atom_info_list']
for atom_info_dict in atom_info_list:
if atom_info_dict['atom_name'] == begin_atom_name:
begin_node_idx = node_idx
break
elif atom_info_dict['atom_name'] == end_atom_name:
end_node_idx = node_idx
break

if begin_node_idx is not None and end_node_idx is not None:
break

if begin_node_idx is None or end_node_idx is None:
raise ValueError('Bugs in edge assignment code!!')

torsion_tree.add_edge(begin_node_idx,
end_node_idx,
begin_node_idx=begin_node_idx,
end_node_idx=end_node_idx,
begin_sdf_atom_idx=begin_sdf_atom_idx,
end_sdf_atom_idx=end_sdf_atom_idx,
begin_atom_name=begin_atom_name,
end_atom_name=end_atom_name)

##############################################################################

self.torsion_tree = torsion_tree
self.mol = mol
代码
文本

完成了基於nx graph的torsion tree搭建之後,我們便可以著手來進行PDBQT格式的輸出。PDBQT格式是一種基於torsion tree,將每個fragments透過深度優先的搜索方式遍歷列出的一種定義,並且在此之外,每個原子紀錄的欄位還包含了對應的charge(Q)與atom type(T)等信息。 具體實現如下:

代码
文本

首先定義一個DFS的遞歸算法:

代码
文本
[10]
def __deep_first_search__(self, node_idx):
if node_idx == 0:
self.pdbqt_atom_line_list.append('ROOT\n')
atom_info_list = self.torsion_tree.nodes[node_idx]['atom_info_list']
for atom_info_dict in atom_info_list:
atom_name = atom_info_dict['atom_name']
self.atom_idx_info_mapping_dict[atom_name] = self.pdbqt_atom_idx
atom_info_tuple = ('ATOM',
self.pdbqt_atom_idx,
atom_info_dict['atom_name'],
atom_info_dict['residue_name'],
atom_info_dict['chain_idx'],
atom_info_dict['residue_idx'],
atom_info_dict['x'],
atom_info_dict['y'],
atom_info_dict['z'],
1.0,
0.0,
atom_info_dict['charge'],
atom_info_dict['atom_type'])

self.pdbqt_atom_line_list.append(self.pdbqt_atom_line_format.format(*atom_info_tuple))
self.pdbqt_atom_idx += 1

self.pdbqt_atom_line_list.append('ENDROOT\n')

else:
atom_info_list = self.torsion_tree.nodes[node_idx]['atom_info_list']
for atom_info_dict in atom_info_list:
atom_name = atom_info_dict['atom_name']
if atom_name not in self.atom_idx_info_mapping_dict:
self.atom_idx_info_mapping_dict[atom_name] = self.pdbqt_atom_idx

atom_info_tuple = ('ATOM',
self.pdbqt_atom_idx,
atom_info_dict['atom_name'],
atom_info_dict['residue_name'],
atom_info_dict['chain_idx'],
atom_info_dict['residue_idx'],
atom_info_dict['x'],
atom_info_dict['y'],
atom_info_dict['z'],
1.0,
0.0,
atom_info_dict['charge'],
atom_info_dict['atom_type'])

self.pdbqt_atom_line_list.append(self.pdbqt_atom_line_format.format(*atom_info_tuple))
self.pdbqt_atom_idx += 1

self.visited_node_idx_set.add(node_idx)

neighbor_node_idx_list = list(self.torsion_tree.neighbors(node_idx))
for neighbor_node_idx in neighbor_node_idx_list:
if neighbor_node_idx not in self.visited_node_idx_set:
temp_pdbqt_atom_idx = self.pdbqt_atom_idx
atom_info_list = self.torsion_tree.nodes[neighbor_node_idx]['atom_info_list']
for atom_info_dict in atom_info_list:
atom_name = atom_info_dict['atom_name']
if atom_name not in self.atom_idx_info_mapping_dict:
self.atom_idx_info_mapping_dict[atom_name] = temp_pdbqt_atom_idx
temp_pdbqt_atom_idx += 1

edge_info = self.torsion_tree.edges[(node_idx, neighbor_node_idx)]
begin_node_idx = edge_info['begin_node_idx']
end_node_idx = edge_info['end_node_idx']
begin_atom_name = edge_info['begin_atom_name']
end_atom_name = edge_info['end_atom_name']

if begin_node_idx == node_idx:
parent_atom_name = begin_atom_name
offspring_atom_name = end_atom_name
else:
parent_atom_name = end_atom_name
offspring_atom_name = begin_atom_name

parent_atom_idx = self.atom_idx_info_mapping_dict[parent_atom_name]
offspring_atom_idx = self.atom_idx_info_mapping_dict[offspring_atom_name]

self.branch_info_list.append((parent_atom_name, str(parent_atom_idx), offspring_atom_name, str(offspring_atom_idx)))
self.pdbqt_atom_line_list.append(self.pdbqt_branch_line_format.format('BRANCH', parent_atom_idx, offspring_atom_idx))
self.__deep_first_search__(neighbor_node_idx)
self.pdbqt_atom_line_list.append(self.pdbqt_end_branch_line_format.format('ENDBRANCH', parent_atom_idx, offspring_atom_idx))
代码
文本

其次是PDBQT格式輸出:

代码
文本
[11]
def write_pdbqt_file(self):
self.pdbqt_remark_line_list = []
self.pdbqt_atom_line_list = []

self.pdbqt_remark_torsion_line_format = '{:6s} {:^2d} {:1s} {:7s} {:6s} {:^7s} {:3s} {:^7s}\n'
self.pdbqt_atom_line_format = '{:4s} {:5d} {:^4s} {:3s} {:1s}{:4d} {:8.3f}{:8.3f}{:8.3f}{:6.2f}{:6.2f} {:6.3f} {:<2s}\n'
self.pdbqt_branch_line_format = '{:6s} {:3d} {:3d}\n'
self.pdbqt_end_branch_line_format = '{:9s} {:3d} {:3d}\n'
self.torsion_dof_line_format = '{:7s} {:d}'

## Prepare pdbqt atom lines
####################################################################################################
self.atom_idx_info_mapping_dict = {}
self.branch_info_list = []
self.visited_node_idx_set = set()
self.pdbqt_atom_idx = 1

self.__deep_first_search__(0)
self.num_torsions = len(self.branch_info_list)
self.pdbqt_atom_line_list.append(self.torsion_dof_line_format.format('TORSDOF', self.num_torsions))
####################################################################################################

## Prepare pdbqt remark lines
####################################################################################################
self.pdbqt_remark_line_list.append('REMARK ' + str(self.num_torsions) + ' active torsions:\n')
self.pdbqt_remark_line_list.append("REMARK status: ('A' for Active; 'I' for Inactive)\n")
for torsion_idx in range(self.num_torsions):
branch_info_tuple = self.branch_info_list[torsion_idx]
remark_torsion_info_tuple = ('REMARK',
torsion_idx+1,
'A',
'between',
'atoms:',
branch_info_tuple[0] + '_' + branch_info_tuple[1],
'and',
branch_info_tuple[2] + '_' + branch_info_tuple[3])

self.pdbqt_remark_line_list.append(self.pdbqt_remark_torsion_line_format.format(*remark_torsion_info_tuple))
####################################################################################################

self.pdbqt_line_list = self.pdbqt_remark_line_list + self.pdbqt_atom_line_list

with open(self.ligand_pdbqt_file_name, 'w') as ligand_pdbqt_file:
for pdbqt_line in self.pdbqt_line_list:
ligand_pdbqt_file.write(pdbqt_line)
代码
文本

以及unidock-style的sdf與constriant bpf文件輸出(這裡就只需要依賴torsion tree而不需要進行深度遍歷):

代码
文本
[12]
def write_constraint_bpf_file(self):
self.core_bpf_remark_line_list = []
self.core_bpf_atom_line_list = []
self.core_bpf_atom_line_format = '{:8.3f}\t{:8.3f}\t{:8.3f}\t{:6.2f}\t{:6.2f}\t{:3s}\t{:<2s}\n'

self.core_bpf_remark_line_list.append('x y z Vset r type atom\n')

root_atom_info_list = self.torsion_tree.nodes[0]['atom_info_list']
for atom_info_dict in root_atom_info_list:
atom_info_tuple = (atom_info_dict['x'],
atom_info_dict['y'],
atom_info_dict['z'],
-1.2,
0.6,
'map',
atom_info_dict['atom_type'])

self.core_bpf_atom_line_list.append(self.core_bpf_atom_line_format.format(*atom_info_tuple))

self.core_bpf_line_list = self.core_bpf_remark_line_list + self.core_bpf_atom_line_list

with open(self.ligand_core_bpf_file_name, 'w') as ligand_core_bpf_file:
for core_bpf_line in self.core_bpf_line_list:
ligand_core_bpf_file.write(core_bpf_line)

代码
文本
[13]
def write_torsion_tree_sdf_file(self):
fragment_info_string = ''
torsion_info_string = ''
atom_info_string = ''

num_nodes = self.torsion_tree.number_of_nodes()
num_edges = self.torsion_tree.number_of_edges()

for node_idx in range(num_nodes):
atom_info_list = self.torsion_tree.nodes[node_idx]['atom_info_list']
for atom_info_dict in atom_info_list:
fragment_info_string += str(atom_info_dict['sdf_atom_idx'])
fragment_info_string += ' '

fragment_info_string = fragment_info_string[:-1]
fragment_info_string += '\n'

edge_key_list = list(self.torsion_tree.edges.keys())
for edge_idx in range(num_edges):
edge_key = edge_key_list[edge_idx]
edge_info_dict = self.torsion_tree.edges[edge_key]
begin_sdf_atom_idx = str(edge_info_dict['begin_sdf_atom_idx'])
end_sdf_atom_idx = str(edge_info_dict['end_sdf_atom_idx'])
begin_node_idx = str(edge_info_dict['begin_node_idx'])
end_node_idx = str(edge_info_dict['end_node_idx'])

torsion_info_string += f'{begin_sdf_atom_idx} {end_sdf_atom_idx} {begin_node_idx} {end_node_idx}'
torsion_info_string += '\n'

atom_info_line_format = '{:<3d}{:<10f}{:<2s}\n'
for atom in self.mol.GetAtoms():
sdf_atom_idx = atom.GetIntProp('sdf_atom_idx')
charge = atom.GetDoubleProp('charge')
atom_type = atom.GetProp('atom_type')

atom_info_string += atom_info_line_format.format(sdf_atom_idx, charge, atom_type)

self.mol.SetProp('fragInfo', fragment_info_string)
self.mol.SetProp('torsionInfo', torsion_info_string)
self.mol.SetProp('atomInfo', atom_info_string)

writer = Chem.SDWriter(self.ligand_torsion_tree_sdf_file_name)
writer.write(self.mol)
writer.flush()
writer.close()

代码
文本

以上算法來自XDATools裡的autodock_topology模塊

代码
文本

一個AutoDock Torsion Tree Builder具體的操作方式如下,在此以constraint docking的準備為例:

ligand_sdf_file_name (str): 對接小分子的sdf文件路徑

covalent_ligand (bool): 是否為共價對接模式,進行共價分子的相關處理

template_docking (bool): 是否為限制對接模式,進行限制對接的前處理

reference_sdf_file_name (str): 共價對接場景時,需要指定對應的參考分子(或者core本身)的sdf文件路徑

working_dir_name (str): 工作文件夾路徑,輸出的文件將在此目錄底下

代码
文本
[14]
from xdatools.modules.docking_engines.autodock_topology.autodock_topology_builder import AutoDockTopologyBuilder
代码
文本
[15]
ligand_sdf_file_name = '/data/Projects/Docking_Protocol/Bace_watvina/CAT-13a.sdf'
reference_sdf_file_name = '/data/Projects/Docking_Protocol/Bace_watvina/core_test.sdf'
working_dir_name = '/data/Projects/Docking_Protocol/Bace_watvina/test_bohrium_notebook'
代码
文本
[16]
autodock_topology_builder = AutoDockTopologyBuilder(ligand_sdf_file_name,
covalent_ligand=False,
template_docking=True,
reference_sdf_file_name=reference_sdf_file_name,
working_dir_name=working_dir_name)

autodock_topology_builder.build_molecular_graph()
autodock_topology_builder.write_pdbqt_file()

autodock_topology_builder.write_constraint_bpf_file()
autodock_topology_builder.write_torsion_tree_sdf_file()
代码
文本
[19]
autodock_topology_builder.torsion_tree.nodes[0]
{'atom_info_list': [{'sdf_atom_idx': 1,
   'atom_name': 'C1',
   'residue_name': 'MOL',
   'chain_idx': 'A',
   'residue_idx': 1,
   'x': 11.719913055962122,
   'y': -2.2815467420095246,
   'z': 1.2427213030734408,
   'charge': 0.06388173328332081,
   'atom_type': 'C'},
  {'sdf_atom_idx': 2,
   'atom_name': 'N2',
   'residue_name': 'MOL',
   'chain_idx': 'A',
   'residue_idx': 1,
   'x': 13.039249084430867,
   'y': -2.604754947486575,
   'z': 0.7187088391966407,
   'charge': -0.2005718813405433,
   'atom_type': 'N'},
  {'sdf_atom_idx': 3,
   'atom_name': 'C3',
   'residue_name': 'MOL',
   'chain_idx': 'A',
   'residue_idx': 1,
   'x': 14.20655479727682,
   'y': -2.600233318784138,
   'z': 1.3698299001257184,
   'charge': 0.35138512187908655,
   'atom_type': 'C'},
  {'sdf_atom_idx': 4,
   'atom_name': 'N4',
   'residue_name': 'MOL',
   'chain_idx': 'A',
   'residue_idx': 1,
   'x': 14.387848666811841,
   'y': -2.3242533174310487,
   'z': 2.660938898880836,
   'charge': -0.2903100455230596,
   'atom_type': 'N'},
  {'sdf_atom_idx': 5,
   'atom_name': 'N5',
   'residue_name': 'MOL',
   'chain_idx': 'A',
   'residue_idx': 1,
   'x': 15.252166972622096,
   'y': -2.8978135205767206,
   'z': 0.559732952250487,
   'charge': -0.25765941719993657,
   'atom_type': 'N'},
  {'sdf_atom_idx': 6,
   'atom_name': 'C6',
   'residue_name': 'MOL',
   'chain_idx': 'A',
   'residue_idx': 1,
   'x': 14.789205160191587,
   'y': -3.205779802643188,
   'z': -0.824351179940522,
   'charge': 0.2011784616353723,
   'atom_type': 'C'},
  {'sdf_atom_idx': 21,
   'atom_name': 'C21',
   'residue_name': 'MOL',
   'chain_idx': 'A',
   'residue_idx': 1,
   'x': 13.25398440989743,
   'y': -2.959727738853404,
   'z': -0.6554318586945895,
   'charge': 0.3290140642479543,
   'atom_type': 'C'},
  {'sdf_atom_idx': 22,
   'atom_name': 'O22',
   'residue_name': 'MOL',
   'chain_idx': 'A',
   'residue_idx': 1,
   'x': 12.382982527915257,
   'y': -3.029754709991024,
   'z': -1.5000039153558906,
   'charge': -0.25127301662821194,
   'atom_type': 'OA'},
  {'sdf_atom_idx': 23,
   'atom_name': 'H23',
   'residue_name': 'MOL',
   'chain_idx': 'A',
   'residue_idx': 1,
   'x': 11.528434924137375,
   'y': -1.2156514623642218,
   'z': 1.0949600557942856,
   'charge': 0.06341809804586859,
   'atom_type': 'H'},
  {'sdf_atom_idx': 24,
   'atom_name': 'H24',
   'residue_name': 'MOL',
   'chain_idx': 'A',
   'residue_idx': 1,
   'x': 10.942831600844023,
   'y': -2.858284709474195,
   'z': 0.7334266985609983,
   'charge': 0.06341809804586859,
   'atom_type': 'H'},
  {'sdf_atom_idx': 25,
   'atom_name': 'H25',
   'residue_name': 'MOL',
   'chain_idx': 'A',
   'residue_idx': 1,
   'x': 11.666536485970337,
   'y': -2.511964877099178,
   'z': 2.310545602979936,
   'charge': 0.06341809804586859,
   'atom_type': 'H'},
  {'sdf_atom_idx': 26,
   'atom_name': 'H26',
   'residue_name': 'MOL',
   'chain_idx': 'A',
   'residue_idx': 1,
   'x': 15.31418577577549,
   'y': -2.3496454216387153,
   'z': 3.071004645599471,
   'charge': 0.25629660199432974,
   'atom_type': 'HD'},
  {'sdf_atom_idx': 38,
   'atom_name': 'H38',
   'residue_name': 'MOL',
   'chain_idx': 'A',
   'residue_idx': 1,
   'x': 13.624643136458346,
   'y': -2.064145272316698,
   'z': 3.276411784347618,
   'charge': 0.25629660199432974,
   'atom_type': 'HD'},
  {'sdf_atom_idx': 39,
   'atom_name': 'H39',
   'residue_name': 'MOL',
   'chain_idx': 'A',
   'residue_idx': 1,
   'x': 16.214627641316802,
   'y': -2.882153343686885,
   'z': 0.8585917895196787,
   'charge': 0.2612133963316691,
   'atom_type': 'HD'}]}
代码
文本
[20]
autodock_topology_builder.torsion_tree.edges[(0, 1)]
{'begin_node_idx': 0,
 'end_node_idx': 1,
 'begin_sdf_atom_idx': 6,
 'end_sdf_atom_idx': 7,
 'begin_atom_name': 'C6',
 'end_atom_name': 'C7'}
代码
文本

Docking Protocol

代码
文本

此部分為正式的Demo案例,演示了如何將上述的torsion tree準備融合進docking的工作流程中。 在一個易用的docking protocol中,應該包含至少四個模塊:

  1. 蛋白文件準備: 包含可能的格式標準化(這裡不演示),以及蛋白PDBQT的輸入文件準備
  2. 小分子torsion tree輸入文件準備(PDBQT or Uni-Dock style sdf文件)
  3. Docking Engine: 具體即是調用各種可能的docking軟件進行計算,例如AD-GPU, Vina, Uni-Dock, watvina等等
  4. Docking pose parsing: 不同軟件的輸出格式不同,需要針對個別軟件去實現對應的parser,並整理成統一的固定的輸出格式,並產生彙總的表格。

將以下各功能實現完畢,並整合進一個統一模塊中,我們這裡以XDATools的實現為範例(後續這套算法將取代uni-ligprep + unidock的流程,成為hermite或launching上新的統一流程)

AutoDockRunner的使用方式如下:

代码
文本

本範例使用一個FEP標準測試集中的Bace體系,共36個分子,進行constraint docking作為範例,docking engine以watvina為例:

ps. Uni-Dock目前在constraint docking上的表現不佳,由於算法原因,core容易固定不住,需要持續優化,因此這裡使用watvina為舉例。

代码
文本

Imports

代码
文本

引入依賴以及添加各種牛鬼蛇神的環境變量:

docking的前後處理以及運行流程都透過引入一個 AutoDockRunner 解決。

代码
文本
[1]
import os
os.chdir('/data/Projects/Docking_Protocol/Bace_watvina/test_bohrium_notebook')

import shutil
from time import time

import numpy as np
import pandas as pd

from rdkit import Chem

os.environ['PATH'] = '/data/Modules/Uni-Dock:/data/Modules/watvina:' + os.environ['PATH']
os.environ['LD_LIBRARY_PATH'] = '/data/Modules/ADFR/lib'
os.environ['MGLPY'] = '/data/Modules/ADFR/bin/python'
os.environ['MGLUTIL'] = '/data/Modules/ADFR/CCSBpckgs/AutoDockTools/Utilities24'
os.environ['WATVINA'] = '/data/Modules/watvina/pywatvina/watvina'

from xdatools.modules.docking_engines.autodock_runner import AutoDockRunner

import matplotlib
import matplotlib.pyplot as plt
%matplotlib inline
matplotlib.rcParams['figure.dpi'] = 150
代码
文本

準備輸入文件,包含標準化過後的蛋白pdb文件、口袋中心座標與需要對接的小分子sdf文件,以及tmeplate docking對應的參考分子:

代码
文本
[2]
ligand_sdf_file_name_list = []

for file_name in os.listdir('/data/Projects/Docking_Protocol/Bace_watvina'):
if file_name.startswith('CAT') and file_name.endswith('sdf'):
ligand_sdf_file_name_list.append(os.path.join('/data/Projects/Docking_Protocol/Bace_watvina', file_name))

protein_pdb_file_name_list = ['/data/Projects/Docking_Protocol/Bace_watvina/receptor.pdb']
reference_sdf_file_name = '/data/Projects/Docking_Protocol/Bace_watvina/core.sdf'
target_center_list = [(14.786, -0.626, -1.088)]
代码
文本

目前AutoDockRunner的主要輸入如下:

ligand_sdf_file_name_list: 小分子的sdf路徑列表,支持一個sdf包含多個小分子構像,同時支持多個sdf文件並行處理

protein_pdb_file_name_list: 蛋白pdb文件列表,默認支持ensemble docking,可以一次對多個receptor構像進行探索

target_center_list: 蛋白口袋中心座標列表,與protein_pdb_file_name_list一一對應

box_size: docking使用的盒子大小

docking_engine: 使用的對接軟件,支持['ad-gpu', 'adfr', 'unidock', 'watvina']

covalent_ligand: 若對接模式為共價對接,則指定為True

template_docking: 若對接模式為core constraint docking,則指定為True

reference_sdf_file_name: 對接模式為core constriant docking時,需要指定參考分子的sdf文件路徑,此參考分子可以是對接分子的core,也可以是某一個同系列陽參,程序會自動尋找MCS匹配。暫不支持core-hopping場景。

generate_torsion_tree_sdf: 若為True,則torsion tree程序會產生Uni-Dock所定義的帶有torsion tree信息的sdf文件格式,用於Uni-Dock中sdf I/O的場景。

n_cpu: 使用的cpu並行核數,建議根據配置機器的cpu核數量指定

num_docking_runs: 指定engine產生的docking pose數量上限

working_dir_name: 工作文件路徑,所有AutoDockRunner的輸出都會在此路徑,根據流程產生四個子文件夾

代码
文本
[3]
start = time()

os.mkdir('pywatvina')

autodock_runner = AutoDockRunner(ligand_sdf_file_name_list,
protein_pdb_file_name_list,
protein_conf_name_list=None,
target_center_list=target_center_list,
box_size=(22.5, 22.5, 22.5),
docking_engine='watvina',
covalent_ligand=False,
template_docking=True,
reference_sdf_file_name=reference_sdf_file_name,
generate_torsion_tree_sdf=True,
n_cpu=12,
num_docking_runs=10,
working_dir_name='/data/Projects/Docking_Protocol/Bace_watvina/test_bohrium_notebook/pywatvina')

docking_pose_summary_info_list = autodock_runner.run()

end = time()
print(end - start)
/opt/mamba/lib/python3.9/site-packages/MDAnalysis/topology/PDBParser.py:331: UserWarning: Element information is missing, elements attribute will not be populated. If needed these can be guessed using MDAnalysis.topology.guessers.
  warnings.warn("Element information is missing, elements attribute "
/opt/mamba/lib/python3.9/site-packages/MDAnalysis/coordinates/PDB.py:451: UserWarning: 1 A^3 CRYST1 record, this is usually a placeholder. Unit cell dimensions will be set to None.
  warnings.warn("1 A^3 CRYST1 record,"
/opt/mamba/lib/python3.9/site-packages/MDAnalysis/coordinates/PDB.py:775: UserWarning: Unit cell dimensions not found. CRYST1 record set to unitary values.
  warnings.warn("Unit cell dimensions not found. "
/opt/mamba/lib/python3.9/site-packages/MDAnalysis/coordinates/PDB.py:1151: UserWarning: Found no information for attr: 'elements' Using default value of ' '
  warnings.warn("Found no information for attr: '{}'"
/opt/mamba/lib/python3.9/site-packages/MDAnalysis/coordinates/PDB.py:1151: UserWarning: Found no information for attr: 'formalcharges' Using default value of '0'
  warnings.warn("Found no information for attr: '{}'"
Computing WATVina grid ... done.
Using random seed: -130011850
Performing search ... 
0%   10   20   30   40   50   60   70   80   90   100%
|----|----|----|----|----|----|----|----|----|----|
***************************************************
done.
Refining results ... done.
+-----------+-------------+------------------------------------------------+
| summary   |   RMSD_TO   |               Score contribution               |
+---+-------+------+------+-------+-------+-------+-------+-------+--------+
|No.| score | best | init |  VDW  | HBond | Elect | Desol | Intra | Torsion|
+---+-------+------+------+-------+-------+-------+-------+-------+--------+
!  1  -4.59   0.00   2.41  -17.36   -1.48   -3.92    0.00   -0.07    0.00
!  2  -4.52   2.59   0.32  -13.80   -1.48   -4.54    0.00   -0.41    0.00
Using random seed: -130011850
Performing search ... 
0%   10   20   30   40   50   60   70   80   90   100%
|----|----|----|----|----|----|----|----|----|----|
***************************************************
done.
Refining results ... done.
+-----------+-------------+------------------------------------------------+
| summary   |   RMSD_TO   |               Score contribution               |
+---+-------+------+------+-------+-------+-------+-------+-------+--------+
|No.| score | best | init |  VDW  | HBond | Elect | Desol | Intra | Torsion|
+---+-------+------+------+-------+-------+-------+-------+-------+--------+
!  1  -4.77   0.00   2.31  -19.07   -1.48   -4.29    0.00   -0.26    0.09
!  2  -4.69   2.50   0.44  -15.66   -1.48   -4.71    0.00   -0.50    0.01
Using random seed: -130011850
Performing search ... 
0%   10   20   30   40   50   60   70   80   90   100%
|----|----|----|----|----|----|----|----|----|----|
***************************************************
done.
Refining results ... done.
+-----------+-------------+------------------------------------------------+
| summary   |   RMSD_TO   |               Score contribution               |
+---+-------+------+------+-------+-------+-------+-------+-------+--------+
|No.| score | best | init |  VDW  | HBond | Elect | Desol | Intra | Torsion|
+---+-------+------+------+-------+-------+-------+-------+-------+--------+
!  1  -4.89   0.00   2.33  -20.25   -1.48   -4.35    0.00   -0.42    0.09
!  2  -4.71   2.48   0.43  -16.36   -1.48   -4.71    0.00   -0.66    0.02
Using random seed: -130011850
Performing search ... 
0%   10   20   30   40   50   60   70   80   90   100%
|----|----|----|----|----|----|----|----|----|----|
***************************************************
done.
Refining results ... done.
+-----------+-------------+------------------------------------------------+
| summary   |   RMSD_TO   |               Score contribution               |
+---+-------+------+------+-------+-------+-------+-------+-------+--------+
|No.| score | best | init |  VDW  | HBond | Elect | Desol | Intra | Torsion|
+---+-------+------+------+-------+-------+-------+-------+-------+--------+
!  1  -5.17   0.00   2.33  -20.27   -1.48   -4.41    0.00   -0.26    0.01
!  2  -5.01   2.52   0.45  -15.87   -1.48   -4.83    0.00   -0.69    0.00
Using random seed: -130011850
Performing search ... 
0%   10   20   30   40   50   60   70   80   90   100%
|----|----|----|----|----|----|----|----|----|----|
***************************************************
done.
Refining results ... done.
+-----------+-------------+------------------------------------------------+
| summary   |   RMSD_TO   |               Score contribution               |
+---+-------+------+------+-------+-------+-------+-------+-------+--------+
|No.| score | best | init |  VDW  | HBond | Elect | Desol | Intra | Torsion|
+---+-------+------+------+-------+-------+-------+-------+-------+--------+
!  1  -5.36   0.00   2.30  -21.09   -1.48   -4.65    0.00   -0.40    0.00
!  2  -4.81   2.39   0.53  -16.71   -1.48   -5.11    0.00   -0.41    0.00
Using random seed: -130011850
Performing search ... 
0%   10   20   30   40   50   60   70   80   90   100%
|----|----|----|----|----|----|----|----|----|----|
***************************************************
done.
Refining results ... done.
+-----------+-------------+------------------------------------------------+
| summary   |   RMSD_TO   |               Score contribution               |
+---+-------+------+------+-------+-------+-------+-------+-------+--------+
|No.| score | best | init |  VDW  | HBond | Elect | Desol | Intra | Torsion|
+---+-------+------+------+-------+-------+-------+-------+-------+--------+
!  1  -5.70   0.00   2.30  -22.78   -1.48   -5.01    0.00   -0.36    0.01
!  2  -5.50   2.45   0.30  -18.49   -1.48   -4.98    0.00   -0.79    0.00
Using random seed: -130011850
Performing search ... 
0%   10   20   30   40   50   60   70   80   90   100%
|----|----|----|----|----|----|----|----|----|----|
***************************************************
done.
Refining results ... done.
+-----------+-------------+------------------------------------------------+
| summary   |   RMSD_TO   |               Score contribution               |
+---+-------+------+------+-------+-------+-------+-------+-------+--------+
|No.| score | best | init |  VDW  | HBond | Elect | Desol | Intra | Torsion|
+---+-------+------+------+-------+-------+-------+-------+-------+--------+
!  1  -6.02   0.00   2.27  -24.40   -1.48   -5.19    0.00   -0.50    0.00
!  2  -5.07   2.32   0.66  -20.07   -1.48   -5.61    0.00   -0.09    0.00
!  3  -4.81   0.61   2.21  -19.92   -1.48   -4.60    0.00   -0.15    0.00
Using random seed: -130011850
Performing search ... 
0%   10   20   30   40   50   60   70   80   90   100%
|----|----|----|----|----|----|----|----|----|----|
***************************************************
done.
Refining results ... done.
+-----------+-------------+------------------------------------------------+
| summary   |   RMSD_TO   |               Score contribution               |
+---+-------+------+------+-------+-------+-------+-------+-------+--------+
|No.| score | best | init |  VDW  | HBond | Elect | Desol | Intra | Torsion|
+---+-------+------+------+-------+-------+-------+-------+-------+--------+
!  1  -5.25   0.00   2.30  -21.24   -1.48   -4.43    0.00   -0.25    0.00
!  2  -5.18   2.48   0.40  -17.48   -1.48   -4.94    0.00   -0.65    0.01
Using random seed: -130011850
Performing search ... 
0%   10   20   30   40   50   60   70   80   90   100%
|----|----|----|----|----|----|----|----|----|----|
***************************************************
done.
Refining results ... done.
+-----------+-------------+------------------------------------------------+
| summary   |   RMSD_TO   |               Score contribution               |
+---+-------+------+------+-------+-------+-------+-------+-------+--------+
|No.| score | best | init |  VDW  | HBond | Elect | Desol | Intra | Torsion|
+---+-------+------+------+-------+-------+-------+-------+-------+--------+
!  1  -5.60   0.00   2.32  -22.61   -1.48   -5.28    0.00   -0.44    0.01
!  2  -5.00   2.50   0.32  -17.04   -1.48   -5.34    0.00   -0.65    0.00
Using random seed: -130011850
Performing search ... 
0%   10   20   30   40   50   60   70   80   90   100%
|----|----|----|----|----|----|----|----|----|----|
***************************************************
done.
Refining results ... done.
+-----------+-------------+------------------------------------------------+
| summary   |   RMSD_TO   |               Score contribution               |
+---+-------+------+------+-------+-------+-------+-------+-------+--------+
|No.| score | best | init |  VDW  | HBond | Elect | Desol | Intra | Torsion|
+---+-------+------+------+-------+-------+-------+-------+-------+--------+
!  1  -5.39   0.00   2.28  -21.94   -1.48   -4.16    0.00   -0.25    0.00
!  2  -5.36   2.40   0.61  -18.22   -1.62   -4.40    0.00   -0.60    0.00
Using random seed: -130011850
Performing search ... 
0%   10   20   30   40   50   60   70   80   90   100%
|----|----|----|----|----|----|----|----|----|----|
***************************************************
done.
Refining results ... done.
+-----------+-------------+------------------------------------------------+
| summary   |   RMSD_TO   |               Score contribution               |
+---+-------+------+------+-------+-------+-------+-------+-------+--------+
|No.| score | best | init |  VDW  | HBond | Elect | Desol | Intra | Torsion|
+---+-------+------+------+-------+-------+-------+-------+-------+--------+
!  1  -5.16   0.00   0.28  -18.68   -1.62   -4.83    0.00   -0.69    0.00
!  2  -5.13   2.29   2.13  -22.02   -1.66   -4.20    0.00   -0.26    0.00
Using random seed: -130011850
Performing search ... 
0%   10   20   30   40   50   60   70   80   90   100%
|----|----|----|----|----|----|----|----|----|----|
***************************************************
done.
Refining results ... done.
+-----------+-------------+------------------------------------------------+
| summary   |   RMSD_TO   |               Score contribution               |
+---+-------+------+------+-------+-------+-------+-------+-------+--------+
|No.| score | best | init |  VDW  | HBond | Elect | Desol | Intra | Torsion|
+---+-------+------+------+-------+-------+-------+-------+-------+--------+
!  1  -5.00   0.00   0.29  -18.23   -1.58   -4.73    0.00   -0.75    0.00
!  2  -4.98   2.29   2.12  -21.49   -1.61   -4.11    0.00   -0.36    0.00
Using random seed: -130011850
Performing search ... 
0%   10   20   30   40   50   60   70   80   90   100%
|----|----|----|----|----|----|----|----|----|----|
***************************************************
done.
Refining results ... done.
+-----------+-------------+------------------------------------------------+
| summary   |   RMSD_TO   |               Score contribution               |
+---+-------+------+------+-------+-------+-------+-------+-------+--------+
|No.| score | best | init |  VDW  | HBond | Elect | Desol | Intra | Torsion|
+---+-------+------+------+-------+-------+-------+-------+-------+--------+
!  1  -5.46   0.00   2.20  -21.95   -1.60   -3.71    0.00   -0.16    0.00
!  2  -5.44   2.32   0.34  -18.23   -1.59   -4.30    0.00   -0.58    0.00
Using random seed: -130011850
Performing search ... 
0%   10   20   30   40   50   60   70   80   90   100%
|----|----|----|----|----|----|----|----|----|----|
***************************************************
done.
Refining results ... done.
+-----------+-------------+------------------------------------------------+
| summary   |   RMSD_TO   |               Score contribution               |
+---+-------+------+------+-------+-------+-------+-------+-------+--------+
|No.| score | best | init |  VDW  | HBond | Elect | Desol | Intra | Torsion|
+---+-------+------+------+-------+-------+-------+-------+-------+--------+
!  1  -5.41   0.00   2.31  -22.74   -1.52   -3.52    0.00   -0.40    0.00
!  2  -5.32   2.41   0.76  -18.53   -1.59   -4.18    0.00   -0.65    0.00
Using random seed: -130011850
Performing search ... 
0%   10   20   30   40   50   60   70   80   90   100%
|----|----|----|----|----|----|----|----|----|----|
***************************************************
done.
Refining results ... done.
+-----------+-------------+------------------------------------------------+
| summary   |   RMSD_TO   |               Score contribution               |
+---+-------+------+------+-------+-------+-------+-------+-------+--------+
|No.| score | best | init |  VDW  | HBond | Elect | Desol | Intra | Torsion|
+---+-------+------+------+-------+-------+-------+-------+-------+--------+
!  1  -5.01   0.00   0.42  -18.05   -1.48   -4.96    0.00   -0.71    0.00
!  2  -4.86   2.92   2.74  -20.92   -1.48   -4.42    0.00   -0.26    0.00
Using random seed: -130011850
Performing search ... 
0%   10   20   30   40   50   60   70   80   90   100%
|----|----|----|----|----|----|----|----|----|----|
***************************************************
done.
Refining results ... done.
+-----------+-------------+------------------------------------------------+
| summary   |   RMSD_TO   |               Score contribution               |
+---+-------+------+------+-------+-------+-------+-------+-------+--------+
|No.| score | best | init |  VDW  | HBond | Elect | Desol | Intra | Torsion|
+---+-------+------+------+-------+-------+-------+-------+-------+--------+
!  1  -3.60   0.00   0.44  -12.39   -1.52   -3.85    0.00   -0.80    0.17
!  2  -2.73   1.62   1.68  -15.65   -0.24   -0.69    0.00   -0.71    0.19
Using random seed: -130011850
Performing search ... 
0%   10   20   30   40   50   60   70   80   90   100%
|----|----|----|----|----|----|----|----|----|----|
***************************************************
done.
Refining results ... done.
+-----------+-------------+------------------------------------------------+
| summary   |   RMSD_TO   |               Score contribution               |
+---+-------+------+------+-------+-------+-------+-------+-------+--------+
|No.| score | best | init |  VDW  | HBond | Elect | Desol | Intra | Torsion|
+---+-------+------+------+-------+-------+-------+-------+-------+--------+
!  1  -5.03   0.00   2.47  -20.09   -1.48   -3.56    0.00   -0.17    0.00
!  2  -4.56   2.66   0.33  -13.51   -1.48   -3.61    0.00   -0.51    0.00
Using random seed: -130011850
Performing search ... 
0%   10   20   30   40   50   60   70   80   90   100%
|----|----|----|----|----|----|----|----|----|----|
***************************************************
done.
Refining results ... done.
+-----------+-------------+------------------------------------------------+
| summary   |   RMSD_TO   |               Score contribution               |
+---+-------+------+------+-------+-------+-------+-------+-------+--------+
|No.| score | best | init |  VDW  | HBond | Elect | Desol | Intra | Torsion|
+---+-------+------+------+-------+-------+-------+-------+-------+--------+
!  1  -4.92   0.00   0.28  -18.65   -1.48   -3.61    0.00   -0.52    0.00
!  2  -4.62   2.43   2.38  -19.31   -1.51   -3.52    0.00   -0.22    0.00
!  3  -4.20   2.01   1.82  -16.32   -1.48   -4.00    0.00   -0.25    0.00
Using random seed: -130011850
Performing search ... 
0%   10   20   30   40   50   60   70   80   90   100%
|----|----|----|----|----|----|----|----|----|----|
***************************************************
done.
Refining results ... done.
+-----------+-------------+------------------------------------------------+
| summary   |   RMSD_TO   |               Score contribution               |
+---+-------+------+------+-------+-------+-------+-------+-------+--------+
|No.| score | best | init |  VDW  | HBond | Elect | Desol | Intra | Torsion|
+---+-------+------+------+-------+-------+-------+-------+-------+--------+
!  1  -5.09   0.00   0.35  -21.20   -1.55   -3.11    0.00   -0.78    0.04
!  2  -3.02   1.69   1.71  -17.93   -0.33   -0.04    0.00   -0.62    0.18
Using random seed: -130011850
Performing search ... 
0%   10   20   30   40   50   60   70   80   90   100%
|----|----|----|----|----|----|----|----|----|----|
***************************************************
done.
Refining results ... done.
+-----------+-------------+------------------------------------------------+
| summary   |   RMSD_TO   |               Score contribution               |
+---+-------+------+------+-------+-------+-------+-------+-------+--------+
|No.| score | best | init |  VDW  | HBond | Elect | Desol | Intra | Torsion|
+---+-------+------+------+-------+-------+-------+-------+-------+--------+
!  1  -4.76   0.00   0.12  -18.92   -1.49   -2.10    0.00   -0.53    0.00
!  2  -4.71   2.93   2.91  -20.01   -1.51   -3.36    0.00   -0.22    0.00
!  3  -4.54   1.96   1.86  -15.66   -1.88   -3.92    0.00   -0.35    0.00
Using random seed: -130011850
Performing search ... 
0%   10   20   30   40   50   60   70   80   90   100%
|----|----|----|----|----|----|----|----|----|----|
***************************************************
done.
Refining results ... done.
+-----------+-------------+------------------------------------------------+
| summary   |   RMSD_TO   |               Score contribution               |
+---+-------+------+------+-------+-------+-------+-------+-------+--------+
|No.| score | best | init |  VDW  | HBond | Elect | Desol | Intra | Torsion|
+---+-------+------+------+-------+-------+-------+-------+-------+--------+
!  1  -5.05   0.00   0.33  -19.48   -1.51   -3.37    0.00   -0.52    0.00
!  2  -4.65   2.93   2.82  -19.55   -1.51   -3.52    0.00   -0.22    0.00
Using random seed: -130011850
Performing search ... 
0%   10   20   30   40   50   60   70   80   90   100%
|----|----|----|----|----|----|----|----|----|----|
***************************************************
done.
Refining results ... done.
+-----------+-------------+------------------------------------------------+
| summary   |   RMSD_TO   |               Score contribution               |
+---+-------+------+------+-------+-------+-------+-------+-------+--------+
|No.| score | best | init |  VDW  | HBond | Elect | Desol | Intra | Torsion|
+---+-------+------+------+-------+-------+-------+-------+-------+--------+
!  1  -5.35   0.00   0.26  -21.06   -1.48   -3.75    0.00   -0.53    0.00
!  2  -4.81   2.98   2.89  -20.33   -1.51   -3.68    0.00   -0.23    0.00
Using random seed: -130011850
Performing search ... 
0%   10   20   30   40   50   60   70   80   90   100%
|----|----|----|----|----|----|----|----|----|----|
***************************************************
done.
Refining results ... done.
+-----------+-------------+------------------------------------------------+
| summary   |   RMSD_TO   |               Score contribution               |
+---+-------+------+------+-------+-------+-------+-------+-------+--------+
|No.| score | best | init |  VDW  | HBond | Elect | Desol | Intra | Torsion|
+---+-------+------+------+-------+-------+-------+-------+-------+--------+
!  1  -4.89   0.00   0.33  -19.01   -1.47   -3.67    0.00   -0.74    0.00
!  2  -4.00   1.67   1.69  -21.86   -0.27   -0.51    0.00   -0.65    0.01
Using random seed: -130011850
Performing search ... 
0%   10   20   30   40   50   60   70   80   90   100%
|----|----|----|----|----|----|----|----|----|----|
***************************************************
done.
Refining results ... done.
+-----------+-------------+------------------------------------------------+
| summary   |   RMSD_TO   |               Score contribution               |
+---+-------+------+------+-------+-------+-------+-------+-------+--------+
|No.| score | best | init |  VDW  | HBond | Elect | Desol | Intra | Torsion|
+---+-------+------+------+-------+-------+-------+-------+-------+--------+
!  1  -4.15   0.00   0.32  -14.14   -1.47   -3.92    0.00   -0.71    0.01
!  2  -3.32   1.59   1.62  -17.44   -0.25   -0.76    0.00   -0.63    0.01
Using random seed: -130011850
Performing search ... 
0%   10   20   30   40   50   60   70   80   90   100%
|----|----|----|----|----|----|----|----|----|----|
***************************************************
done.
Refining results ... done.
+-----------+-------------+------------------------------------------------+
| summary   |   RMSD_TO   |               Score contribution               |
+---+-------+------+------+-------+-------+-------+-------+-------+--------+
|No.| score | best | init |  VDW  | HBond | Elect | Desol | Intra | Torsion|
+---+-------+------+------+-------+-------+-------+-------+-------+--------+
!  1  -5.22   0.00   0.29  -19.46   -1.48   -4.55    0.00   -0.66    0.00
!  2  -4.72   2.22   2.15  -20.14   -1.48   -4.21    0.00   -0.30    0.00
Using random seed: -130011850
Performing search ... 
0%   10   20   30   40   50   60   70   80   90   100%
|----|----|----|----|----|----|----|----|----|----|
***************************************************
done.
Refining results ... done.
+-----------+-------------+------------------------------------------------+
| summary   |   RMSD_TO   |               Score contribution               |
+---+-------+------+------+-------+-------+-------+-------+-------+--------+
|No.| score | best | init |  VDW  | HBond | Elect | Desol | Intra | Torsion|
+---+-------+------+------+-------+-------+-------+-------+-------+--------+
!  1  -4.03   0.00   0.30  -14.91   -1.53   -4.05    0.00   -0.75    0.16
Using random seed: -130011850
Performing search ... 
0%   10   20   30   40   50   60   70   80   90   100%
|----|----|----|----|----|----|----|----|----|----|
***************************************************
done.
Refining results ... done.
+-----------+-------------+------------------------------------------------+
| summary   |   RMSD_TO   |               Score contribution               |
+---+-------+------+------+-------+-------+-------+-------+-------+--------+
|No.| score | best | init |  VDW  | HBond | Elect | Desol | Intra | Torsion|
+---+-------+------+------+-------+-------+-------+-------+-------+--------+
!  1  -4.09   0.00   0.41  -14.93   -1.53   -3.81    0.00   -0.70    0.15
Using random seed: -130011850
Performing search ... 
0%   10   20   30   40   50   60   70   80   90   100%
|----|----|----|----|----|----|----|----|----|----|
***************************************************
done.
Refining results ... done.
+-----------+-------------+------------------------------------------------+
| summary   |   RMSD_TO   |               Score contribution               |
+---+-------+------+------+-------+-------+-------+-------+-------+--------+
|No.| score | best | init |  VDW  | HBond | Elect | Desol | Intra | Torsion|
+---+-------+------+------+-------+-------+-------+-------+-------+--------+
!  1  -3.94   0.00   0.59  -15.00   -1.48   -4.20    0.00   -0.83    0.17
Using random seed: -130011850
Performing search ... 
0%   10   20   30   40   50   60   70   80   90   100%
|----|----|----|----|----|----|----|----|----|----|
***************************************************
done.
Refining results ... done.
+-----------+-------------+------------------------------------------------+
| summary   |   RMSD_TO   |               Score contribution               |
+---+-------+------+------+-------+-------+-------+-------+-------+--------+
|No.| score | best | init |  VDW  | HBond | Elect | Desol | Intra | Torsion|
+---+-------+------+------+-------+-------+-------+-------+-------+--------+
!  1  -5.40   0.00   0.28  -21.17   -1.48   -4.35    0.00   -0.66    0.00
!  2  -4.63   2.26   2.22  -20.25   -1.48   -3.29    0.00   -0.31    0.00
Using random seed: -130011850
Performing search ... 
0%   10   20   30   40   50   60   70   80   90   100%
|----|----|----|----|----|----|----|----|----|----|
***************************************************
done.
Refining results ... done.
+-----------+-------------+------------------------------------------------+
| summary   |   RMSD_TO   |               Score contribution               |
+---+-------+------+------+-------+-------+-------+-------+-------+--------+
|No.| score | best | init |  VDW  | HBond | Elect | Desol | Intra | Torsion|
+---+-------+------+------+-------+-------+-------+-------+-------+--------+
!  1  -5.36   0.00   0.27  -21.33   -1.48   -3.78    0.00   -0.66    0.00
!  2  -4.74   2.32   2.28  -20.68   -1.51   -3.37    0.00   -0.31    0.00
Using random seed: -130011850
Performing search ... 
0%   10   20   30   40   50   60   70   80   90   100%
|----|----|----|----|----|----|----|----|----|----|
***************************************************
done.
Refining results ... done.
+-----------+-------------+------------------------------------------------+
| summary   |   RMSD_TO   |               Score contribution               |
+---+-------+------+------+-------+-------+-------+-------+-------+--------+
|No.| score | best | init |  VDW  | HBond | Elect | Desol | Intra | Torsion|
+---+-------+------+------+-------+-------+-------+-------+-------+--------+
!  1  -5.21   0.00   0.39  -20.00   -1.48   -3.98    0.00   -0.65    0.00
!  2  -4.66   2.38   2.24  -19.90   -1.49   -3.94    0.00   -0.30    0.00
Using random seed: -130011850
Performing search ... 
0%   10   20   30   40   50   60   70   80   90   100%
|----|----|----|----|----|----|----|----|----|----|
***************************************************
done.
Refining results ... done.
+-----------+-------------+------------------------------------------------+
| summary   |   RMSD_TO   |               Score contribution               |
+---+-------+------+------+-------+-------+-------+-------+-------+--------+
|No.| score | best | init |  VDW  | HBond | Elect | Desol | Intra | Torsion|
+---+-------+------+------+-------+-------+-------+-------+-------+--------+
!  1  -5.25   0.00   0.38  -20.62   -1.49   -3.79    0.00   -0.65    0.00
!  2  -4.59   2.36   2.32  -20.04   -1.48   -3.23    0.00   -0.31    0.00
!  3  -4.31   1.81   1.54  -17.73   -1.48   -3.80    0.00   -0.27    0.00
Using random seed: -130011850
Performing search ... 
0%   10   20   30   40   50   60   70   80   90   100%
|----|----|----|----|----|----|----|----|----|----|
***************************************************
done.
Refining results ... done.
+-----------+-------------+------------------------------------------------+
| summary   |   RMSD_TO   |               Score contribution               |
+---+-------+------+------+-------+-------+-------+-------+-------+--------+
|No.| score | best | init |  VDW  | HBond | Elect | Desol | Intra | Torsion|
+---+-------+------+------+-------+-------+-------+-------+-------+--------+
!  1  -5.30   0.00   0.11  -21.47   -1.48   -3.86    0.00   -0.70    0.00
Using random seed: -130011850
Performing search ... 
0%   10   20   30   40   50   60   70   80   90   100%
|----|----|----|----|----|----|----|----|----|----|
***************************************************
done.
Refining results ... done.
+-----------+-------------+------------------------------------------------+
| summary   |   RMSD_TO   |               Score contribution               |
+---+-------+------+------+-------+-------+-------+-------+-------+--------+
|No.| score | best | init |  VDW  | HBond | Elect | Desol | Intra | Torsion|
+---+-------+------+------+-------+-------+-------+-------+-------+--------+
!  1  -5.20   0.00   0.16  -21.58   -1.49   -2.33    0.00   -0.67    0.00
!  2  -4.83   1.87   1.75  -18.08   -1.89   -3.99    0.00   -0.38    0.00
!  3  -4.82   2.80   2.78  -21.34   -1.51   -3.20    0.00   -0.31    0.00
!  4  -3.21   0.69   0.83   -9.45   -1.48   -3.31    0.00   -0.56    0.00
Using random seed: -130011850
Performing search ... 
0%   10   20   30   40   50   60   70   80   90   100%
|----|----|----|----|----|----|----|----|----|----|
***************************************************
done.
Refining results ... done.
+-----------+-------------+------------------------------------------------+
| summary   |   RMSD_TO   |               Score contribution               |
+---+-------+------+------+-------+-------+-------+-------+-------+--------+
|No.| score | best | init |  VDW  | HBond | Elect | Desol | Intra | Torsion|
+---+-------+------+------+-------+-------+-------+-------+-------+--------+
!  1  -5.50   0.00   0.31  -22.17   -1.51   -3.54    0.00   -0.67    0.00
!  2  -4.79   2.81   2.71  -21.00   -1.52   -3.36    0.00   -0.31    0.00
Using random seed: -130011850
Performing search ... 
0%   10   20   30   40   50   60   70   80   90   100%
|----|----|----|----|----|----|----|----|----|----|
***************************************************
done.
Refining results ... done.
46.6844437122345
代码
文本

在此範例中只列出了AutoDockRunner中適用於template docking的選項,對於free docking或covalent docking,有各自不同的選項需要留意,詳情可見docstring

代码
文本

AutoDockRunner的輸出匯總為一個dataframe,這裡簡單查看對接結果與相關分析作圖:

代码
文本
[4]
docking_pose_summary_info_df = docking_pose_summary_info_list[0]
docking_pose_summary_info_df
ligand_docked_sdf_file_name conformer_energy binding_free_energy ligand_efficiency ligand_smiles_string ligand_original_sdf_file_name
0 /data/Projects/Docking_Protocol/Bace_watvina/t... -67.901642 -4.6 -0.209091 CN1C(=O)[C@@](C)(c2cccc(-c3cccc(Cl)c3)c2)[NH+]... /data/Projects/Docking_Protocol/Bace_watvina/C...
1 /data/Projects/Docking_Protocol/Bace_watvina/t... -71.584129 -4.5 -0.204545 CN1C(=O)[C@@](C)(c2cccc(-c3cccc(Cl)c3)c2)[NH+]... /data/Projects/Docking_Protocol/Bace_watvina/C...
2 /data/Projects/Docking_Protocol/Bace_watvina/t... -66.151817 -4.8 -0.208696 CC[C@]1(c2cccc(-c3cccc(Cl)c3)c2)[NH+]=C(N)N(C)... /data/Projects/Docking_Protocol/Bace_watvina/C...
3 /data/Projects/Docking_Protocol/Bace_watvina/t... -68.893745 -4.7 -0.204348 CC[C@]1(c2cccc(-c3cccc(Cl)c3)c2)[NH+]=C(N)N(C)... /data/Projects/Docking_Protocol/Bace_watvina/C...
4 /data/Projects/Docking_Protocol/Bace_watvina/t... -62.776211 -4.9 -0.204167 CC(C)[C@]1(c2cccc(-c3cccc(Cl)c3)c2)[NH+]=C(N)N... /data/Projects/Docking_Protocol/Bace_watvina/C...
... ... ... ... ... ... ...
73 /data/Projects/Docking_Protocol/Bace_watvina/t... -21.592833 -5.5 -0.203704 CN1C(=O)[C@@](c2ccccc2)(c2cccc(-c3cncc(F)c3)c2... /data/Projects/Docking_Protocol/Bace_watvina/C...
74 /data/Projects/Docking_Protocol/Bace_watvina/t... -16.771217 -4.8 -0.177778 CN1C(=O)[C@@](c2ccccc2)(c2cccc(-c3cncc(F)c3)c2... /data/Projects/Docking_Protocol/Bace_watvina/C...
75 /data/Projects/Docking_Protocol/Bace_watvina/t... -20.970114 -5.8 -0.214815 CN1C(=O)[C@@](c2ccccc2)(c2cccc(-c3cncc(Cl)c3)c... /data/Projects/Docking_Protocol/Bace_watvina/C...
76 /data/Projects/Docking_Protocol/Bace_watvina/t... -16.169964 -4.9 -0.181481 CN1C(=O)[C@@](c2ccccc2)(c2cccc(-c3cncc(Cl)c3)c... /data/Projects/Docking_Protocol/Bace_watvina/C...
77 /data/Projects/Docking_Protocol/Bace_watvina/t... -14.723411 -4.6 -0.170370 CN1C(=O)[C@@](c2ccccc2)(c2cccc(-c3cncc(Cl)c3)c... /data/Projects/Docking_Protocol/Bace_watvina/C...

78 rows × 6 columns

代码
文本
[5]
ligand_efficiency_array = docking_pose_summary_info_df.loc[:, 'ligand_efficiency'].values

fig, ax = plt.subplots()
ax.hist(ligand_efficiency_array, bins=100, color='salmon', label='ligand_efficiency')
ax.legend()
plt.xlabel(u'ligand_efficiency (kcal/mol)')
plt.ylabel(u'Number of Conformations')
plt.xlim(-1.0, 0.0)
plt.show()
代码
文本

未完待續.....後續一般包含聚類與相互作用分析(IFP)等功能,完善虛篩流程。

代码
文本
化学信息学
RDKit
Uni-Dock
watvina
Virtual Screening
化学信息学RDKitUni-DockwatvinaVirtual Screening
已赞2
本文被以下合集收录
分子对接
小炒砂糖桔
更新于 2023-12-18
6 篇0 人关注
推荐阅读
公开
TeachOpenCADD | 012 从PubChem获取数据
化学信息学TeachOpenCADD
化学信息学TeachOpenCADD
YangHe
发布于 2023-06-26
1 转存文件
公开
Re: 从零开始的电子结构生活(二)RHF 和 UHF 框架下氢分子的解离势能面
中文量子化学电子结构
中文量子化学电子结构
Weiliang Luo
发布于 2024-01-08
6 赞2 转存文件