Jupyter_Dock | Molecular Docking
Cheminformatics
Jupyter_Dock
Molecular docking
liyq@dp.tech
Published 2023-06-15

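🏃🏻 Quick start
You can run this document directly on Bohrium Notebook. Click the Start Connection (开始连接) button at the top of the page, select the ubuntu:22.04-py3.10 image and a suitable machine configuration, and wait a moment for the session to start.

📖 Source
This notebook is adapted from https://github.com/AngelRuizMoreno/Jupyter_Dock; it was modified and ported to Bohrium Notebook by Li Yaqi (李亚奇) 📨.

🎯 This tutorial is a quick, hands-on introduction to molecular docking with Jupyter Dock.

  • One-click execution lets you test your ideas in practice right away.

  • Thorough comments make it friendly to newcomers.

Let's get started.

First, we set up the runtime environment, download the required files, and make the relevant binaries executable.
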
[1]
!mamba install -c conda-forge pymol-open-source py3dmol openbabel pdbfixer vina rdkit cython -y --quiet
[2]
!apt-get update && apt-get upgrade -y && apt-get install libgl1-mesa-glx -y
Hit:1 http://archive.ubuntu.com/ubuntu jammy InRelease
Get:2 http://security.ubuntu.com/ubuntu jammy-security InRelease [110 kB]
Get:3 http://archive.ubuntu.com/ubuntu jammy-updates InRelease [119 kB]
Get:4 http://archive.ubuntu.com/ubuntu jammy-backports InRelease [108 kB]
Fetched 337 kB in 2s (152 kB/s)                                  
Reading package lists... Done
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
Calculating upgrade... Done
0 upgraded, 0 newly installed, 0 to remove and 0 not upgraded.
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
libgl1-mesa-glx is already the newest version (22.2.5-0ubuntu0.1~22.04.1).
0 upgraded, 0 newly installed, 0 to remove and 0 not upgraded.
[3]
!pip install git+https://github.com/chemosim-lab/ProLIF.git && pip install meeko
Looking in indexes: https://pypi.tuna.tsinghua.edu.cn/simple
Collecting git+https://github.com/chemosim-lab/ProLIF.git
  Cloning https://github.com/chemosim-lab/ProLIF.git to /tmp/pip-req-build-i7onjrl1
  Running command git clone --filter=blob:none --quiet https://github.com/chemosim-lab/ProLIF.git /tmp/pip-req-build-i7onjrl1
  Resolved https://github.com/chemosim-lab/ProLIF.git to commit 3b261bde1c1c6a16771ee49b7563e82a47de35be
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Installing backend dependencies ... done
  Preparing metadata (pyproject.toml) ... done
Requirement already satisfied: dill in /opt/mamba/lib/python3.10/site-packages (from prolif==2.0.0.dev0) (0.3.6)
Requirement already satisfied: scipy>=1.3.0 in /opt/mamba/lib/python3.10/site-packages (from prolif==2.0.0.dev0) (1.10.1)
Requirement already satisfied: numpy>=1.13.3 in /opt/mamba/lib/python3.10/site-packages (from prolif==2.0.0.dev0) (1.24.3)
Requirement already satisfied: multiprocess in /opt/mamba/lib/python3.10/site-packages (from prolif==2.0.0.dev0) (0.70.14)
Requirement already satisfied: tqdm in /opt/mamba/lib/python3.10/site-packages (from prolif==2.0.0.dev0) (4.64.1)
Requirement already satisfied: pandas>=1.0.0 in /opt/mamba/lib/python3.10/site-packages (from prolif==2.0.0.dev0) (2.0.2)
Requirement already satisfied: mdanalysis>=2.2.0 in /opt/mamba/lib/python3.10/site-packages (from prolif==2.0.0.dev0) (2.5.0)
Requirement already satisfied: matplotlib>=1.5.1 in /opt/mamba/lib/python3.10/site-packages (from mdanalysis>=2.2.0->prolif==2.0.0.dev0) (3.7.1)
Requirement already satisfied: mmtf-python>=1.0.0 in /opt/mamba/lib/python3.10/site-packages (from mdanalysis>=2.2.0->prolif==2.0.0.dev0) (1.1.3)
Requirement already satisfied: fasteners in /opt/mamba/lib/python3.10/site-packages (from mdanalysis>=2.2.0->prolif==2.0.0.dev0) (0.18)
Requirement already satisfied: gsd>=1.9.3 in /opt/mamba/lib/python3.10/site-packages (from mdanalysis>=2.2.0->prolif==2.0.0.dev0) (2.9.0)
Requirement already satisfied: joblib>=0.12 in /opt/mamba/lib/python3.10/site-packages (from mdanalysis>=2.2.0->prolif==2.0.0.dev0) (1.2.0)
Requirement already satisfied: threadpoolctl in /opt/mamba/lib/python3.10/site-packages (from mdanalysis>=2.2.0->prolif==2.0.0.dev0) (3.1.0)
Requirement already satisfied: GridDataFormats>=0.4.0 in /opt/mamba/lib/python3.10/site-packages (from mdanalysis>=2.2.0->prolif==2.0.0.dev0) (1.0.1)
Requirement already satisfied: networkx>=2.0 in /opt/mamba/lib/python3.10/site-packages (from mdanalysis>=2.2.0->prolif==2.0.0.dev0) (3.1)
Requirement already satisfied: packaging in /opt/mamba/lib/python3.10/site-packages (from mdanalysis>=2.2.0->prolif==2.0.0.dev0) (23.1)
Requirement already satisfied: biopython>=1.80 in /opt/mamba/lib/python3.10/site-packages (from mdanalysis>=2.2.0->prolif==2.0.0.dev0) (1.81)
Requirement already satisfied: pytz>=2020.1 in /opt/mamba/lib/python3.10/site-packages (from pandas>=1.0.0->prolif==2.0.0.dev0) (2023.3)
Requirement already satisfied: python-dateutil>=2.8.2 in /opt/mamba/lib/python3.10/site-packages (from pandas>=1.0.0->prolif==2.0.0.dev0) (2.8.2)
Requirement already satisfied: tzdata>=2022.1 in /opt/mamba/lib/python3.10/site-packages (from pandas>=1.0.0->prolif==2.0.0.dev0) (2023.3)
Requirement already satisfied: mrcfile in /opt/mamba/lib/python3.10/site-packages (from GridDataFormats>=0.4.0->mdanalysis>=2.2.0->prolif==2.0.0.dev0) (1.4.3)
Requirement already satisfied: contourpy>=1.0.1 in /opt/mamba/lib/python3.10/site-packages (from matplotlib>=1.5.1->mdanalysis>=2.2.0->prolif==2.0.0.dev0) (1.1.0)
Requirement already satisfied: pillow>=6.2.0 in /opt/mamba/lib/python3.10/site-packages (from matplotlib>=1.5.1->mdanalysis>=2.2.0->prolif==2.0.0.dev0) (9.5.0)
Requirement already satisfied: fonttools>=4.22.0 in /opt/mamba/lib/python3.10/site-packages (from matplotlib>=1.5.1->mdanalysis>=2.2.0->prolif==2.0.0.dev0) (4.40.0)
Requirement already satisfied: kiwisolver>=1.0.1 in /opt/mamba/lib/python3.10/site-packages (from matplotlib>=1.5.1->mdanalysis>=2.2.0->prolif==2.0.0.dev0) (1.4.4)
Requirement already satisfied: pyparsing>=2.3.1 in /opt/mamba/lib/python3.10/site-packages (from matplotlib>=1.5.1->mdanalysis>=2.2.0->prolif==2.0.0.dev0) (3.0.9)
Requirement already satisfied: cycler>=0.10 in /opt/mamba/lib/python3.10/site-packages (from matplotlib>=1.5.1->mdanalysis>=2.2.0->prolif==2.0.0.dev0) (0.11.0)
Requirement already satisfied: msgpack>=1.0.0 in /opt/mamba/lib/python3.10/site-packages (from mmtf-python>=1.0.0->mdanalysis>=2.2.0->prolif==2.0.0.dev0) (1.0.5)
Requirement already satisfied: six>=1.5 in /opt/mamba/lib/python3.10/site-packages (from python-dateutil>=2.8.2->pandas>=1.0.0->prolif==2.0.0.dev0) (1.16.0)
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
Looking in indexes: https://pypi.tuna.tsinghua.edu.cn/simple
Requirement already satisfied: meeko in /opt/mamba/lib/python3.10/site-packages (0.4.0)
Requirement already satisfied: numpy>=1.18 in /opt/mamba/lib/python3.10/site-packages (from meeko) (1.24.3)
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
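
Before moving on, it can help to confirm that the packages installed above are importable in the current kernel. The following is a minimal sketch using only the standard library. The helper name missing_modules is illustrative and not part of Jupyter Dock, and the list of module names is an assumption based on the imports used later in this notebook.

```python
import importlib.util

def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]

# Top-level module names assumed from the import cell later in this notebook.
required = ["pymol", "py3Dmol", "vina", "openbabel", "rdkit",
            "meeko", "MDAnalysis", "prolif"]
print("Missing modules:", missing_modules(required) or "none")
```

If anything is reported as missing, rerunning the corresponding install cell is usually the first thing to try.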
[1]
!git clone https://github.com/AngelRuizMoreno/Jupyter_Dock
Cloning into 'Jupyter_Dock'...
remote: Enumerating objects: 2177, done.
remote: Counting objects: 100% (57/57), done.
remote: Compressing objects: 100% (29/29), done.
remote: Total 2177 (delta 39), reused 30 (delta 28), pack-reused 2120
Receiving objects: 100% (2177/2177), 30.84 MiB | 3.16 MiB/s, done.
Resolving deltas: 100% (1664/1664), done.
Updating files: 100% (596/596), done.
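
Because the ledock and lepro binaries expire after a limited period of use, we download the latest versions to replace the outdated files bundled with the repository.
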
[2]
!wget http://www.lephar.com/download/lepro_linux_x86 -O ./Jupyter_Dock/bin/lepro_linux_x86
--2023-06-15 13:03:53--  http://www.lephar.com/download/lepro_linux_x86
Resolving ga.dp.tech (ga.dp.tech)... 10.255.255.41
Connecting to ga.dp.tech (ga.dp.tech)|10.255.255.41|:8118... connected.
Proxy request sent, awaiting response... 200 OK
Length: 393683 (384K) [application/octet-stream]
Saving to: ‘./Jupyter_Dock/bin/lepro_linux_x86’

./Jupyter_Dock/bin/ 100%[===================>] 384.46K  16.3KB/s    in 18s     

2023-06-15 13:04:12 (21.8 KB/s) - ‘./Jupyter_Dock/bin/lepro_linux_x86’ saved [393683/393683]

[3]
!wget http://www.lephar.com/download/ledock_linux_x86 -O ./Jupyter_Dock/bin/ledock_linux_x86
--2023-06-15 13:04:13--  http://www.lephar.com/download/ledock_linux_x86
Resolving ga.dp.tech (ga.dp.tech)... 10.255.255.41
Connecting to ga.dp.tech (ga.dp.tech)|10.255.255.41|:8118... connected.
Proxy request sent, awaiting response... 200 OK
Length: 362677 (354K) [application/octet-stream]
Saving to: ‘./Jupyter_Dock/bin/ledock_linux_x86’

./Jupyter_Dock/bin/ 100%[===================>] 354.18K  19.7KB/s    in 23s     

2023-06-15 13:04:37 (15.4 KB/s) - ‘./Jupyter_Dock/bin/ledock_linux_x86’ saved [362677/362677]

[4]
!chmod +x -R ./Jupyter_Dock/bin
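
A slow or interrupted download can leave an empty or unusable binary behind, so a quick check after chmod may save debugging time later. This is a minimal sketch, not part of Jupyter Dock; check_executables is a hypothetical helper, and the paths are the ones used in the cells above.

```python
import os

def check_executables(paths):
    """Map each path to True if it is an existing, non-empty, executable file."""
    status = {}
    for path in paths:
        status[path] = (
            os.path.isfile(path)
            and os.path.getsize(path) > 0
            and os.access(path, os.X_OK)
        )
    return status

# Paths follow the layout used in the download cells above.
binaries = [
    "./Jupyter_Dock/bin/lepro_linux_x86",
    "./Jupyter_Dock/bin/ledock_linux_x86",
]
for path, ok in check_executables(binaries).items():
    print(path, "OK" if ok else "missing, empty, or not executable")
```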

Import the required packages

[4]
from pymol import cmd
import py3Dmol

from vina import Vina

from openbabel import pybel

from rdkit import Chem
from rdkit.Chem import AllChem, Draw

from meeko import MoleculePreparation
from meeko import obutils

import MDAnalysis as mda
from MDAnalysis.coordinates import PDB

import prolif as plf
from prolif.plotting.network import LigNetwork


import sys, os
sys.path.insert(1, 'utilities/')
from Jupyter_Dock.utilities.utils import fix_protein, getbox, generate_ledock_file, pdbqt_to_sdf, dok_to_sdf


import warnings
warnings.filterwarnings("ignore")
%config Completer.use_jedi = False
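
Before customizing any parameters, it is a good idea to run the test system first. Afterwards, users can point to their own input files and save each set of outputs separately.
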
[6]
os.chdir('Jupyter_Dock/test/Molecular_Docking/')
[8]
!rm -rf ./*
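
Clearing the test folder with rm -rf is irreversible. As an alternative, the following minimal sketch creates a dedicated working directory instead. The helper name make_workdir and the directory name docking_run are illustrative assumptions, not part of Jupyter Dock. Note that the relative paths used later (such as ../../bin/) assume the repository's test folder as the working directory and would need adjusting.

```python
import os
import tempfile

def make_workdir(base, name="docking_run"):
    """Create (if needed) and return a dedicated working directory under `base`."""
    path = os.path.join(base, name)
    os.makedirs(path, exist_ok=True)
    return path

# The system temp directory is used here only as a neutral example location.
workdir = make_workdir(tempfile.gettempdir())
print(workdir)
```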
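
1. Preparing the docking system

Here we show how to use PyMOL's cmd module to download the PDB file for the system. PyMOL is only one of many ways to obtain it.

We will download 1AZ8 as the example docking system.
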
[9]
cmd.fetch(code='1AZ8',type='pdb1')
cmd.select(name='Prot',selection='polymer.protein')
cmd.select(name='Lig',selection='organic')
cmd.save(filename='1AZ8_clean.pdb',format='pdb',selection='Prot')
cmd.save(filename='1AZ8_lig.mol2',format='mol2',selection='Lig')
cmd.delete('all')
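
As one of the other ways to obtain a structure, the entry can be downloaded from the RCSB file server with the standard library. This is a minimal sketch under stated assumptions: rcsb_pdb_url and download_pdb are hypothetical helpers, and the URL pattern returns the standard entry file, whereas the cell above requests type='pdb1' (the first biological assembly), so the two files may differ.

```python
import urllib.request

RCSB_DOWNLOAD = "https://files.rcsb.org/download/{pdb_id}.pdb"

def rcsb_pdb_url(pdb_id):
    """Build the RCSB download URL for a four-character PDB ID."""
    pdb_id = pdb_id.strip().upper()
    if len(pdb_id) != 4 or not pdb_id.isalnum():
        raise ValueError(f"Not a valid PDB ID: {pdb_id!r}")
    return RCSB_DOWNLOAD.format(pdb_id=pdb_id)

def download_pdb(pdb_id, filename=None):
    """Download the entry file and return the local filename (needs network access)."""
    filename = filename or f"{pdb_id.strip().upper()}.pdb"
    urllib.request.urlretrieve(rcsb_pdb_url(pdb_id), filename)
    return filename

print(rcsb_pdb_url("1az8"))  # https://files.rcsb.org/download/1AZ8.pdb
# Call download_pdb("1AZ8") to actually fetch the file.
```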

As the listing below shows, the required files have been downloaded and written successfully.

[10]
!ls
1AZ8_clean.pdb	1AZ8_lig.mol2  1az8.pdb1
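
2. Protein and ligand preprocessing

2.1. Protein preprocessing

Method 1: LePro

The Lephar docking suite includes LePro, a powerful tool that automatically prepares proteins for molecular docking. It is therefore the default choice for protein preparation in this workflow. Proteins processed with LePro can be used with both AutoDock Vina and LeDock.
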
[22]
!../../bin/lepro_linux_x86
************************************************************
*      LePro                                               *
*            Add hydrogen atoms to a protein &             *
*            write the input file for LeDock               *
*      Copyright (C) 2013-21 Hongtao Zhao, PhD             *
*      Email: htzhaovv@gmail.com                           *
************************************************************
----------Usage:                                                                       
          lepro [PDB file] [-rot || -metal || -p]                                        
          -rot  [[chain] resid] align principal axes of the binding site with Cartesian
          -metal keep ZN/MN/CA/MG                                                      
          -metal -p redistribute metal charge to protein                               

[15]
!../../bin/lepro_linux_x86 1AZ8_clean.pdb

os.rename('pro.pdb','1AZ8_clean_H.pdb') # LePro writes its output to pro.pdb; rename it to '1AZ8_clean_H.pdb'

Method 2: fix_protein (PDBFixer)

For proteins with missing amino acids (residues), or when a more thorough preparation is wanted, Jupyter Dock provides the fix_protein() function, which uses PDBFixer to correct a range of common problems in protein PDB files. PDBFixer can also assign pH-dependent protonation states to the protein.

Warning 1: Structures repaired with fix_protein() may cause problems when passed to AutoDock Tools' prepare_receptor or to LeDock. To work around this, set the option -A hydrogens in prepare_receptor.

Suggestion: PDBFixer is a good solution because it resolves several kinds of errors in PDB files. However, it renumbers residues starting from 1, regardless of the numbering in the original PDB file. To address this, fix_protein() includes a protocol that attempts to restore the residue numbering of the original PDB file automatically.

fix_protein ( params )

Params:

  • filename: str or path-like ; input file containing the protein structure to be modified, file extension must be pdb

  • addHs_pH: float ; add hydrogens at a user-defined pH

  • try_renumberResidues: bool ; by default PDBFixer renumbers residues starting from 1; this option tries to recover the original residue numbering

  • output: str or path-like ; output filename, extension must be pdb

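# Write to a separate file so the LePro output (1AZ8_clean_H.pdb) is not overwritten;
# the ProLIF analysis cells later in this notebook load 1AZ8_clean_H_fix.pdb.
fix_protein(filename='1AZ8_clean.pdb',addHs_pH=7.4,try_renumberResidues=True,output='1AZ8_clean_H_fix.pdb')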
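
To see whether the original numbering survived, the residue identifiers of the input and output files can be compared. A minimal sketch using only the standard library follows. residue_ids and atom_line are illustrative helpers, not Jupyter Dock functions, and the ATOM records are made up for the demonstration.

```python
def residue_ids(pdb_text):
    """Return ordered (chain, number, insertion code) tuples, one per residue."""
    residues = []
    for line in pdb_text.splitlines():
        if line.startswith(("ATOM", "HETATM")):
            # Fixed PDB columns: chain = 22, residue number = 23-26, insertion code = 27.
            key = (line[21], int(line[22:26]), line[26:27].strip())
            if not residues or residues[-1] != key:
                residues.append(key)
    return residues

def atom_line(serial, name, resname, chain, resseq):
    """Format a minimal ATOM record (coordinates omitted) for the demo below."""
    return f"ATOM  {serial:>5} {name:<4} {resname:>3} {chain}{resseq:>4}"

# Made-up two-residue fragments, for illustration only.
original = "\n".join([atom_line(1, "N", "ILE", "A", 16),
                      atom_line(2, "CA", "ILE", "A", 16),
                      atom_line(3, "N", "VAL", "A", 17)])
renumbered = "\n".join([atom_line(1, "N", "ILE", "A", 1),
                        atom_line(2, "CA", "ILE", "A", 1),
                        atom_line(3, "N", "VAL", "A", 2)])

print(residue_ids(original))    # [('A', 16, ''), ('A', 17, '')]
print(residue_ids(renumbered))  # [('A', 1, ''), ('A', 2, '')]
print("Numbering preserved:", residue_ids(original) == residue_ids(renumbered))
```

In the notebook, the same function could be applied to the text of 1AZ8_clean.pdb and of the file written by fix_protein(). Residues added by PDBFixer would show up as extra entries, so an exact match is not always expected.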
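
2.2. Ligand preprocessing

Because ligands are so diverse and come in many formats, ligand preprocessing and preparation can be one of the hardest tasks. Setting the protonation state of a ligand, for example, can be difficult. Users are strongly advised to know and understand the correct state of their ligands before using Jupyter Dock or any other molecular docking method.

In this example, the ligand that PyMOL separated from the protein may carry incorrect bond orders, which is why Chem.MolFromMol2File is called with sanitize=False in the next cell.
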
[11]
m=Chem.MolFromMol2File('1AZ8_lig.mol2',sanitize=False)
Draw.MolToImage(m)
[13:07:07] 1AZ8: Warning - no explicit hydrogens in mol2 file but needed for formal charge estimation.
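
Tip: For simple preprocessing problems, one solution is to use OpenBabel (through Pybel) to convert the molecule and add the hydrogens needed for docking. OpenBabel perceives molecular structure differently from RDKit, which is why it can handle this conversion.
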
[12]
mol= [m for m in pybel.readfile(filename='1AZ8_lig.mol2',format='mol2')][0]
mol.addh()
out=pybel.Outputfile(filename='1AZ8_lig_H.mol2',format='mol2',overwrite=True)
out.write(mol)
out.close()
==============================
*** Open Babel Warning  in ReadMolecule
  Failed to kekulize aromatic bonds in MOL2 file (title is 1AZ8)
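
A simple way to confirm that addh() had an effect is to count atoms and hydrogens in the MOL2 files before and after. The sketch below parses the @<TRIPOS>ATOM section with the standard library only. count_mol2_atoms is a hypothetical helper and the three-atom sample is made up for illustration.

```python
def count_mol2_atoms(mol2_text):
    """Return (total atoms, hydrogen atoms) from the first @<TRIPOS>ATOM section."""
    total = hydrogens = 0
    in_atoms = False
    for line in mol2_text.splitlines():
        if line.startswith("@<TRIPOS>"):
            if in_atoms:
                break  # the ATOM section has ended
            in_atoms = line.strip() == "@<TRIPOS>ATOM"
            continue
        fields = line.split()
        if in_atoms and len(fields) >= 6:
            total += 1
            # The sixth field is the SYBYL atom type, e.g. "C.ar", "N.pl3", "H".
            if fields[5].split(".")[0] == "H":
                hydrogens += 1
    return total, hydrogens

# Made-up fragment, for illustration only.
sample = """@<TRIPOS>MOLECULE
demo
 3 2 0 0 0
SMALL
NO_CHARGES

@<TRIPOS>ATOM
      1 C1          0.0000    0.0000    0.0000 C.3     1  LIG1        0.0000
      2 H1          1.0900    0.0000    0.0000 H       1  LIG1        0.0000
      3 O1         -0.5000    1.2000    0.0000 O.3     1  LIG1        0.0000
@<TRIPOS>BOND
     1     1     2    1
     2     1     3    1
"""
print(count_mol2_atoms(sample))  # (3, 1)
```

Applied to 1AZ8_lig.mol2 and 1AZ8_lig_H.mol2, the hydrogen count should be higher in the second file.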


After ligand preprocessing, RDKit can read and display the new molecule without sanitize=False. In addition, the resulting structure matches the one reported in the PDB (entry 1AZ8).

[13]
m=Chem.MolFromMol2File('1AZ8_lig_H.mol2')
m
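
3. System visualization

Thanks to py3Dmol, Jupyter Dock can visualize ligand-protein complexes and docking results directly in the notebook.

Now that the protein and ligand have been preprocessed, it is advisable to inspect the ligand-protein reference system visually.
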
[16]
view = py3Dmol.view()
view.removeAllModels()
view.setViewStyle({'style':'outline','color':'black','width':0.1})

view.addModel(open('1AZ8_clean_H.pdb','r').read(),format='pdb')
Prot=view.getModel()
Prot.setStyle({'cartoon':{'arrows':True, 'tubes':True, 'style':'oval', 'color':'white'}})
view.addSurface(py3Dmol.VDW,{'opacity':0.6,'color':'white'})


view.addModel(open('1AZ8_lig_H.mol2','r').read(),format='mol2')
ref_m = view.getModel()
ref_m.setStyle({},{'stick':{'colorscheme':'greenCarbon','radius':0.2}})

view.zoomTo()
view.show()
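
Tip: You can drag the structure with the mouse to rotate it and use the scroll wheel to zoom in or out.

4. Docking with Smina

Although AutoDock Vina 1.2.0 has a number of limitations, other tools built on AutoDock Vina offer useful extras such as custom scoring functions (smina), faster execution (qvina), and larger search spaces (qvina-w). Jupyter Dock can run these binaries from within the notebook, giving users more options.

Smina is a fork of AutoDock Vina tailored to support scoring-function development and high-performance energy minimization. It is maintained by David Koes at the University of Pittsburgh and is not directly affiliated with the AutoDock project.

Tip: The cells below run the current docking example with Smina. The qvina and qvina-w executables are also available in the bin directory of the Jupyter Dock repository, so users can work with them by adding the necessary cells or by swapping out the docking engine.

4.1. Receptor preparation

Although Smina is a modified version of AutoDock Vina, its receptor input can be either a PDBQT file containing all residues or a PDB file with explicit hydrogens on all residues. At this point we can reuse the output of the preprocessing step, including the protein structure provided by Jupyter Dock's fix_protein() function.

4.2. Ligand preparation

In Smina, any format supported by OpenBabel can be used for the ligand input and for the docking results, just as with the receptor. After preprocessing, we can therefore use the ligand MOL2 file directly.

4.3. Defining the docking box

We need to define the coordinates of the box center, center (x, y, z), and the box size along the x, y, and z axes. Units: angstroms.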
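
One common way to choose these values is to enclose a reference ligand in an axis-aligned box and add some padding on every side. The sketch below illustrates that idea only; box_from_coords is a hypothetical helper, not the getbox() implementation from Jupyter Dock, and the coordinates are made up.

```python
def box_from_coords(coords, padding=5.0):
    """Return (center, size) of an axis-aligned box enclosing coords plus padding."""
    center, size = [], []
    for axis in zip(*coords):  # all x values, then all y values, then all z values
        lo, hi = min(axis) - padding, max(axis) + padding
        center.append((lo + hi) / 2)
        size.append(hi - lo)
    return tuple(center), tuple(size)

# Made-up ligand coordinates in angstroms, for illustration only.
ligand_xyz = [(1.0, 2.0, 3.0), (4.0, 0.0, 5.0), (2.0, 6.0, 1.0)]
center, size = box_from_coords(ligand_xyz, padding=5.0)
print(center)  # (2.5, 3.0, 3.0)
print(size)    # (13.0, 16.0, 14.0)
```

The resulting numbers map directly onto the center_x/center_y/center_z and size_x/size_y/size_z options listed in the Smina help output.
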

4.4. Docking

Jupyter Dock ships with smina executables for Linux and macOS. Running the binary without arguments lists the available parameters.

[33]
!../../bin/smina #Launch this cell to see parameters
Missing receptor.

Correct usage:

Input:
  -r [ --receptor ] arg         rigid part of the receptor (PDBQT)
  --flex arg                    flexible side chains, if any (PDBQT)
  -l [ --ligand ] arg           ligand(s)
  --flexres arg                 flexible side chains specified by comma 
                                separated list of chain:resid or 
                                chain:resid:icode
  --flexdist_ligand arg         Ligand to use for flexdist
  --flexdist arg                set all side chains within specified distance 
                                to flexdist_ligand to flexible

Search space (required):
  --center_x arg                X coordinate of the center
  --center_y arg                Y coordinate of the center
  --center_z arg                Z coordinate of the center
  --size_x arg                  size in the X dimension (Angstroms)
  --size_y arg                  size in the Y dimension (Angstroms)
  --size_z arg                  size in the Z dimension (Angstroms)
  --autobox_ligand arg          Ligand to use for autobox
  --autobox_add arg             Amount of buffer space to add to auto-generated
                                box (default +4 on all six sides)
  --no_lig                      no ligand; for sampling/minimizing flexible 
                                residues

Scoring and minimization options:
  --scoring arg                 specify alternative builtin scoring function
  --custom_scoring arg          custom scoring function file
  --custom_atoms arg            custom atom type parameters file
  --score_only                  score provided ligand pose
  --local_only                  local search only using autobox (you probably 
                                want to use --minimize)
  --minimize                    energy minimization
  --randomize_only              generate random poses, attempting to avoid 
                                clashes
  --minimize_iters arg (=0)     number iterations of steepest descent; default 
                                scales with rotors and usually isn't sufficient
                                for convergence
  --accurate_line               use accurate line search
  --minimize_early_term         Stop minimization before convergence conditions
                                are fully met.
  --approximation arg           approximation (linear, spline, or exact) to use
  --factor arg                  approximation factor: higher results in a 
                                finer-grained approximation
  --force_cap arg               max allowed force; lower values more gently 
                                minimize clashing structures
  --user_grid arg               Autodock map file for user grid data based 
                                calculations
  --user_grid_lambda arg (=-1)  Scales user_grid and functional scoring
  --print_terms                 Print all available terms with default 
                                parameterizations
  --print_atom_types            Print all available atom types

Output (optional):
  -o [ --out ] arg              output file name, format taken from file 
                                extension
  --out_flex arg                output file for flexible receptor residues
  --log arg                     optionally, write log file
  --atom_terms arg              optionally write per-atom interaction term 
                                values
  --atom_term_data              embedded per-atom interaction terms in output 
                                sd data

Misc (optional):
  --cpu arg                     the number of CPUs to use (the default is to 
                                try to detect the number of CPUs or, failing 
                                that, use 1)
  --seed arg                    explicit random seed
  --exhaustiveness arg (=8)     exhaustiveness of the global search (roughly 
                                proportional to time)
  --num_modes arg (=9)          maximum number of binding modes to generate
  --energy_range arg (=3)       maximum energy difference between the best 
                                binding mode and the worst one displayed 
                                (kcal/mol)
  --min_rmsd_filter arg (=1)    rmsd value used to filter final poses to remove
                                redundancy
  -q [ --quiet ]                Suppress output messages
  --addH arg                    automatically add hydrogens in ligands (on by 
                                default)

Configuration file (optional):
  --config arg                  the above options can be put here

Information (optional):
  --help                        display usage summary
  --help_hidden                 display usage summary with hidden options
  --version                     display program version

[34]
!../../bin/smina -r '1AZ8_clean_H.pdb' -l '1AZ8_lig_H.mol2' -o '1AZ8_lig_smina_out.sdf' --center_x 31.859 --center_y 13.34 --center_z 17.065 --size_x 24.569 --size_y 18.12 --size_z 17.37 --exhaustiveness 8 --num_modes 10
   _______  _______ _________ _        _______ 
  (  ____ \(       )\__   __/( (    /|(  ___  )
  | (    \/| () () |   ) (   |  \  ( || (   ) |
  | (_____ | || || |   | |   |   \ | || (___) |
  (_____  )| |(_)| |   | |   | (\ \) ||  ___  |
        ) || |   | |   | |   | | \   || (   ) |
  /\____) || )   ( |___) (___| )  \  || )   ( |
  \_______)|/     \|\_______/|/    )_)|/     \|


smina is based off AutoDock Vina. Please cite appropriately.

Weights      Terms
-0.035579    gauss(o=0,_w=0.5,_c=8)
-0.005156    gauss(o=3,_w=2,_c=8)
0.840245     repulsion(o=0,_c=8)
-0.035069    hydrophobic(g=0.5,_b=1.5,_c=8)
-0.587439    non_dir_h_bond(g=-0.7,_b=0,_c=8)
1.923        num_tors_div

Using random seed: -1182802522

0%   10   20   30   40   50   60   70   80   90   100%
|----|----|----|----|----|----|----|----|----|----|
***************************************************

mode |   affinity | dist from best mode
     | (kcal/mol) | rmsd l.b.| rmsd u.b.
-----+------------+----------+----------
1       -8.2       0.000      0.000    
2       -7.7       4.056      7.507    
3       -7.6       4.178      7.794    
4       -7.6       1.210      1.700    
5       -7.5       4.827      8.301    
6       -7.2       3.678      6.044    
7       -7.1       4.482      7.346    
8       -7.1       5.201      9.050    
9       -7.1       4.048      7.380    
10      -7.0       4.660      7.053    
Refine time 77.740
Loop time 78.576
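
For further analysis it is often handy to have the results table as Python data rather than printed text. The following minimal sketch parses a Vina-style table with a regular expression. parse_modes is a hypothetical helper, and it assumes the text has been captured first, for example with the --log option shown in the help output above.

```python
import re

# One table row: mode index, affinity, RMSD lower bound, RMSD upper bound.
ROW = re.compile(r"^\s*(\d+)\s+(-?\d+\.\d+)\s+(\d+\.\d+)\s+(\d+\.\d+)\s*$")

def parse_modes(log_text):
    """Extract (mode, affinity, rmsd_lb, rmsd_ub) tuples from a Vina-style table."""
    modes = []
    for line in log_text.splitlines():
        match = ROW.match(line)
        if match:
            mode, affinity, lb, ub = match.groups()
            modes.append((int(mode), float(affinity), float(lb), float(ub)))
    return modes

# First three rows copied from the Smina output above.
log = """
mode |   affinity | dist from best mode
     | (kcal/mol) | rmsd l.b.| rmsd u.b.
-----+------------+----------+----------
1       -8.2       0.000      0.000
2       -7.7       4.056      7.507
3       -7.6       4.178      7.794
"""
modes = parse_modes(log)
best = min(modes, key=lambda m: m[1])  # most negative affinity
print(best)  # (1, -8.2, 0.0, 0.0)
```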
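
4.5. 3D view of the docking results

As with the system visualization (section 3), the docking results can be inspected and compared with the reference structure, if one exists. Smina stores the docking score as a property of each molecule under the name "minimizedAffinity".
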
[35]
view = py3Dmol.view()
view.removeAllModels()
view.setViewStyle({'style':'outline','color':'black','width':0.1})

view.addModel(open('1AZ8_clean_H.pdb','r').read(),format='pdb')
Prot=view.getModel()
Prot.setStyle({'cartoon':{'arrows':True, 'tubes':True, 'style':'oval', 'color':'white'}})
view.addSurface(py3Dmol.VDW,{'opacity':0.6,'color':'white'})


view.addModel(open('1AZ8_lig_H.mol2','r').read(),format='mol2')
ref_m = view.getModel()
ref_m.setStyle({},{'stick':{'colorscheme':'magentaCarbon','radius':0.2}})


results=Chem.SDMolSupplier('1AZ8_lig_smina_out.sdf')

p=Chem.MolToMolBlock(results[0],False)

print('Reference: Magenta | Smina Pose: Cyan')
print ('Score: {}'.format(results[0].GetProp('minimizedAffinity')))

view.addModel(p,'mol')
x = view.getModel()
x.setStyle({},{'stick':{'colorscheme':'cyanCarbon','radius':0.2}})

view.zoomTo()
view.show()
Reference: Magenta | Smina Pose: Cyan
Score: -8.15442
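
The cell above prints the score of the first pose only. To collect the same property for every pose, the SDF data fields can be read record by record. This is a minimal standard-library sketch; sdf_property is a hypothetical helper, the mol blocks in the sample are omitted, and the second score is made up for illustration.

```python
def sdf_property(sdf_text, name):
    """Return the value of data field `name` for every record in an SDF string."""
    values = []
    for record in sdf_text.split("$$$$"):
        lines = record.splitlines()
        for i, line in enumerate(lines):
            # A data header looks like "> <name>"; its value is on the next line.
            if line.startswith(">") and f"<{name}>" in line and i + 1 < len(lines):
                values.append(lines[i + 1].strip())
                break
    return values

# Stub records: the first score is the one printed above, the second is made up.
sample = """pose_1
(mol block omitted)
M  END
> <minimizedAffinity>
-8.15442

$$$$
pose_2
(mol block omitted)
M  END
> <minimizedAffinity>
-7.70000

$$$$
"""
scores = [float(v) for v in sdf_property(sample, "minimizedAffinity")]
print(scores)  # [-8.15442, -7.7]
```

With RDKit already imported, the same values can also be read by iterating over Chem.SDMolSupplier and calling GetProp('minimizedAffinity') on each molecule, as the cell above does for the first pose.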
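
4.6. 2D interaction diagram

Inspecting intermolecular interactions is one of the most common analyses performed after a docking experiment. Jupyter Dock therefore uses ProLIF to build a table of ligand-protein interactions and the corresponding 2D diagram.

Tip: ProLIF relies on RDKit and MDAnalysis to map the interactions between ligand and protein. Some protein preparation procedures, including LePro, can therefore produce structures that cause errors during the analysis. Jupyter Dock's fix_protein() function can be used to avoid these errors and provide a structure suitable for the analysis.
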
[36]
# load protein
prot = mda.Universe("1AZ8_clean_H_fix.pdb")
prot = plf.Molecule.from_mda(prot)
prot.n_residues

# load ligands
lig_suppl = list(plf.sdf_supplier('1AZ8_lig_smina_out.sdf'))
# generate fingerprint
fp = plf.Fingerprint()
fp.run_from_iterable(lig_suppl, prot)
results_df = fp.to_dataframe(return_atoms=True)
results_df
ligand UNL1
protein ASP189.A GLY148.A GLY216.A GLY219.A HIS57.A SER190.A SER195.A THR149.A
interaction HBDonor VdWContact VdWContact VdWContact HBDonor VdWContact HBDonor VdWContact VdWContact VdWContact VdWContact
Frame
0 (None, None) (8, 0) (22, 0) (None, None) (12, 0) (8, 0) (None, None) (None, None) (1, 0) (None, None) (None, None)
1 (None, None) (8, 0) (None, None) (None, None) (12, 0) (8, 0) (None, None) (None, None) (1, 0) (None, None) (None, None)
2 (None, None) (None, None) (None, None) (None, None) (12, 0) (8, 0) (None, None) (None, None) (7, 0) (None, None) (None, None)
3 (11, 0) (7, 0) (None, None) (None, None) (10, 0) (7, 0) (None, None) (None, None) (8, 0) (None, None) (21, 0)
4 (None, None) (None, None) (None, None) (None, None) (None, None) (None, None) (None, None) (None, None) (7, 0) (None, None) (None, None)
5 (None, None) (8, 0) (None, None) (None, None) (12, 0) (8, 0) (26, 0) (19, 0) (7, 0) (None, None) (None, None)
6 (None, None) (22, 0) (None, None) (None, None) (26, 0) (22, 0) (None, None) (None, None) (21, 0) (None, None) (None, None)
7 (None, None) (None, None) (None, None) (None, None) (None, None) (None, None) (None, None) (None, None) (None, None) (7, 0) (None, None)
8 (None, None) (22, 0) (None, None) (None, None) (26, 0) (22, 0) (None, None) (None, None) (21, 0) (None, None) (None, None)
9 (None, None) (22, 0) (None, None) (7, 0) (26, 0) (22, 0) (None, None) (None, None) (21, 0) (None, None) (None, None)
[37]
net = LigNetwork.from_ifp(results_df,lig_suppl[0],kind="frame", frame=0,rotation=270)
net.display()
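
Beyond looking at a single pose, it can be informative to ask how often each interaction appears across all poses. The sketch below works on plain Python data and assumes the table has been exported as one dict per pose (for example with pandas' to_dict('records')), where a pair starting with None marks an absent interaction. interaction_frequency is a hypothetical helper and the rows are made up.

```python
from collections import Counter

def interaction_frequency(rows):
    """Fraction of poses in which each interaction column is present.

    `rows` is a list of dicts (one per pose) mapping a column key to an
    atom-index pair; a pair of (None, None) means the interaction is absent.
    """
    if not rows:
        return {}
    counts = Counter()
    for row in rows:
        for key, pair in row.items():
            if pair[0] is not None:
                counts[key] += 1
    return {key: counts[key] / len(rows) for key in rows[0]}

# Made-up rows for four poses and two interaction columns.
rows = [
    {("HIS57.A", "VdWContact"): (12, 0), ("ASP189.A", "HBDonor"): (None, None)},
    {("HIS57.A", "VdWContact"): (12, 0), ("ASP189.A", "HBDonor"): (11, 0)},
    {("HIS57.A", "VdWContact"): (None, None), ("ASP189.A", "HBDonor"): (None, None)},
    {("HIS57.A", "VdWContact"): (10, 0), ("ASP189.A", "HBDonor"): (None, None)},
]
for key, fraction in interaction_frequency(rows).items():
    print(key, fraction)
```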
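
6. Docking with LeDock

LeDock is designed for flexible docking of small molecules into proteins. It achieves a pose-prediction accuracy above 90% on the Astex diversity set and takes about 3 seconds per run for a drug-like molecule. It has led to the discovery of novel kinase inhibitors and bromodomain antagonists in high-throughput virtual screening campaigns. It takes the SYBYL Mol2 format directly as small-molecule input.

6.1. Receptor preparation

In LeDock, the receptor input is a PDB file with explicit hydrogens on all residues. LePro was created to prepare protein structures for docking with LeDock. At this stage we can therefore use the file from the preprocessing step, including the protein structure provided by Jupyter Dock's fix_protein() function.

6.2. Ligand preparation

As mentioned above, LeDock takes ligands in MOL2 format. As with the receptor, the preprocessed ligand can be used directly.

6.3. Defining the docking box

This step is done in the same way as defining the search space for AutoDock Vina. To obtain the same search space in LeDock format, the user only needs to change the "software" parameter from "vina" to "ledock".

Tip: The getbox() function makes it easy to reproduce a binding site between AutoDock Vina and LeDock, so the same box can be used in both programs.
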
[38]
cmd.load(filename='1AZ8_clean_H.pdb',format='pdb',object='prot')
cmd.load(filename='1AZ8_lig_H.mol2',format='mol2',object='lig')

X,Y,Z= getbox(selection='lig',extending=5.0,software='ledock')
cmd.delete('all')

print(X,'\n',Y,'\n',Z)
{'minX': 19.57430076599121, 'maxX': 44.143798828125} 
 {'minY': 4.285799980163574, 'maxY': 22.409099578857422} 
 {'minZ': 8.378700256347656, 'maxZ': 25.75309944152832}
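
The LeDock box is given as per-axis minimum and maximum values, while Vina-style programs expect a center and a size. The conversion is simple arithmetic, sketched below with the bounds printed above. minmax_to_center_size is a hypothetical helper, not a Jupyter Dock function.

```python
def minmax_to_center_size(x, y, z):
    """Convert per-axis (min, max) bounds into (center, size), each a 3-tuple."""
    bounds = (x, y, z)
    center = tuple((lo + hi) / 2 for lo, hi in bounds)
    size = tuple(hi - lo for lo, hi in bounds)
    return center, size

# Bounds printed by the cell above.
X = {'minX': 19.57430076599121, 'maxX': 44.143798828125}
Y = {'minY': 4.285799980163574, 'maxY': 22.409099578857422}
Z = {'minZ': 8.378700256347656, 'maxZ': 25.75309944152832}

center, size = minmax_to_center_size(
    (X['minX'], X['maxX']), (Y['minY'], Y['maxY']), (Z['minZ'], Z['maxZ'])
)
print([round(c, 3) for c in center])
print([round(s, 3) for s in size])
```

The values agree, to within about 0.01 angstrom, with the center and size options passed to Smina in section 4, which confirms that both programs search the same region.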

6.4. Docking

Running LeDock requires a configuration file (usually called dock.in) that contains all docking parameters together with information about the receptor and the ligands to dock. Jupyter Dock generates this file automatically with the generate_ledock_file() function. Once the parameters are set, the user obtains the configuration file and a file listing the ligands to dock (usually named ligand.list). After that, docking only requires launching the LeDock executable with the configuration file as its argument.

generate_ledock_file( params )

params:

  • receptor: str or path-like string ; protein file for docking including hydrogens, format must be pdb
  • x: 2 element list of floats [ float , float ]; Xmin and Xmax coordinates of docking box
  • y: 2 element list of floats [ float , float ]; Ymin and Ymax coordinates of docking box
  • z: 2 element list of floats [ float , float ]; Zmin and Zmax coordinates of docking box
  • n_poses: float ; number of poses to retrieve from docking
  • rmsd: float ; minimum RMSD difference between docking poses
  • l_list: list of n strings or path-like strings [ lig1, lig2, lig3 ... ] ; list of ligands or ligand paths to dock
  • l_list_outfile: str or path-like string ; filename to save the ligand list, needed for ledock to locate ligands
  • out: str or path-like string ; output file for the docking parameters, needed to launch the docking

[39]
generate_ledock_file(receptor='1AZ8_clean_H.pdb',x=[X['minX'],X['maxX']],
y=[Y['minY'],Y['maxY']],
z=[Z['minZ'],Z['maxZ']],
n_poses=10,
rmsd=1.0,
l_list='1AZ8_lig_H.mol2',
l_list_outfile='ledock_ligand.list',
out='dock.in')
[40]
!../../bin/ledock_linux_x86 #Launch this cell to see parameters
************************************************************
*       LeDock v1.0                                        *
*       Molecular Docking Software                         *
*       Copyright 2013-21 (C) H. Zhao PhD                  *
*       For academic use only                              *
*       www.lephar.com                                     *
************************************************************
--------Usage:
--------ledock config.file    !docking
--------ledock -spli dok.file !split into separate coordinates

[44]
!../../bin/ledock_linux_x86 dock.in
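
6.5. Converting the DOK results file to SDF

LeDock writes its final results to a file with the .dok extension, which resembles a PDB file and also holds the docking properties. However, .dok is not a widely used format for chemical structures, so Jupyter Dock can convert it to the widely used SDF format. As with the pdbqt-to-sdf conversion, Jupyter Dock preserves the chemical features and stores the "Pose" and "Score" results as molecule properties.

dok_to_sdf ( params )

params:

  • dok_file: str or path-like ; dok file from the LeDock run

  • output: str or path-like ; output file for the converted results, extension must be sdf
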
[46]
dok_to_sdf(dok_file='1AZ8_lig_H.dok',output='1AZ8_lig_ledock_out.sdf')
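
6.6. 3D view of the docking results

As with the system visualization (section 3), the docking results can be inspected and compared with the reference structure, if one exists. The ligand's "Pose" and "Score" properties are also printed to show how to access the molecule's properties.
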
[47]
view = py3Dmol.view()
view.removeAllModels()
view.setViewStyle({'style':'outline','color':'black','width':0.1})

view.addModel(open('1AZ8_clean_H.pdb','r').read(),format='pdb')
Prot=view.getModel()
Prot.setStyle({'cartoon':{'arrows':True, 'tubes':True, 'style':'oval', 'color':'white'}})
view.addSurface(py3Dmol.VDW,{'opacity':0.6,'color':'white'})


view.addModel(open('1AZ8_lig_H.mol2','r').read(),format='mol2')
ref_m = view.getModel()
ref_m.setStyle({},{'stick':{'colorscheme':'magentaCarbon','radius':0.2}})

results=Chem.SDMolSupplier('1AZ8_lig_ledock_out.sdf')


p=Chem.MolToMolBlock(results[0])

print('Reference: Magenta | LeDock Pose: Cyan')
print ('Pose: {} | Score: {}'.format(results[0].GetProp('Pose'),results[0].GetProp('Score')))

view.addModel(p,'mol')
x = view.getModel()
x.setStyle({},{'stick':{'colorscheme':'cyanCarbon','radius':0.2}})

view.zoomTo()
view.show()
Reference: Magenta | LeDock Pose: Cyan
Pose: 1 | Score: -8.89
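
6.7. 2D interaction diagram

Inspecting intermolecular interactions is one of the most common analyses performed after a docking experiment. Jupyter Dock therefore uses ProLIF to build a table of ligand-protein interactions and the corresponding 2D diagram.

Tip: ProLIF relies on RDKit and MDAnalysis to map the interactions between ligand and protein. Some protein preparation procedures, including LePro, can therefore produce structures that cause errors and block the analysis. Jupyter Dock's fix_protein() function can be used to avoid these errors and provide a structure suitable for the analysis.
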
[48]
# load protein
prot = mda.Universe("1AZ8_clean_H_fix.pdb",guess_bonds=True)
prot = plf.Molecule.from_mda(prot)
prot.n_residues

# load ligands
path = str('1AZ8_lig_ledock_out.sdf')
lig_suppl = list(plf.sdf_supplier(path))
# generate fingerprint
fp = plf.Fingerprint()
fp.run_from_iterable(lig_suppl, prot)
results_df = fp.to_dataframe(return_atoms=True)
results_df
ligand UNL1
protein ASN143.A ASP189.A CYS191.A CYS220.A GLN192.A ... SER195.A SER214.A THR149.A TRP215.A VAL213.A
interaction HBAcceptor VdWContact HBDonor VdWContact HBDonor VdWContact Hydrophobic VdWContact Hydrophobic VdWContact ... HBDonor VdWContact HBDonor VdWContact HBAcceptor HBDonor VdWContact VdWContact Hydrophobic VdWContact
Frame
0 (3, 11) (3, 11) (32, 11) (5, 11) (None, None) (None, None) (None, None) (5, 3) (14, 6) (23, 8) ... (None, None) (None, None) (None, None) (None, None) (None, None) (None, None) (None, None) (6, 8) (0, 8) (None, None)
1 (None, None) (1, 10) (32, 11) (5, 11) (None, None) (None, None) (None, None) (5, 3) (12, 6) (12, 8) ... (None, None) (None, None) (None, None) (None, None) (None, None) (None, None) (1, 12) (None, None) (0, 8) (None, None)
2 (None, None) (3, 11) (None, None) (5, 9) (None, None) (None, None) (None, None) (40, 9) (12, 6) (12, 8) ... (None, None) (29, 10) (None, None) (None, None) (None, None) (None, None) (None, None) (6, 2) (0, 8) (None, None)
3 (None, None) (1, 11) (33, 11) (5, 11) (None, None) (None, None) (None, None) (5, 3) (14, 6) (23, 8) ... (None, None) (None, None) (None, None) (37, 5) (None, None) (28, 12) (28, 12) (8, 5) (0, 8) (None, None)
4 (None, None) (21, 11) (32, 11) (5, 11) (None, None) (0, 3) (None, None) (5, 3) (16, 6) (23, 6) ... (None, None) (None, None) (None, None) (None, None) (None, None) (None, None) (None, None) (None, None) (None, None) (None, None)
5 (3, 11) (1, 11) (32, 11) (5, 11) (None, None) (None, None) (None, None) (23, 9) (12, 6) (14, 8) ... (None, None) (None, None) (None, None) (None, None) (3, 13) (None, None) (1, 12) (None, None) (0, 8) (None, None)
6 (None, None) (3, 10) (None, None) (None, None) (None, None) (5, 4) (None, None) (40, 9) (14, 6) (23, 8) ... (32, 0) (5, 9) (None, None) (8, 4) (None, None) (None, None) (None, None) (13, 15) (0, 8) (5, 8)
7 (None, None) (1, 10) (33, 11) (8, 10) (None, None) (11, 3) (None, None) (40, 9) (12, 6) (12, 8) ... (None, None) (None, None) (None, None) (None, None) (None, None) (None, None) (None, None) (39, 8) (0, 8) (29, 10)
8 (None, None) (1, 10) (None, None) (None, None) (34, 5) (8, 5) (None, None) (12, 9) (14, 6) (14, 1) ... (33, 9) (5, 9) (32, 5) (5, 5) (None, None) (None, None) (1, 12) (37, 8) (0, 8) (8, 10)
9 (None, None) (None, None) (None, None) (None, None) (None, None) (None, None) (23, 6) (5, 3) (2, 6) (12, 8) ... (None, None) (None, None) (None, None) (None, None) (None, None) (None, None) (None, None) (8, 4) (None, None) (None, None)

10 rows × 36 columns
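The table above stores, for every pose (frame) and every interaction column, a pair of atom indices, or (None, None) when the interaction is absent. A quick way to read it is to ask in what fraction of poses each interaction appears. The sketch below does this in plain Python. It assumes each cell is a (ligand atom index, protein atom index) tuple as shown in the output; the function name `interaction_frequency` and the labels `UNL1` and `RES_A` in the toy data are hypothetical placeholders, not names from this notebook.

```python
def interaction_frequency(records):
    """Return the fraction of poses in which each interaction is present.

    `records` holds one dict per pose. Keys are column labels such as
    (ligand, protein_residue, interaction); values are
    (ligand_atom_index, protein_atom_index) tuples, or (None, None)
    when the interaction was not detected in that pose.
    """
    counts = {}
    for row in records:
        for column, atom_pair in row.items():
            counts.setdefault(column, 0)  # keep columns that never occur
            if any(index is not None for index in atom_pair):
                counts[column] += 1
    n_poses = len(records)
    if n_poses == 0:
        return {}
    return {column: n / n_poses for column, n in counts.items()}


# Toy data with hypothetical labels, shaped like the rows printed above.
toy_records = [
    {("UNL1", "RES_A", "HBAcceptor"): (3, 11),
     ("UNL1", "RES_A", "VdWContact"): (3, 11)},
    {("UNL1", "RES_A", "HBAcceptor"): (None, None),
     ("UNL1", "RES_A", "VdWContact"): (1, 10)},
    {("UNL1", "RES_A", "HBAcceptor"): (None, None),
     ("UNL1", "RES_A", "VdWContact"): (None, None)},
    {("UNL1", "RES_A", "HBAcceptor"): (3, 11),
     ("UNL1", "RES_A", "VdWContact"): (1, 11)},
]

freq = interaction_frequency(toy_records)
# List the most persistent interactions first.
for column, fraction in sorted(freq.items(), key=lambda item: -item[1]):
    print(column, fraction)
```

If `results_df` is a pandas DataFrame laid out as above, `interaction_frequency(results_df.to_dict("records"))` should give the same kind of summary for the real docking poses; that call is an assumption about the DataFrame layout and was not run in this notebook.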

[49]
# Draw the ligand interaction network for the first pose (frame=0) from the ProLIF fingerprint
net = LigNetwork.from_ifp(results_df, lig_suppl[0], kind="frame", frame=0, rotation=270)
net.display()

References

  1. https://github.com/AngelRuizMoreno/Jupyter_Dock

If you use the tools in this notebook, please cite the relevant literature:

1. Jupyter Dock

Ruiz-Moreno A.J. Jupyter Dock: Molecular Docking integrated in Jupyter Notebooks. https://doi.org/10.5281/zenodo.5514956

2. AutoDock Vina

Eberhardt, J., Santos-Martins, D., Tillack, A.F., Forli, S. (2021). AutoDock Vina 1.2.0: New Docking Methods, Expanded Force Field, and Python Bindings. Journal of Chemical Information and Modeling.

Trott, O., & Olson, A. J. (2010). AutoDock Vina: improving the speed and accuracy of docking with a new scoring function, efficient optimization, and multithreading. Journal of computational chemistry, 31(2), 455-461.

3. LeDock

Wang Z, Sun H, Yao X, Li D, Xu L, Li Y, et al. Comprehensive evaluation of ten docking programs on a diverse set of protein–ligand complexes: the prediction accuracy of sampling power and scoring power. Phys Chem Chem Phys. 2016;18: 12964–12975. https://doi.org/10.1039/C6CP01555G.

4. AutoDock Tools

Morris, G. M., Huey, R., Lindstrom, W., Sanner, M. F., Belew, R. K., Goodsell, D. S., & Olson, A. J. (2009). AutoDock4 and AutoDockTools4: Automated docking with selective receptor flexibility. Journal of computational chemistry, 30(16), 2785–2791. https://doi.org/10.1002/jcc.21256.

5. PyMOL API

The PyMOL Molecular Graphics System, Version 1.2r3pre, Schrödinger, LLC.

6. OpenBabel

O'Boyle, N.M., Banck, M., James, C.A. et al. Open Babel: An open chemical toolbox. J Cheminform 3, 33 (2011). https://doi.org/10.1186/1758-2946-3-33.

7. RDKit

RDKit: Open-source cheminformatics; http://www.rdkit.org

8. py3Dmol

Keshavan Seshadri, Peng Liu, and David Ryan Koes. Journal of Chemical Education 2020 97 (10), 3872-3876. https://doi.org/10.1021/acs.jchemed.0c00579.

9. PDBFixer

P. Eastman, M. S. Friedrichs, J. D. Chodera, R. J. Radmer, C. M. Bruns, J. P. Ku, K. A. Beauchamp, T. J. Lane, L.-P. Wang, D. Shukla, T. Tye, M. Houston, T. Stich, C. Klein, M. R. Shirts, and V. S. Pande. 2013. OpenMM 4: A Reusable, Extensible, Hardware Independent Library for High Performance Molecular Simulation. Journal of Chemical Theory and Computation. ACS Publications. 9(1): 461-469.

10. MDAnalysis

R. J. Gowers, M. Linke, J. Barnoud, T. J. E. Reddy, M. N. Melo, S. L. Seyler, D. L. Dotson, J. Domanski, S. Buchoux, I. M. Kenney, and O. Beckstein. MDAnalysis: A Python package for the rapid analysis of molecular dynamics simulations. In S. Benthall and S. Rostrup, editors, Proceedings of the 15th Python in Science Conference, pages 98-105, Austin, TX, 2016. SciPy, doi:10.25080/majora-629e541a-00e.

N. Michaud-Agrawal, E. J. Denning, T. B. Woolf, and O. Beckstein. MDAnalysis: A Toolkit for the Analysis of Molecular Dynamics Simulations. J. Comput. Chem. 32 (2011), 2319-2327, doi:10.1002/jcc.21787. PMCID:PMC3144279.

11. ProLIF

chemosim-lab/ProLIF: v0.3.3 - 2021-06-11. https://doi.org/10.5281/zenodo.4386984.

12. Fpocket

Le Guilloux, V., Schmidtke, P. & Tuffery, P. Fpocket: An open source platform for ligand pocket detection. BMC Bioinformatics 10, 168 (2009). https://doi.org/10.1186/1471-2105-10-168.

13. Smina

Koes, D. R., Baumgartner, M. P., & Camacho, C. J. (2013). Lessons learned in empirical scoring with smina from the CSAR 2011 benchmarking exercise. Journal of chemical information and modeling, 53(8), 1893–1904. https://doi.org/10.1021/ci300604z