Introduction to Molecular Docking Programs | GNINA

Molecular docking

YangHe

Published 2023-09-24

Recommended image: Basic Image:bohrium-notebook:2023-04-07

Recommended machine type: c16_m62_1 * NVIDIA T4

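Background

What is molecular docking

Molecular docking predicts the most likely conformation (pose) of a compound in the receptor's binding pocket.

It consists of two main steps:

Sampling the conformational space
Scoring the sampled poses
- Ideally, the score equals the receptor-ligand binding affinity, or can at least be used to rank compounds by affinity
- Score $\neq$ binding free energy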
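
The two steps can be illustrated with a deliberately tiny toy model. This sketch is not how GNINA or Vina works internally: the one-dimensional "pose", the `toy_score` function, and the random-search sampler are all hypothetical stand-ins, chosen only to show the sample, score, and rank loop.

```python
import random

def toy_score(pose):
    """Hypothetical scoring function: lower is better, minimum at pose = 2.0."""
    return (pose - 2.0) ** 2

def sample_poses(n, low, high, seed=0):
    """Step 1: sample the (here one-dimensional) conformational space at random."""
    rng = random.Random(seed)
    return [rng.uniform(low, high) for _ in range(n)]

def dock(n=1000, top_k=3):
    """Step 2: score every sampled pose and keep the top_k best (score, pose) pairs."""
    poses = sample_poses(n, low=-5.0, high=5.0)
    scored = sorted((toy_score(p), p) for p in poses)
    return scored[:top_k]

best = dock()
for score, pose in best:
    print(f"pose={pose:.3f} score={score:.5f}")
```

Real docking programs search over translations, rotations, and torsions instead of a single number, but the division of labor between the sampler and the scoring function is the same.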

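Limitations of molecular docking

In practice, molecular docking is expected to be high-throughput. To achieve this, it relies on several approximations:

The receptor is usually kept rigid or semi-flexible
Ligand flexibility is usually modeled only through rotatable bonds (torsions)
No explicit solvent is included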

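Representative open-source software

AutoDock Vina

Written by Dr. Oleg Trott at the Scripps Research Institute

Apache License

First published in 2009; version 1.2.0 was released in 2021

smina

A fork of Vina that is easier to use, supports multiple input formats, and supports custom scoring functions.

Apache/GPL2 License

GNINA inherits all of smina's functionality.

GNINA

A fork of smina that uses convolutional neural networks (CNNs) to score receptor-ligand binding poses.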

Installing the software

GNINA

[1]

!wget https://github.com/gnina/gnina/releases/download/v1.0.3/gnina

!chmod +x ./gnina #make executable

--2023-09-24 22:15:51--  https://github.com/gnina/gnina/releases/download/v1.0.3/gnina
Resolving ga.dp.tech (ga.dp.tech)... 10.255.254.18, 10.255.254.37, 10.255.254.7
Connecting to ga.dp.tech (ga.dp.tech)|10.255.254.18|:8118... connected.
Proxy request sent, awaiting response... 302 Found
Location: https://objects.githubusercontent.com/github-production-release-asset-2e65be/45548146/6841601b-545d-4c9f-b4b1-f0eaa8ec7b74?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIAIWNJYAX4CSVEH53A%2F20230924%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20230924T141552Z&X-Amz-Expires=300&X-Amz-Signature=3ca8faf7043ee1f74d4be57887441383edbeddaa4762ca6804380daefab83f32&X-Amz-SignedHeaders=host&actor_id=0&key_id=0&repo_id=45548146&response-content-disposition=attachment%3B%20filename%3Dgnina&response-content-type=application%2Foctet-stream [following]
--2023-09-24 22:15:52--  https://objects.githubusercontent.com/github-production-release-asset-2e65be/45548146/6841601b-545d-4c9f-b4b1-f0eaa8ec7b74?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIAIWNJYAX4CSVEH53A%2F20230924%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20230924T141552Z&X-Amz-Expires=300&X-Amz-Signature=3ca8faf7043ee1f74d4be57887441383edbeddaa4762ca6804380daefab83f32&X-Amz-SignedHeaders=host&actor_id=0&key_id=0&repo_id=45548146&response-content-disposition=attachment%3B%20filename%3Dgnina&response-content-type=application%2Foctet-stream
Connecting to ga.dp.tech (ga.dp.tech)|10.255.254.18|:8118... connected.
Proxy request sent, awaiting response... 200 OK
Length: 305664576 (292M) [application/octet-stream]
Saving to: ‘gnina.1’

gnina.1             100%[===================>] 291.50M  4.51MB/s    in 66s     

2023-09-24 22:16:58 (4.44 MB/s) - ‘gnina.1’ saved [305664576/305664576]

py3Dmol

py3Dmol is a tool for visualizing molecular structures in notebooks.

[2]

!pip install py3Dmol

Looking in indexes: https://pypi.tuna.tsinghua.edu.cn/simple
Collecting py3Dmol
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/b5/3d/052e5932ef95624e118b886feb58a9c60595e89da74515604933b6b0e6a5/py3Dmol-2.0.4-py2.py3-none-any.whl (12 kB)
Installing collected packages: py3Dmol
Successfully installed py3Dmol-2.0.4
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv

openbabel

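openbabel is a powerful cheminformatics toolkit; here it is used to analyze the docking results.
In addition, GNINA uses openbabel to preprocess receptor and ligand structures, which is why GNINA supports many input formats.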

[3]

!apt-get update

!apt-get install openbabel -y

Get:1 http://security.ubuntu.com/ubuntu focal-security InRelease [114 kB]      
Get:2 https://deb.nodesource.com/node_18.x focal InRelease [4583 B]            
Get:3 http://security.ubuntu.com/ubuntu focal-security/universe amd64 Packages [1110 kB]
Hit:4 http://archive.ubuntu.com/ubuntu focal InRelease                         
Get:5 http://archive.ubuntu.com/ubuntu focal-updates InRelease [114 kB]        
Get:6 http://security.ubuntu.com/ubuntu focal-security/multiverse amd64 Packages [29.3 kB]
Get:7 http://security.ubuntu.com/ubuntu focal-security/main amd64 Packages [3065 kB]
Get:8 http://archive.ubuntu.com/ubuntu focal-backports InRelease [108 kB]      
Get:9 http://security.ubuntu.com/ubuntu focal-security/restricted amd64 Packages [2823 kB]
Get:10 https://deb.nodesource.com/node_18.x focal/main amd64 Packages [776 B]  
Get:11 http://archive.ubuntu.com/ubuntu focal-updates/universe amd64 Packages [1414 kB]
Get:12 http://archive.ubuntu.com/ubuntu focal-updates/multiverse amd64 Packages [32.0 kB]
Get:13 http://archive.ubuntu.com/ubuntu focal-updates/main amd64 Packages [3552 kB]
Get:14 http://archive.ubuntu.com/ubuntu focal-updates/restricted amd64 Packages [2974 kB]
Get:15 http://archive.ubuntu.com/ubuntu focal-backports/universe amd64 Packages [28.6 kB]
Get:16 http://archive.ubuntu.com/ubuntu focal-backports/main amd64 Packages [55.2 kB]
Get:17 https://developer.download.nvidia.cn/compute/cuda/repos/ubuntu2004/x86_64  InRelease [1581 B]
Get:18 https://developer.download.nvidia.cn/compute/cuda/repos/ubuntu2004/x86_64  Packages [1168 kB]
Fetched 16.6 MB in 13s (1242 kB/s)  
Reading package lists... Done
Reading package lists... Done
Building dependency tree       
Reading state information... Done
The following additional packages will be installed:
  libboost-iostreams1.71.0 libopenbabel6 libschroedinger-maeparser1
The following NEW packages will be installed:
  libboost-iostreams1.71.0 libopenbabel6 libschroedinger-maeparser1 openbabel
0 upgraded, 4 newly installed, 0 to remove and 145 not upgraded.
Need to get 4021 kB of archives.
After this operation, 20.0 MB of additional disk space will be used.
Get:1 http://archive.ubuntu.com/ubuntu focal/main amd64 libboost-iostreams1.71.0 amd64 1.71.0-6ubuntu6 [237 kB]
Get:2 http://archive.ubuntu.com/ubuntu focal/universe amd64 libschroedinger-maeparser1 amd64 1.2.2-1build1 [89.1 kB]
Get:3 http://archive.ubuntu.com/ubuntu focal/universe amd64 libopenbabel6 amd64 3.0.0+dfsg-3ubuntu3 [3568 kB]
Get:4 http://archive.ubuntu.com/ubuntu focal/universe amd64 openbabel amd64 3.0.0+dfsg-3ubuntu3 [127 kB]
Fetched 4021 kB in 16s (251 kB/s)                                              
Selecting previously unselected package libboost-iostreams1.71.0:amd64.
(Reading database ... 63384 files and directories currently installed.)
Preparing to unpack .../libboost-iostreams1.71.0_1.71.0-6ubuntu6_amd64.deb ...
Unpacking libboost-iostreams1.71.0:amd64 (1.71.0-6ubuntu6) ...
Selecting previously unselected package libschroedinger-maeparser1:amd64.
Preparing to unpack .../libschroedinger-maeparser1_1.2.2-1build1_amd64.deb ...
Unpacking libschroedinger-maeparser1:amd64 (1.2.2-1build1) ...
Selecting previously unselected package libopenbabel6.
Preparing to unpack .../libopenbabel6_3.0.0+dfsg-3ubuntu3_amd64.deb ...
Unpacking libopenbabel6 (3.0.0+dfsg-3ubuntu3) ...
Selecting previously unselected package openbabel.
Preparing to unpack .../openbabel_3.0.0+dfsg-3ubuntu3_amd64.deb ...
Unpacking openbabel (3.0.0+dfsg-3ubuntu3) ...
Setting up libboost-iostreams1.71.0:amd64 (1.71.0-6ubuntu6) ...
Setting up libschroedinger-maeparser1:amd64 (1.2.2-1build1) ...
Setting up libopenbabel6 (3.0.0+dfsg-3ubuntu3) ...
Setting up openbabel (3.0.0+dfsg-3ubuntu3) ...
Processing triggers for libc-bin (2.31-0ubuntu9.9) ...
/sbin/ldconfig.real: File /lib/x86_64-linux-gnu/libnvidia-ml.so.470.82.01 is empty, not checked.
/sbin/ldconfig.real: File /lib/x86_64-linux-gnu/libnvidia-ptxjitcompiler.so.470.82.01 is empty, not checked.
/sbin/ldconfig.real: File /lib/x86_64-linux-gnu/libcuda.so.470.82.01 is empty, not checked.
/sbin/ldconfig.real: File /lib/x86_64-linux-gnu/libnvidia-cfg.so.470.82.01 is empty, not checked.
/sbin/ldconfig.real: File /lib/x86_64-linux-gnu/libnvidia-allocator.so.470.82.01 is empty, not checked.
/sbin/ldconfig.real: File /lib/x86_64-linux-gnu/libnvidia-compiler.so.470.82.01 is empty, not checked.
/sbin/ldconfig.real: File /lib/x86_64-linux-gnu/libnvidia-opencl.so.470.82.01 is empty, not checked.
Processing triggers for man-db (2.9.1-1) ...

Molecular docking with GNINA

Basic usage of GNINA

Run gnina --help on the command line to print its usage information.

[4]

!cp ./gnina /usr/bin/

!gnina --help

Input:
  -r [ --receptor ] arg            rigid part of the receptor
  --flex arg                       flexible side chains, if any (PDBQT)
  -l [ --ligand ] arg              ligand(s)
  --flexres arg                    flexible side chains specified by comma 
                                   separated list of chain:resid
  --flexdist_ligand arg            Ligand to use for flexdist
  --flexdist arg                   set all side chains within specified 
                                   distance to flexdist_ligand to flexible
  --flex_limit arg                 Hard limit for the number of flexible 
                                   residues
  --flex_max arg                   Retain at at most the closest flex_max 
                                   flexible residues

Search space (required):
  --center_x arg                   X coordinate of the center
  --center_y arg                   Y coordinate of the center
  --center_z arg                   Z coordinate of the center
  --size_x arg                     size in the X dimension (Angstroms)
  --size_y arg                     size in the Y dimension (Angstroms)
  --size_z arg                     size in the Z dimension (Angstroms)
  --autobox_ligand arg             Ligand to use for autobox
  --autobox_add arg                Amount of buffer space to add to 
                                   auto-generated box (default +4 on all six 
                                   sides)
  --autobox_extend arg (=1)        Expand the autobox if needed to ensure the 
                                   input conformation of the ligand being 
                                   docked can freely rotate within the box.
  --no_lig                         no ligand; for sampling/minimizing flexible 
                                   residues

Scoring and minimization options:
  --scoring arg                    specify alternative built-in scoring 
                                   function: ad4_scoring default dkoes_fast 
                                   dkoes_scoring dkoes_scoring_old vina vinardo
  --custom_scoring arg             custom scoring function file
  --custom_atoms arg               custom atom type parameters file
  --score_only                     score provided ligand pose
  --local_only                     local search only using autobox (you 
                                   probably want to use --minimize)
  --minimize                       energy minimization
  --randomize_only                 generate random poses, attempting to avoid 
                                   clashes
  --num_mc_steps arg               fixed number of monte carlo steps to take in
                                   each chain
  --max_mc_steps arg               cap on number of monte carlo steps to take 
                                   in each chain
  --num_mc_saved arg               number of top poses saved in each monte 
                                   carlo chain
  --minimize_iters arg (=0)        number iterations of steepest descent; 
                                   default scales with rotors and usually isn't
                                   sufficient for convergence
  --accurate_line                  use accurate line search
  --simple_ascent                  use simple gradient ascent
  --minimize_early_term            Stop minimization before convergence 
                                   conditions are fully met.
  --minimize_single_full           During docking perform a single full 
                                   minimization instead of a truncated 
                                   pre-evaluate followed by a full.
  --approximation arg              approximation (linear, spline, or exact) to 
                                   use
  --factor arg                     approximation factor: higher results in a 
                                   finer-grained approximation
  --force_cap arg                  max allowed force; lower values more gently 
                                   minimize clashing structures
  --user_grid arg                  Autodock map file for user grid data based 
                                   calculations
  --user_grid_lambda arg (=-1)     Scales user_grid and functional scoring
  --print_terms                    Print all available terms with default 
                                   parameterizations
  --print_atom_types               Print all available atom types

Convolutional neural net (CNN) scoring:
  --cnn_scoring arg (=1)           Amount of CNN scoring: none, rescore 
                                   (default), refinement, all
  --cnn arg                        built-in model to use, specify 
                                   PREFIX_ensemble to evaluate an ensemble of 
                                   models starting with PREFIX: 
                                   crossdock_default2018 crossdock_default2018_
                                   1 crossdock_default2018_2 
                                   crossdock_default2018_3 
                                   crossdock_default2018_4 default2017 dense 
                                   dense_1 dense_2 dense_3 dense_4 
                                   general_default2018 general_default2018_1 
                                   general_default2018_2 general_default2018_3 
                                   general_default2018_4 redock_default2018 
                                   redock_default2018_1 redock_default2018_2 
                                   redock_default2018_3 redock_default2018_4
  --cnn_model arg                  caffe cnn model file; if not specified a 
                                   default model will be used
  --cnn_weights arg                caffe cnn weights file (*.caffemodel); if 
                                   not specified default weights (trained on 
                                   the default model) will be used
  --cnn_resolution arg (=0.5)      resolution of grids, don't change unless you
                                   really know what you are doing
  --cnn_rotation arg (=0)          evaluate multiple rotations of pose (max 24)
  --cnn_update_min_frame           During minimization, recenter coordinate 
                                   frame as ligand moves
  --cnn_freeze_receptor            Don't move the receptor with respect to a 
                                   fixed coordinate system
  --cnn_mix_emp_force              Merge CNN and empirical minus forces
  --cnn_mix_emp_energy             Merge CNN and empirical energy
  --cnn_empirical_weight arg (=1)  Weight for scaling and merging empirical 
                                   force and energy 
  --cnn_outputdx                   Dump .dx files of atom grid gradient.
  --cnn_outputxyz                  Dump .xyz files of atom gradient.
  --cnn_xyzprefix arg (=gradient)  Prefix for atom gradient .xyz files
  --cnn_center_x arg               X coordinate of the CNN center
  --cnn_center_y arg               Y coordinate of the CNN center
  --cnn_center_z arg               Z coordinate of the CNN center
  --cnn_verbose                    Enable verbose output for CNN debugging

Output:
  -o [ --out ] arg                 output file name, format taken from file 
                                   extension
  --out_flex arg                   output file for flexible receptor residues
  --log arg                        optionally, write log file
  --atom_terms arg                 optionally write per-atom interaction term 
                                   values
  --atom_term_data                 embedded per-atom interaction terms in 
                                   output sd data
  --pose_sort_order arg (=0)       How to sort docking results: CNNscore 
                                   (default), CNNaffinity, Energy

Misc (optional):
  --cpu arg                        the number of CPUs to use (the default is to
                                   try to detect the number of CPUs or, failing
                                   that, use 1)
  --seed arg                       explicit random seed
  --exhaustiveness arg (=8)        exhaustiveness of the global search (roughly
                                   proportional to time)
  --num_modes arg (=9)             maximum number of binding modes to generate
  --min_rmsd_filter arg (=1)       rmsd value used to filter final poses to 
                                   remove redundancy
  -q [ --quiet ]                   Suppress output messages
  --addH arg                       automatically add hydrogens in ligands (on 
                                   by default)
  --stripH arg                     remove hydrogens from molecule _after_ 
                                   performing atom typing for efficiency (off 
                                   by default)
  --device arg (=0)                GPU device to use
  --no_gpu                         Disable GPU acceleration, even if available.

Configuration file (optional):
  --config arg                     the above options can be put here

Information (optional):
  --help                           display usage summary
  --help_hidden                    display usage summary with hidden options
  --version                        display program version
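
As a rough illustration of how the options listed in the help text combine, the sketch below assembles a gnina argument list in Python. The helper name `build_gnina_command` and the file names `rec.pdb`, `lig.pdb`, and `docked.sdf.gz` are placeholders chosen for this example; only flags that appear in the help output are used, and the command is printed, not executed.

```python
import shlex

def build_gnina_command(receptor, ligand, out, autobox_ligand=None,
                        seed=None, exhaustiveness=None, num_modes=None,
                        executable="gnina"):
    """Assemble a gnina argument list from options shown in the help text."""
    cmd = [executable, "-r", receptor, "-l", ligand, "-o", out]
    if autobox_ligand is not None:
        cmd += ["--autobox_ligand", autobox_ligand]
    # Optional numeric settings are appended only when given, so gnina's
    # own defaults (exhaustiveness 8, num_modes 9) apply otherwise.
    for flag, value in (("--seed", seed),
                        ("--exhaustiveness", exhaustiveness),
                        ("--num_modes", num_modes)):
        if value is not None:
            cmd += [flag, str(value)]
    return cmd

cmd = build_gnina_command("rec.pdb", "lig.pdb", "docked.sdf.gz",
                          autobox_ligand="lig.pdb", seed=0)
print(shlex.join(cmd))
# To actually run the command: subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting problems when file names contain spaces.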

Preparing the receptor and ligand

Download an example from the PDB

[5]

!wget https://files.rcsb.org/download/4N14.pdb

--2023-09-24 22:17:45--  https://files.rcsb.org/download/4N14.pdb
Resolving ga.dp.tech (ga.dp.tech)... 10.255.254.37, 10.255.254.7, 10.255.254.18
Connecting to ga.dp.tech (ga.dp.tech)|10.255.254.37|:8118... connected.
Proxy request sent, awaiting response... 200 OK
Length: unspecified [application/octet-stream]
Saving to: ‘4N14.pdb.1’

4N14.pdb.1              [     <=>            ] 613.12K  8.39KB/s    in 62s     

2023-09-24 22:18:48 (9.93 KB/s) - ‘4N14.pdb.1’ saved [627831]

Split the complex structure into receptor and ligand files

[6]

!grep ATOM 4N14.pdb > rec.pdb

[8]

!grep WR7 4N14.pdb > lig.pdb
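
A plain substring match is broader than it looks: the pattern ATOM also matches HETATM records, since "HETATM" contains "ATOM", and both patterns match any other line that happens to mention the text. A stricter split can use the fixed columns of the PDB format (record name in columns 1-6, residue name in columns 18-20). The sketch below is one possible way to do this; the function name `split_complex` and the demo lines are made up for illustration and are not taken from 4N14.

```python
def split_complex(pdb_text, ligand_resname):
    """Split PDB text into receptor ATOM lines and ligand HETATM lines."""
    receptor, ligand = [], []
    for line in pdb_text.splitlines():
        record = line[0:6].strip()     # columns 1-6: record name
        resname = line[17:20].strip()  # columns 18-20: residue name
        if record == "ATOM":
            receptor.append(line)
        elif record == "HETATM" and resname == ligand_resname:
            ligand.append(line)
    return receptor, ligand

# Tiny made-up input: one remark, one protein atom, one ligand atom, one water
demo = "\n".join([
    "REMARK   1 THIS LINE MENTIONS ATOM AND WR7 BUT HOLDS NO COORDINATES",
    "ATOM      1  N   MET A   1      11.104   6.134  -6.504  1.00 20.00           N",
    "HETATM    2  C1  WR7 A 301      12.000   7.000  -5.000  1.00 20.00           C",
    "HETATM    3  O   HOH A 401      15.000   9.000  -2.000  1.00 20.00           O",
])
rec, lig = split_complex(demo, "WR7")
print(len(rec), "receptor line(s),", len(lig), "ligand line(s)")
```

If a structure contains several copies of the ligand, an additional filter on the chain identifier (column 22) may be needed to keep only one of them.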

Visualize the complex with py3Dmol

[9]

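import py3Dmol
# Create a viewer 400 pixels high
v = py3Dmol.view(height=400)
# Model 0: the receptor, drawn as cartoon plus thin sticks
v.addModel(open('rec.pdb').read())
v.setStyle({'cartoon':{},'stick':{'radius':0.15}})
# Model 1: the crystallographic ligand, drawn as sticks with green carbons
v.addModel(open('lig.pdb').read())
v.setStyle({'model':1},{'stick':{'colorscheme':'greenCarbon'}})
# Center the view on the ligand
v.zoomTo({'model':1})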

<py3Dmol.view at 0x7f531f475b80>

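Run docking

--autobox_ligand defines the search box from the given ligand, i.e. the location of the binding pocket
--seed sets the random seed
-o sets the output file in which the docked poses are saved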
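
To make the idea behind --autobox_ligand concrete, the sketch below computes an axis-aligned box around a set of ligand coordinates and pads it by 4 Å on every side, the default buffer stated in the help text for --autobox_add. This is only an approximation written for illustration: the function name `autobox` and the sample coordinates are hypothetical, the effect of --autobox_extend is ignored, and GNINA's actual implementation may differ.

```python
def autobox(coords, padding=4.0):
    """Return (center, size) of a padded axis-aligned box enclosing coords."""
    xs, ys, zs = zip(*coords)
    lo = (min(xs), min(ys), min(zs))
    hi = (max(xs), max(ys), max(zs))
    center = tuple((a + b) / 2 for a, b in zip(lo, hi))
    # Padding is added on both sides of each axis, hence the factor 2
    size = tuple((b - a) + 2 * padding for a, b in zip(lo, hi))
    return center, size

# Made-up ligand coordinates in Angstroms
coords = [(0.0, 0.0, 0.0), (2.0, 4.0, 6.0), (1.0, 1.0, 1.0)]
center, size = autobox(coords)
print(center)  # (1.0, 2.0, 3.0)
print(size)    # (10.0, 12.0, 14.0)
```

The resulting values correspond to what one would otherwise pass by hand through --center_x/y/z and --size_x/y/z.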

[10]

!./gnina -r rec.pdb -l lig.pdb --autobox_ligand lig.pdb --seed 0 -o docked.sdf.gz

              _             
             (_)            
   __ _ _ __  _ _ __   __ _ 
  / _` | '_ \| | '_ \ / _` |
 | (_| | | | | | | | | (_| |
  \__, |_| |_|_|_| |_|\__,_|
   __/ |                    
  |___/                     

gnina  master:e9cb230+   Built Feb 11 2023.
gnina is based on smina and AutoDock Vina.
Please cite appropriately.

Commandline: ./gnina -r rec.pdb -l lig.pdb --autobox_ligand lig.pdb --seed 0 -o docked.sdf.gz
Using random seed: 0

0%   10   20   30   40   50   60   70   80   90   100%
|----|----|----|----|----|----|----|----|----|----|
***************************************************

mode |  affinity  |    CNN     |   CNN
     | (kcal/mol) | pose score | affinity
-----+------------+------------+----------
    1       -5.87       0.9409      6.264
    2       -6.25       0.9030      6.212
    3       -6.31       0.8942      5.749
    4       -6.08       0.8847      5.973
    5       -5.61       0.8806      6.142
    6       -5.96       0.8585      6.096
    7       -6.21       0.8458      6.261
    8       -5.77       0.8334      5.611
    9       -5.24       0.8027      5.830

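GNINA reports two CNN-based scores for each pose, CNNscore and CNNaffinity, and by default sorts the poses by CNNscore.

CNNscore: the probability that the pose is a "good pose" (RMSD < 2 Å)

CNNaffinity: the predicted binding affinity of the pose on a logarithmic scale, where 1 $\mu M$ corresponds to 6 and 1 $nM$ to 9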
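
The two reference points (1 $\mu M$ is 6, 1 $nM$ is 9) are consistent with a scale of the form $-\log_{10}$ of the molar concentration. Under that assumption, a CNNaffinity value can be converted back to a concentration as sketched below; the helper names are hypothetical, and 6.264 is the CNN affinity of the top-ranked pose in the table above.

```python
import math

def molar_to_pk(concentration_molar):
    """Convert a molar concentration to the -log10 scale."""
    return -math.log10(concentration_molar)

def pk_to_molar(pk):
    """Convert a value on the -log10 scale back to a molar concentration."""
    return 10.0 ** (-pk)

print(molar_to_pk(1e-6))  # 1 uM, approximately 6
print(molar_to_pk(1e-9))  # 1 nM, approximately 9
print(f"CNNaffinity 6.264 is about {pk_to_molar(6.264) * 1e6:.2f} uM")
```

Because the scale is logarithmic, a difference of one unit corresponds to a tenfold difference in predicted affinity.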

Visualize the docking results

[14]

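import gzip
v = py3Dmol.view(height=400)
# Model 0: the receptor
v.addModel(open('rec.pdb').read())
v.setStyle({'cartoon':{},'stick':{'radius':.1}})
# Model 1: the crystallographic ligand, in gray, as the reference
v.addModel(open('lig.pdb').read())
v.setStyle({'model':1},{'stick':{'colorscheme':'dimgrayCarbon','radius':.125}})
# Model 2: the docked poses, loaded as animation frames of a single model
v.addModelsAsFrames(gzip.open('docked.sdf.gz','rt').read())
v.setStyle({'model':2},{'stick':{'colorscheme':'greenCarbon'}})
# Step through the poses once per second, centered on the reference ligand
v.animate({'interval':1000})
v.zoomTo({'model':1})
v.rotate(90)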

<py3Dmol.view at 0x7f531f421790>

Compute the RMSD between the predicted poses and the true binding pose

[11]

!obrms --firstonly lig.pdb docked.sdf.gz

RMSD lig.pdb: 2.31299
RMSD lig.pdb: 2.8954
RMSD lig.pdb: 3.78748
RMSD lig.pdb: 5.74708
RMSD lig.pdb: 1.86761
RMSD lig.pdb: 1.00173
RMSD lig.pdb: 2.98492
RMSD lig.pdb: 4.01599
RMSD lig.pdb: 2.06565
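
For reference, the quantity being reported is $\mathrm{RMSD} = \sqrt{\frac{1}{N}\sum_{i=1}^{N} \lVert a_i - b_i \rVert^2}$ over $N$ matched atoms. The sketch below is a naive version that assumes both coordinate lists contain the same atoms in the same order and performs no superposition; it is meant to show the formula, not to reproduce obrms, which works on the molecular structures themselves. The coordinates are made up.

```python
import math

def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally ordered coordinate lists."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))

# Two atoms; the second one is displaced by 2 Angstroms along z
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
pose = [(0.0, 0.0, 0.0), (1.0, 0.0, 2.0)]
value = rmsd(reference, pose)
print(f"{value:.4f}")  # 1.4142
```

No alignment is applied because docked poses are already expressed in the receptor's coordinate frame.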

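Comparing the docked poses with the true binding pose shows that the top-1 pose ranked by CNNscore is not a good pose (RMSD 2.31 Å, above the 2 Å threshold), while the best predicted pose (RMSD = 1.00 Å) is ranked 6th.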
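
The same comparison can be made programmatically. The sketch below copies the CNN pose scores and RMSD values from the outputs above and assumes that obrms reports the poses in the order in which gnina wrote them (modes 1 to 9); the variable names are chosen for this example.

```python
# Values copied from the docking table and the obrms output, in mode order 1..9
cnn_score = [0.9409, 0.9030, 0.8942, 0.8847, 0.8806, 0.8585, 0.8458, 0.8334, 0.8027]
rmsd_values = [2.31299, 2.8954, 3.78748, 5.74708, 1.86761, 1.00173, 2.98492, 4.01599, 2.06565]

THRESHOLD = 2.0  # a pose counts as "good" when its RMSD is below 2 Angstroms

# Modes (1-based ranks by CNNscore) whose RMSD is below the threshold
good_modes = [i + 1 for i, r in enumerate(rmsd_values) if r < THRESHOLD]
# Rank of the pose closest to the true binding pose
best_mode = min(range(len(rmsd_values)), key=rmsd_values.__getitem__) + 1
top1_is_good = rmsd_values[0] < THRESHOLD

print("good modes:", good_modes)      # [5, 6]
print("best mode:", best_mode)        # 6
print("top-1 is good:", top1_is_good) # False
```

In this example two of the nine poses fall below the threshold, and neither is ranked first by CNNscore.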
