Drug-target binding affinity prediction with transformerCPI


AI4S

nickkk

Published on 2023-08-03

Recommended image: tcpi:notebook

Recommended machine type: c12_m92_1 * NVIDIA V100

Download the source code from GitHub.

Replace one amino acid in a protein sequence to simulate an amino-acid mutation, then observe the effect on the prediction with a heatmap.

To begin, click "Start connection" and select the GPU image tcpi:notebook.


Download the source code from GitHub


! git clone https://github.com/lifanchen-simm/transformerCPI2.0.git

Cloning into 'transformerCPI2.0'...
remote: Enumerating objects: 77, done.
remote: Counting objects: 100% (77/77), done.
remote: Compressing objects: 100% (76/76), done.
remote: Total 77 (delta 36), reused 2 (delta 0), pack-reused 0
Unpacking objects: 100% (77/77), 1.35 MiB | 680.00 KiB/s, done.


Rename the cloned directory to tcpi


! mv transformerCPI2.0/ tcpi


Copy the prepared checkpoint into the working directory


! cp /root/transformerCPI2.0/tcpi.pt .


Select the tcpi kernel from the upper-right corner of the notebook


import sys

# Make the modules in the cloned repository (predict, featurizer) importable
sys.path.append('./tcpi/')


import torch
from predict import pack
from featurizer import featurizer

# Load the pickled model from the checkpoint and move it to GPU 0
model = torch.load('tcpi.pt').to(0)


sequence = "MPHSSLHPSIPCPRGHGAQKAALVLLSACLVTLWGLGEPPEHTLRYLVLHLA" # Example protein sequence

smiles = "CS(=O)(C1=NN=C(S1)CN2C3CCC2C=C(C4=CC=CC=C4)C3)=O" # Example compound
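Before featurizing, it can help to check that the protein sequence uses only the 20 standard amino-acid letters, since the mutation scan later in this notebook assumes that alphabet. The helper below is a minimal sketch: `find_nonstandard_residues` is a hypothetical name, not part of transformerCPI2.0, and whether the featurizer accepts other letters (such as X or U) is not checked here.

```python
STANDARD_AMINO_ACIDS = set("ARNDCQEGHILKMFPSTWYV")

def find_nonstandard_residues(sequence):
    """Return (1-based position, letter) pairs for letters outside the standard 20."""
    return [
        (pos, aa)
        for pos, aa in enumerate(sequence.upper(), start=1)
        if aa not in STANDARD_AMINO_ACIDS
    ]

sequence = "MPHSSLHPSIPCPRGHGAQKAALVLLSACLVTLWGLGEPPEHTLRYLVLHLA"
problems = find_nonstandard_residues(sequence)
print(problems if problems else "sequence uses only standard residues")
```

An empty list means every residue can be substituted with the 20-letter alphabet used in the mutation scan.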


Featurize the data


compounds, adjacencies, proteins = featurizer(smiles, sequence)


Run the prediction


dataset = list(zip(compounds, adjacencies, proteins))

model.eval()
with torch.no_grad():
    for data in dataset:
        adjs, atoms, proteins = [], [], []
        atom, adj, protein = data
        adjs.append(adj)
        atoms.append(atom)
        proteins.append(protein)
        # Pack the inputs and transfer them to device 0
        data = pack(atoms, adjs, proteins, 0)
        predicted_scores = model(data)

print(predicted_scores)

[0.6602165]


def return_pred(seq, smiles, tester):
    compounds, adjacencies, proteins = featurizer(smiles, seq)
    test_set = list(zip(compounds, adjacencies, proteins))
    score = float(tester.test(test_set))
    return score


import numpy as np

device = 0

class Tester(object):
    def __init__(self, model, device):
        self.model = model
        self.device = device

    def test(self, dataset):
        self.model.eval()
        with torch.no_grad():
            for data in dataset:
                adjs, atoms, proteins = [], [], []
                atom, adj, protein = data
                adjs.append(adj)
                atoms.append(atom)
                proteins.append(protein)
                data = pack(atoms, adjs, proteins, self.device)
                predicted_scores = self.model(data)
        return predicted_scores

tester = Tester(model, device)


Replace one amino acid at a time in the protein sequence to simulate a point mutation, then visualize the effect on the predicted score with a heatmap


mutation = 'ARNDCQEGHILKMFPSTWYV'  # the 20 standard amino acids
original_score = float(predicted_scores[0])  # score of the unmodified sequence
n = len(sequence)
delta_score = np.zeros((n, 20))

for i in range(n):
    for k, m in enumerate(mutation):
        # Substitute residue i with amino acid m
        sequence_2 = list(sequence)
        sequence_2[i] = m
        sequence_2 = ''.join(sequence_2)
        score = return_pred(sequence_2, smiles, tester)
        # Absolute change in the predicted score caused by the substitution
        delta_score[i, k] = np.abs(original_score - score)
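The scan above also scores the wild-type residue at every position (for example M to M at position 1), which reproduces the original sequence and yields a delta of zero. A minimal sketch of enumerating only the true single-point mutants, with conventional labels such as M1A, is given here; `single_point_mutants` is a hypothetical helper and is not part of transformerCPI2.0.

```python
AMINO_ACIDS = "ARNDCQEGHILKMFPSTWYV"

def single_point_mutants(sequence, alphabet=AMINO_ACIDS):
    """Yield (label, mutated_sequence) for every single-residue substitution."""
    for i, wild_type in enumerate(sequence):
        for mutant in alphabet:
            if mutant == wild_type:
                continue  # this substitution would reproduce the original sequence
            label = f"{wild_type}{i + 1}{mutant}"  # e.g. M1A: M at position 1 replaced by A
            yield label, sequence[:i] + mutant + sequence[i + 1:]

demo = "MPH"  # short stand-in sequence, for illustration only
mutants = dict(single_point_mutants(demo))
print(len(mutants))    # 57, i.e. 3 positions x 19 substitutions
print(mutants["M1A"])  # APH
```

Each mutated sequence could then be passed to `return_pred(mutated_sequence, smiles, tester)` in the same way as in the loop above.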


import pandas as pd

# Rows: residues of the original sequence; columns: substituted amino acids
pd_data = pd.DataFrame(delta_score, index=list(sequence), columns=list(mutation))
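Because the same amino acid occurs many times in the sequence, using `list(sequence)` as the index gives repeated row labels, so rows in the heatmap cannot be told apart by label alone. One option, sketched below, is to append the 1-based position to each residue letter; `position_labels` is a hypothetical helper name.

```python
def position_labels(sequence):
    """Build unique row labels such as 'M1', 'P2', ... (1-based positions)."""
    return [f"{aa}{pos}" for pos, aa in enumerate(sequence, start=1)]

sequence = "MPHSSLHPSIPCPRGHGAQKAALVLLSACLVTLWGLGEPPEHTLRYLVLHLA"
labels = position_labels(sequence)
print(labels[:5])  # ['M1', 'P2', 'H3', 'S4', 'S5']
```

These labels could be passed as `index=labels` when constructing the DataFrame.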


import seaborn

seaborn.heatmap(pd_data)

<Axes: >
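The heatmap gives a visual overview. To list the positions where substitutions change the score most, the rows of the delta matrix can be summarized and sorted. The sketch below works on a plain nested list (for example `delta_score.tolist()`) and uses made-up numbers purely for illustration; `rank_positions` is a hypothetical helper, and taking the row maximum as the summary is an assumption (a row mean would be equally reasonable).

```python
def rank_positions(delta_rows, sequence, top_k=3):
    """Return the top_k (label, max_delta) pairs, largest score change first."""
    summary = [
        (f"{aa}{pos}", max(row))
        for pos, (aa, row) in enumerate(zip(sequence, delta_rows), start=1)
    ]
    return sorted(summary, key=lambda item: item[1], reverse=True)[:top_k]

# Illustrative values only, not model output: 3 positions x 4 substitutions.
toy_sequence = "MPH"
toy_delta = [
    [0.00, 0.02, 0.01, 0.03],
    [0.10, 0.00, 0.25, 0.05],
    [0.04, 0.08, 0.00, 0.06],
]
ranking = rank_positions(toy_delta, toy_sequence, top_k=2)
print(ranking)  # [('P2', 0.25), ('H3', 0.08)]
```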
