Uni-Mol Molecular Vector Representation
zhengh@dp.tech
Recommended image: unimol-qsar:v0.2
Recommended machine type: c12_m92_1 * NVIDIA V100
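Molecular Vector Representation with Uni-Mol
©️ Copyright 2023 @ Authors
Author: Gao Zhifeng (高志锋)
Date: 2023-06-06
License: This work is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Quick start: Click the "Start Connection" (开始连接) button at the top of the page, select the unimol-qsar:v0.2 image and any GPU node configuration, and wait a moment for the notebook to become runnable.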
[1]
# Import unimol
from unimol import UniMolRepr
import numpy as np
import pandas as pd
[2]
# single smiles unimol representation
clf = UniMolRepr(data_type='molecule')
smiles = 'c1ccc(cc1)C2=NCC(=O)Nc3c2cc(cc3)[N+](=O)[O]'
smiles_list = [smiles]
unimol_repr = np.array(clf.get_repr(smiles_list)["cls_repr"])
print(unimol_repr.shape)
2023-06-12 16:10:28 | unimol/models/unimol.py | 107 | INFO | Uni-Mol(QSAR) | Loading pretrained weights from /opt/conda/lib/python3.8/site-packages/unimol-0.0.2-py3.8.egg/unimol/weights/mol_pre_all_h_220816.pt
2023-06-12 16:10:30 | unimol/data/conformer.py | 56 | INFO | Uni-Mol(QSAR) | Start generating conformers...
1it [00:00, 16.50it/s]
2023-06-12 16:10:30 | unimol/data/conformer.py | 60 | INFO | Uni-Mol(QSAR) | Failed to generate conformers for 0.00% of molecules.
2023-06-12 16:10:30 | unimol/data/conformer.py | 62 | INFO | Uni-Mol(QSAR) | Failed to generate 3d conformers for 0.00% of molecules.
(1, 512)
[3]
%%bash
# Download the sample data (CNS drug data)
rm -rf mol_train.csv
wget -nv https://bohrium-example.oss-cn-zhangjiakou.aliyuncs.com/unimol-qsar/mol_train.csv
2023-06-12 16:10:30 URL:https://bohrium-example.oss-cn-zhangjiakou.aliyuncs.com/unimol-qsar/mol_train.csv [30600/30600] -> "mol_train.csv" [1]
[4]
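# Read the CSV once, then extract the SMILES strings and the target labels
df = pd.read_csv('mol_train.csv')
smiles_list = df['SMILES'].to_list()
y = df['TARGET'].to_list()
# Compute the Uni-Mol representation for every molecule
unimol_repr_list = np.array(clf.get_repr(smiles_list)["cls_repr"])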
2023-06-12 16:10:31 | unimol/data/conformer.py | 56 | INFO | Uni-Mol(QSAR) | Start generating conformers...
700it [00:10, 66.34it/s]
2023-06-12 16:10:41 | unimol/data/conformer.py | 60 | INFO | Uni-Mol(QSAR) | Failed to generate conformers for 0.00% of molecules.
2023-06-12 16:10:41 | unimol/data/conformer.py | 62 | INFO | Uni-Mol(QSAR) | Failed to generate 3d conformers for 0.00% of molecules.
[5]
print(unimol_repr_list.shape)
(700, 512)
[6]
# numpy is already imported above; the plot is 2D, so Axes3D is not needed
import matplotlib.pyplot as plt
from sklearn.decomposition import PCA
[7]
pca = PCA(n_components=2)
X_reduced = pca.fit_transform(unimol_repr_list)
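To make the projection step less of a black box, here is a minimal sketch of what a two-component PCA computes, written with plain NumPy SVD. It is not the scikit-learn implementation, and it runs on random stand-in data of the same shape as the representation matrix (700, 512) rather than on real Uni-Mol vectors. The helper name `pca_svd` and the `*_demo` variables are hypothetical and exist only for illustration.

```python
import numpy as np

def pca_svd(X, n_components=2):
    """Project X onto its top principal components using SVD (illustrative helper)."""
    # Center each feature; PCA is defined on mean-centered data
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by decreasing singular value
    U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    X_proj = X_centered @ Vt[:n_components].T
    # Variance captured by each component, as a fraction of the total variance
    explained_variance = S ** 2 / (X.shape[0] - 1)
    ratio = explained_variance[:n_components] / explained_variance.sum()
    return X_proj, ratio

# Random stand-in for the (700, 512) representation matrix (assumption: not real data)
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(700, 512))
X_demo_reduced, ratio_demo = pca_svd(X_demo)
print(X_demo_reduced.shape)
print(ratio_demo)
```

With the fitted scikit-learn object, the same fraction is available as `pca.explained_variance_ratio_`. It is worth checking before reading too much into a 2D scatter plot, because two components may capture only a small part of the variance of a 512-dimensional representation. Component signs may differ between this sketch and scikit-learn, since the sign of a principal direction is arbitrary.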
[8]
# Visualization: scatter plot of the first two principal components, colored by target label
colors = ['r', 'g', 'b']
markers = ['s', 'o', 'D']
labels = ['Target:0', 'Target:1']
# y must be a NumPy array so that `y == label` yields an element-wise boolean mask
y = np.asarray(y)
plt.figure(figsize=(8, 6))
for label, color, marker in zip(np.unique(y), colors, markers):
    plt.scatter(X_reduced[y == label, 0],
                X_reduced[y == label, 1],
                c=color,
                marker=marker,
                label=labels[label],
                edgecolors='black')
plt.xlabel('Principal Component 1')
plt.ylabel('Principal Component 2')
plt.legend(loc='best')
plt.title('Unimol Repr')
plt.show()
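Selecting rows with an expression such as `X_reduced[y == label, 0]` relies on `y` being a NumPy array. A plain Python list compared to an integer gives a single `False` rather than an element-wise mask, which would silently select no points. The toy sketch below illustrates the difference; `y_list` and `points` are made-up stand-ins, not the notebook's data.

```python
import numpy as np

# Made-up labels and 2D points, for illustration only
y_list = [0, 1, 1, 0]
points = np.array([[0.0, 0.1], [1.0, 1.1], [2.0, 2.1], [3.0, 3.1]])

# A list compared to an int is compared as a whole object: one boolean, not a mask
print(y_list == 1)  # False

# Converting to an array makes the comparison element-wise
y_arr = np.asarray(y_list)
mask = (y_arr == 1)
print(mask)

# The mask selects the rows with label 1; the column index picks the first coordinate
print(points[mask, 0])
```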