Uni-Mol Property Prediction - Regression Task - Dielectric Constant of Electrolyte Molecules
Tags: Uni-Mol, Deep Learning
Letian
zengboshen@dp.tech
Updated 2024-10-30
Recommended image: unie:0812
Recommended machine: c3_m4_1 * NVIDIA T4
Uni-Mol Property Prediction in Practice - Regression Task - Dielectric Constant of Electrolyte Molecules
AIMS:
Case background
Step 0: Install Uni-Mol Tools
Step 1: Load the data
Step 2: Import Uni-Mol
Step 3: Feed in the data and train
Step 4: Fine-tune hyperparameters (omitted)
Step 5: Load molecular conformations to predict dielectric constants

Uni-Mol Property Prediction in Practice - Regression Task - Dielectric Constant of Electrolyte Molecules

©️ Copyright 2023 @ Authors
Authors: Guo Wentao, Wang Hongshuai
Date: 2023-06-06
License: This work is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Quick start: click the Start Connection button above, select the unimol-qsar:v0.4 image and any GPU node configuration, and after a moment you can run the notebook.


AIMS:

  • Hands-on application of Uni-Mol to a concrete scenario
  • Understand the working modules of Uni-Mol
  • Train a model that takes molecular coordinates as input to predict physicochemical properties

Case background

  • The dielectric constant (also called the relative permittivity) is a physical quantity that describes a material's ability to polarize in an electric field. It is a dimensionless number that tells us how strongly a medium responds to an electric field.
    The dielectric constant is also known as the relative static permittivity, denoted εr. The absolute permittivity (in SI units, farads per meter) equals the relative static permittivity εr multiplied by the vacuum permittivity ε0 (approximately 8.854 × 10⁻¹² F/m).

  • The dielectric constant of an electrolyte molecule measures how strongly molecules in an electrolyte solution respond to an electric field. It strongly affects the properties of electrolyte solutions and electrochemical processes, influencing ion solubility, ion mobility, the conductivity of the electrolyte solution, the activation energy of electrolysis reactions, the stability of coordination reactions, and more. Different applications call for electrolytes with an appropriate dielectric constant to meet specific performance requirements.

  • In this case study, we will use Uni-Mol to predict molecular dielectric constants in order to:

    1. Learn a training workflow that takes molecular coordinates, rather than SMILES strings, as input
    2. Use a regression model to predict continuous values
    3. Use the trained model to predict the dielectric constants of new molecules
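The relation between relative and absolute permittivity described above can be checked in a couple of lines (a minimal illustration; the value for water is a standard reference figure used only as an example):

```python
# Absolute permittivity from relative permittivity: epsilon = epsilon_r * epsilon_0
EPSILON_0 = 8.854e-12  # vacuum permittivity epsilon_0, in F/m

def absolute_permittivity(epsilon_r: float) -> float:
    """Absolute permittivity (F/m) of a medium with relative permittivity epsilon_r."""
    return epsilon_r * EPSILON_0

# Example: water at room temperature has epsilon_r of roughly 80,
# giving an absolute permittivity of about 7.1e-10 F/m.
print(absolute_permittivity(80.0))
```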

Step 0: Install Uni-Mol Tools

First, install unimol_tools with pip install.

[3]
!pip install unimol_tools
Looking in indexes: https://pypi.tuna.tsinghua.edu.cn/simple
Collecting unimol_tools
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/49/02/01b92f2a35425ccfd7675bf3ab6f0a45e6b0e9ff3e95c420ae062801af66/unimol_tools-0.1.0.post4-py3-none-any.whl (51 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 51.1/51.1 kB 3.2 MB/s eta 0:00:00
Requirement already satisfied: tqdm in /opt/conda/lib/python3.8/site-packages (from unimol_tools) (4.64.1)
Requirement already satisfied: numpy<2.0.0,>=1.22.4 in /opt/conda/lib/python3.8/site-packages (from unimol_tools) (1.22.4)
Requirement already satisfied: joblib in /opt/conda/lib/python3.8/site-packages (from unimol_tools) (1.2.0)
Requirement already satisfied: torch in /opt/conda/lib/python3.8/site-packages (from unimol_tools) (1.13.1+cu116)
Requirement already satisfied: pandas<2.0.0 in /opt/conda/lib/python3.8/site-packages (from unimol_tools) (1.5.3)
Collecting addict
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/6a/00/b08f23b7d7e1e14ce01419a467b583edbb93c6cdb8654e54a9cc579cd61f/addict-2.4.0-py3-none-any.whl (3.8 kB)
Requirement already satisfied: scikit-learn in /opt/conda/lib/python3.8/site-packages (from unimol_tools) (1.0.2)
Requirement already satisfied: pyyaml in /opt/conda/lib/python3.8/site-packages (from unimol_tools) (6.0)
Collecting rdkit
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/3d/84/63b2e66f5c7cb97ce994769afbbef85a1ac364fedbcb7d4a3c0f15d318a5/rdkit-2024.3.5-cp38-cp38-manylinux_2_28_x86_64.whl (33.1 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 33.1/33.1 MB 10.6 MB/s eta 0:00:0000:0100:01
Requirement already satisfied: python-dateutil>=2.8.1 in /opt/conda/lib/python3.8/site-packages (from pandas<2.0.0->unimol_tools) (2.8.2)
Requirement already satisfied: pytz>=2020.1 in /opt/conda/lib/python3.8/site-packages (from pandas<2.0.0->unimol_tools) (2022.7)
Requirement already satisfied: Pillow in /opt/conda/lib/python3.8/site-packages (from rdkit->unimol_tools) (9.4.0)
Requirement already satisfied: threadpoolctl>=2.0.0 in /opt/conda/lib/python3.8/site-packages (from scikit-learn->unimol_tools) (3.1.0)
Requirement already satisfied: scipy>=1.1.0 in /opt/conda/lib/python3.8/site-packages (from scikit-learn->unimol_tools) (1.7.3)
Requirement already satisfied: typing-extensions in /opt/conda/lib/python3.8/site-packages (from torch->unimol_tools) (4.5.0)
Requirement already satisfied: six>=1.5 in /opt/conda/lib/python3.8/site-packages (from python-dateutil>=2.8.1->pandas<2.0.0->unimol_tools) (1.16.0)
Installing collected packages: addict, rdkit, unimol_tools
Successfully installed addict-2.4.0 rdkit-2024.3.5 unimol_tools-0.1.0.post4
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
[4]
!pip install --upgrade numpy
Looking in indexes: https://pypi.tuna.tsinghua.edu.cn/simple
Requirement already satisfied: numpy in /opt/conda/lib/python3.8/site-packages (1.22.4)
Collecting numpy
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/98/5d/5738903efe0ecb73e51eb44feafba32bdba2081263d40c5043568ff60faf/numpy-1.24.4-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (17.3 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 17.3/17.3 MB 5.3 MB/s eta 0:00:0000:0100:01
Installing collected packages: numpy
  Attempting uninstall: numpy
    Found existing installation: numpy 1.22.4
    Uninstalling numpy-1.22.4:
      Successfully uninstalled numpy-1.22.4
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
scipy 1.7.3 requires numpy<1.23.0,>=1.16.5, but you have numpy 1.24.4 which is incompatible.
numba 0.56.4 requires numpy<1.24,>=1.18, but you have numpy 1.24.4 which is incompatible.
moviepy 0.2.3.5 requires decorator<5.0,>=4.0.2, but you have decorator 5.1.1 which is incompatible.
f90wrap 0.2.12 requires numpy<1.24,>=1.13, but you have numpy 1.24.4 which is incompatible.
cvxpy 1.2.3 requires setuptools<=64.0.2, but you have setuptools 65.6.3 which is incompatible.
Successfully installed numpy-1.24.4
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv

Before running the code below, please click Restart Kernel in the upper-right corner so that the numpy upgrade takes effect.


Step 1: Load the data


At this point some readers may ask: what is a pkl file? In the earlier BBBP scenario our data file was a standard CSV file, which can easily be viewed in Excel. Pickle (commonly using the .pkl file extension) and CSV are two widely used data storage formats, each with its own strengths and use cases.

If you are not familiar with the pkl format, let's look together at why we use pickle to package data in the molecular-coordinates scenario!

  • Pickle is a Python-specific binary serialization format that can conveniently store almost any Python object, including custom classes, functions, and modules. This means you can save a complex data structure (such as a list, dict, set, or numpy array) directly to a pickle file and load it back later without any extra processing. Pickle is therefore well suited to storing complex objects such as machine learning models. In our case, N atom types and an N x 3 coordinate array mapping to a single prediction value is a typical "many-to-one" data structure, and pickle packages such data neatly.

  • By contrast, CSV (comma-separated values) is a simple text format mainly used for tabular data. Each line of a CSV file corresponds to one table row, with fields separated by commas. Because CSV is plain text, it is highly compatible and human-readable, and can be parsed by almost every data-processing tool and programming language. However, CSV can only store two-dimensional tables and cannot directly represent more complex structures. For tasks where one SMILES string maps to one prediction value, CSV is easier to edit and inspect while still capturing this simple "one-to-one" structure.

  • In short, pickle is the better choice when you need to store complex data structures, while CSV may be the better choice for two-dimensional tables that need manual inspection or sharing with other software and languages.
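To make the contrast concrete, here is a minimal pickle round-trip with a toy record shaped like our dataset (the file name and all numbers are made up for illustration):

```python
import os
import pickle
import tempfile

# One "many-to-one" record: a list of atom types and an N x 3 coordinate
# list map to a single target value -- awkward to flatten into CSV rows,
# but trivial to round-trip through pickle.
sample = {
    'target': [2.3152],
    'atoms': [['B', 'O', 'C']],
    'coordinates': [[[0.06, 0.11, 0.46],
                     [-0.85, 0.20, -0.11],
                     [0.30, -0.95, 0.02]]],
}

path = os.path.join(tempfile.mkdtemp(), 'demo.pkl')
with open(path, 'wb') as f:   # 'wb' = write binary
    pickle.dump(sample, f)    # serialize the whole nested structure at once
with open(path, 'rb') as f:   # 'rb' = read binary
    loaded = pickle.load(f)   # ...and get the identical dict back

print(loaded['atoms'][0])     # ['B', 'O', 'C']
```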

[1]
!wget -P ./data/ https://dp-public.oss-cn-beijing.aliyuncs.com/community/courses/eps_data_test.pkl
!wget -P ./data/ https://dp-public.oss-cn-beijing.aliyuncs.com/community/courses/eps_data_train.pkl
--2024-10-30 18:50:30--  https://dp-public.oss-cn-beijing.aliyuncs.com/community/courses/eps_data_test.pkl
Resolving ga.dp.tech (ga.dp.tech)... 10.255.254.18, 10.255.254.37, 10.255.254.7
Connecting to ga.dp.tech (ga.dp.tech)|10.255.254.18|:8118... connected.
Proxy request sent, awaiting response... 200 OK
Length: 36644 (36K) [application/octet-stream]
Saving to: ‘./data/eps_data_test.pkl’

eps_data_test.pkl   100%[===================>]  35.79K  --.-KB/s    in 0.002s  

2024-10-30 18:50:32 (18.4 MB/s) - ‘./data/eps_data_test.pkl’ saved [36644/36644]

--2024-10-30 18:50:32--  https://dp-public.oss-cn-beijing.aliyuncs.com/community/courses/eps_data_train.pkl
Resolving ga.dp.tech (ga.dp.tech)... 10.255.254.18, 10.255.254.37, 10.255.254.7
Connecting to ga.dp.tech (ga.dp.tech)|10.255.254.18|:8118... connected.
Proxy request sent, awaiting response... 200 OK
Length: 173342 (169K) [application/octet-stream]
Saving to: ‘./data/eps_data_train.pkl’

eps_data_train.pkl  100%[===================>] 169.28K  --.-KB/s    in 0.04s   

2024-10-30 18:50:34 (3.71 MB/s) - ‘./data/eps_data_train.pkl’ saved [173342/173342]

[3]
import pickle  # pickle is used to deserialize ("un-dump") the file

with open('./data/eps_data_train.pkl', 'rb') as f:  # 'rb' = read in binary mode
    eps_train = pickle.load(f)  # the pkl file loads as a Python dict
print(eps_train.keys())  # print the keys of the training data
dict_keys(['target', 'atoms', 'coord'])

Step 2: Import Uni-Mol

[1]
from unimol_tools import MolTrain, MolPredict
import numpy as np
/opt/conda/lib/python3.8/site-packages/scipy/__init__.py:146: UserWarning: A NumPy version >=1.16.5 and <1.23.0 is required for this version of SciPy (detected version 1.24.4
  warnings.warn(f"A NumPy version >={np_minversion} and <{np_maxversion}"
2024-10-30 18:51:27 | unimol_tools/weights/weighthub.py | 17 | INFO | Uni-Mol Tools | Weights will be downloaded to default directory: /opt/conda/lib/python3.8/site-packages/unimol_tools/weights

Step 3: Feed in the data and train

  • Note that the fields of eps_train (e.g. eps_train["target"]) must be converted into types that custom_data can consume
[4]
# First, check how many elements the data contains
print(eps_train["target"].shape)
print(eps_train["atoms"].shape)
print(eps_train["coord"].shape)

# Then look at the first 5 elements of each of the three fields
print("target = \n ",eps_train["target"][:5])
print("atoms = \n", eps_train["atoms"][:5])
print("coord = \n", eps_train["coord"][:5])
(500,)
(500,)
(500,)
target = 
  [2.3152 2.2284 2.2085 2.202  2.1969]
atoms = 
 0     [B, O, C, O, C, O, C, H, H, H, H, H, H, H, H, H]
1    [B, O, C, C, C, C, O, C, C, C, C, O, C, C, C, ...
2    [B, O, C, C, C, C, C, C, O, C, C, C, C, C, C, ...
3    [B, O, C, C, C, C, C, C, C, O, C, C, C, C, C, ...
4    [B, O, C, C, C, C, C, C, C, C, O, C, C, C, C, ...
Name: 0, dtype: object
coord = 
 0    [[0.06362793, 0.11435036, 0.4628148], [-0.8563...
1    [[0.25843996, -0.104826115, 0.990576], [1.6314...
2    [[-0.16161238, -0.11499522, -0.22042371], [-0....
3    [[-1.0876278, 0.6680897, -0.10026416], [-1.939...
4    [[-0.0011945401, -0.18294899, -0.736398], [-0....
Name: 1, dtype: object
[6]
import os
# Route outbound downloads (e.g. the pretrained weights) through the platform proxy
os.environ['HTTP_PROXY'] = 'http://ga.dp.tech:8118'
os.environ['HTTPS_PROXY'] = 'http://ga.dp.tech:8118'
[7]
# Convert the data format
custom_data = {'target': eps_train["target"],
               'atoms': eps_train["atoms"].to_list(),        # atoms must be converted to a list
               'coordinates': eps_train["coord"].to_list(),  # coordinates must be converted to a list
               'target_scaler': "none"}

# Call Uni-Mol to train the model
clf = MolTrain(task='regression',          # regression task
               data_type='molecule',
               epochs=50,                  # number of training epochs
               learning_rate=0.0001,       # learning rate: step size of parameter updates per iteration
               batch_size=16,
               early_stopping=6,
               metrics='mse',
               split='random',
               save_path='./data/eps_train',  # model output path
               )
clf.fit(custom_data)
2024-10-30 18:52:15 | unimol_tools/data/datareader.py | 188 | INFO | Uni-Mol Tools | Anomaly clean with 3 sigma threshold: 500 -> 488
2024-10-30 18:52:15 | unimol_tools/weights/weighthub.py | 33 | INFO | Uni-Mol Tools | Downloading mol.dict.txt
2024-10-30 18:52:17 | unimol_tools/train.py | 172 | INFO | Uni-Mol Tools | Output directory already exists: ./data/eps_train
2024-10-30 18:52:17 | unimol_tools/train.py | 173 | INFO | Uni-Mol Tools | Warning: Overwrite output directory: ./data/eps_train
2024-10-30 18:52:18 | unimol_tools/weights/weighthub.py | 33 | INFO | Uni-Mol Tools | Downloading mol_pre_all_h_220816.pt
2024-10-30 18:52:39 | unimol_tools/models/unimol.py | 120 | INFO | Uni-Mol Tools | Loading pretrained weights from /opt/conda/lib/python3.8/site-packages/unimol_tools/weights/mol_pre_all_h_220816.pt
2024-10-30 18:52:39 | unimol_tools/models/nnmodel.py | 142 | INFO | Uni-Mol Tools | start training Uni-Mol:unimolv1
2024-10-30 18:52:50 | unimol_tools/tasks/trainer.py | 210 | INFO | Uni-Mol Tools | Epoch [1/50] train_loss: 0.9487, val_loss: 1.4193, val_mse: 239.3661, lr: 0.000067, 8.3s
2024-10-30 18:52:53 | unimol_tools/tasks/trainer.py | 210 | INFO | Uni-Mol Tools | Epoch [2/50] train_loss: 0.7192, val_loss: 0.9711, val_mse: 162.9599, lr: 0.000099, 3.1s
2024-10-30 18:52:57 | unimol_tools/tasks/trainer.py | 210 | INFO | Uni-Mol Tools | Epoch [3/50] train_loss: 0.4907, val_loss: 0.5596, val_mse: 95.2991, lr: 0.000097, 3.0s
2024-10-30 18:53:01 | unimol_tools/tasks/trainer.py | 210 | INFO | Uni-Mol Tools | Epoch [4/50] train_loss: 0.3498, val_loss: 0.3798, val_mse: 62.6049, lr: 0.000095, 3.2s
2024-10-30 18:53:05 | unimol_tools/tasks/trainer.py | 210 | INFO | Uni-Mol Tools | Epoch [5/50] train_loss: 0.2490, val_loss: 0.2616, val_mse: 41.5696, lr: 0.000093, 3.1s
2024-10-30 18:53:08 | unimol_tools/tasks/trainer.py | 210 | INFO | Uni-Mol Tools | Epoch [6/50] train_loss: 0.2007, val_loss: 0.1590, val_mse: 25.4164, lr: 0.000091, 3.2s
2024-10-30 18:53:12 | unimol_tools/tasks/trainer.py | 210 | INFO | Uni-Mol Tools | Epoch [7/50] train_loss: 0.1847, val_loss: 0.1616, val_mse: 26.3675, lr: 0.000089, 3.2s
2024-10-30 18:53:16 | unimol_tools/tasks/trainer.py | 210 | INFO | Uni-Mol Tools | Epoch [8/50] train_loss: 0.1695, val_loss: 0.4549, val_mse: 66.6184, lr: 0.000087, 3.7s
2024-10-30 18:53:19 | unimol_tools/tasks/trainer.py | 210 | INFO | Uni-Mol Tools | Epoch [9/50] train_loss: 0.2044, val_loss: 0.0980, val_mse: 16.3581, lr: 0.000085, 3.3s
2024-10-30 18:53:23 | unimol_tools/tasks/trainer.py | 210 | INFO | Uni-Mol Tools | Epoch [10/50] train_loss: 0.1089, val_loss: 0.2026, val_mse: 29.2137, lr: 0.000082, 2.6s
2024-10-30 18:53:26 | unimol_tools/tasks/trainer.py | 210 | INFO | Uni-Mol Tools | Epoch [11/50] train_loss: 0.1157, val_loss: 0.2097, val_mse: 33.9684, lr: 0.000080, 3.3s
2024-10-30 18:53:29 | unimol_tools/tasks/trainer.py | 210 | INFO | Uni-Mol Tools | Epoch [12/50] train_loss: 0.1556, val_loss: 0.1289, val_mse: 21.0683, lr: 0.000078, 3.6s
2024-10-30 18:53:33 | unimol_tools/tasks/trainer.py | 210 | INFO | Uni-Mol Tools | Epoch [13/50] train_loss: 0.1155, val_loss: 0.1449, val_mse: 23.1306, lr: 0.000076, 3.4s
2024-10-30 18:53:36 | unimol_tools/tasks/trainer.py | 210 | INFO | Uni-Mol Tools | Epoch [14/50] train_loss: 0.1097, val_loss: 0.2290, val_mse: 38.9350, lr: 0.000074, 3.5s
2024-10-30 18:53:40 | unimol_tools/tasks/trainer.py | 210 | INFO | Uni-Mol Tools | Epoch [15/50] train_loss: 0.0913, val_loss: 0.1825, val_mse: 29.4327, lr: 0.000072, 3.5s
2024-10-30 18:53:40 | unimol_tools/utils/metrics.py | 234 | WARNING | Uni-Mol Tools | Early stopping at epoch: 15
2024-10-30 18:53:40 | unimol_tools/tasks/trainer.py | 300 | INFO | Uni-Mol Tools | load model success!
2024-10-30 18:53:40 | unimol_tools/models/nnmodel.py | 168 | INFO | Uni-Mol Tools | fold 0, result {'mse': 16.358067, 'mae': 2.902657, 'pearsonr': 0.967632626794024, 'spearmanr': 0.9231102723843637, 'r2': 0.9277723130335606}
2024-10-30 18:53:41 | unimol_tools/models/unimol.py | 120 | INFO | Uni-Mol Tools | Loading pretrained weights from /opt/conda/lib/python3.8/site-packages/unimol_tools/weights/mol_pre_all_h_220816.pt
2024-10-30 18:53:44 | unimol_tools/tasks/trainer.py | 210 | INFO | Uni-Mol Tools | Epoch [1/50] train_loss: 0.9966, val_loss: 0.5797, val_mse: 85.4779, lr: 0.000067, 2.9s
2024-10-30 18:53:48 | unimol_tools/tasks/trainer.py | 210 | INFO | Uni-Mol Tools | Epoch [2/50] train_loss: 0.6277, val_loss: 0.6311, val_mse: 96.8549, lr: 0.000099, 3.5s
2024-10-30 18:53:51 | unimol_tools/tasks/trainer.py | 210 | INFO | Uni-Mol Tools | Epoch [3/50] train_loss: 0.6181, val_loss: 0.1901, val_mse: 23.9874, lr: 0.000097, 3.6s
2024-10-30 18:53:55 | unimol_tools/tasks/trainer.py | 210 | INFO | Uni-Mol Tools | Epoch [4/50] train_loss: 0.3189, val_loss: 0.2753, val_mse: 46.2619, lr: 0.000095, 2.8s
2024-10-30 18:53:58 | unimol_tools/tasks/trainer.py | 210 | INFO | Uni-Mol Tools | Epoch [5/50] train_loss: 0.4863, val_loss: 0.3716, val_mse: 36.9568, lr: 0.000093, 3.7s
2024-10-30 18:54:02 | unimol_tools/tasks/trainer.py | 210 | INFO | Uni-Mol Tools | Epoch [6/50] train_loss: 0.3001, val_loss: 0.3793, val_mse: 54.2588, lr: 0.000091, 3.3s
2024-10-30 18:54:05 | unimol_tools/tasks/trainer.py | 210 | INFO | Uni-Mol Tools | Epoch [7/50] train_loss: 0.2459, val_loss: 0.1207, val_mse: 19.8224, lr: 0.000089, 3.8s
2024-10-30 18:54:09 | unimol_tools/tasks/trainer.py | 210 | INFO | Uni-Mol Tools | Epoch [8/50] train_loss: 0.2451, val_loss: 0.0913, val_mse: 15.5071, lr: 0.000087, 2.9s
2024-10-30 18:54:13 | unimol_tools/tasks/trainer.py | 210 | INFO | Uni-Mol Tools | Epoch [9/50] train_loss: 0.1476, val_loss: 0.1898, val_mse: 19.6942, lr: 0.000085, 3.1s
2024-10-30 18:54:16 | unimol_tools/tasks/trainer.py | 210 | INFO | Uni-Mol Tools | Epoch [10/50] train_loss: 0.1945, val_loss: 0.0696, val_mse: 11.0326, lr: 0.000082, 3.7s
2024-10-30 18:54:20 | unimol_tools/tasks/trainer.py | 210 | INFO | Uni-Mol Tools | Epoch [11/50] train_loss: 0.1466, val_loss: 0.1104, val_mse: 16.2020, lr: 0.000080, 3.0s
2024-10-30 18:54:24 | unimol_tools/tasks/trainer.py | 210 | INFO | Uni-Mol Tools | Epoch [12/50] train_loss: 0.1532, val_loss: 0.1786, val_mse: 14.4919, lr: 0.000078, 3.8s
2024-10-30 18:54:27 | unimol_tools/tasks/trainer.py | 210 | INFO | Uni-Mol Tools | Epoch [13/50] train_loss: 0.0928, val_loss: 0.0429, val_mse: 6.6974, lr: 0.000076, 3.6s
2024-10-30 18:54:31 | unimol_tools/tasks/trainer.py | 210 | INFO | Uni-Mol Tools | Epoch [14/50] train_loss: 0.1106, val_loss: 0.0827, val_mse: 12.4698, lr: 0.000074, 3.0s
2024-10-30 18:54:35 | unimol_tools/tasks/trainer.py | 210 | INFO | Uni-Mol Tools | Epoch [15/50] train_loss: 0.0779, val_loss: 0.0620, val_mse: 9.1849, lr: 0.000072, 3.7s
2024-10-30 18:54:38 | unimol_tools/tasks/trainer.py | 210 | INFO | Uni-Mol Tools | Epoch [16/50] train_loss: 0.0900, val_loss: 0.0892, val_mse: 13.9669, lr: 0.000070, 3.5s
2024-10-30 18:54:42 | unimol_tools/tasks/trainer.py | 210 | INFO | Uni-Mol Tools | Epoch [17/50] train_loss: 0.0746, val_loss: 0.0793, val_mse: 11.9350, lr: 0.000068, 3.8s
2024-10-30 18:54:46 | unimol_tools/tasks/trainer.py | 210 | INFO | Uni-Mol Tools | Epoch [18/50] train_loss: 0.0821, val_loss: 0.0622, val_mse: 10.1121, lr: 0.000066, 3.6s
2024-10-30 18:54:50 | unimol_tools/tasks/trainer.py | 210 | INFO | Uni-Mol Tools | Epoch [19/50] train_loss: 0.0862, val_loss: 0.0777, val_mse: 10.3746, lr: 0.000064, 3.8s
2024-10-30 18:54:50 | unimol_tools/utils/metrics.py | 234 | WARNING | Uni-Mol Tools | Early stopping at epoch: 19
2024-10-30 18:54:50 | unimol_tools/tasks/trainer.py | 300 | INFO | Uni-Mol Tools | load model success!
2024-10-30 18:54:50 | unimol_tools/models/nnmodel.py | 168 | INFO | Uni-Mol Tools | fold 1, result {'mse': 6.6973553, 'mae': 1.8395964, 'pearsonr': 0.9778024754189085, 'spearmanr': 0.9256978183173553, 'r2': 0.9433610444628509}
2024-10-30 18:54:50 | unimol_tools/models/unimol.py | 120 | INFO | Uni-Mol Tools | Loading pretrained weights from /opt/conda/lib/python3.8/site-packages/unimol_tools/weights/mol_pre_all_h_220816.pt
2024-10-30 18:54:53 | unimol_tools/tasks/trainer.py | 210 | INFO | Uni-Mol Tools | Epoch [1/50] train_loss: 0.9835, val_loss: 1.8115, val_mse: 143.6927, lr: 0.000067, 2.9s
2024-10-30 18:54:57 | unimol_tools/tasks/trainer.py | 210 | INFO | Uni-Mol Tools | Epoch [2/50] train_loss: 0.7318, val_loss: 1.9021, val_mse: 203.5993, lr: 0.000099, 3.4s
2024-10-30 18:55:01 | unimol_tools/tasks/trainer.py | 210 | INFO | Uni-Mol Tools | Epoch [3/50] train_loss: 0.3949, val_loss: 0.5701, val_mse: 64.8354, lr: 0.000097, 3.8s
2024-10-30 18:55:05 | unimol_tools/tasks/trainer.py | 210 | INFO | Uni-Mol Tools | Epoch [4/50] train_loss: 0.2270, val_loss: 0.5450, val_mse: 85.0188, lr: 0.000095, 3.2s
2024-10-30 18:55:09 | unimol_tools/tasks/trainer.py | 210 | INFO | Uni-Mol Tools | Epoch [5/50] train_loss: 0.1995, val_loss: 0.6744, val_mse: 71.4981, lr: 0.000093, 3.8s
2024-10-30 18:55:12 | unimol_tools/tasks/trainer.py | 210 | INFO | Uni-Mol Tools | Epoch [6/50] train_loss: 0.1956, val_loss: 0.4406, val_mse: 63.2957, lr: 0.000091, 3.6s
2024-10-30 18:55:16 | unimol_tools/tasks/trainer.py | 210 | INFO | Uni-Mol Tools | Epoch [7/50] train_loss: 0.1722, val_loss: 0.4095, val_mse: 67.0647, lr: 0.000089, 3.0s
2024-10-30 18:55:20 | unimol_tools/tasks/trainer.py | 210 | INFO | Uni-Mol Tools | Epoch [8/50] train_loss: 0.1746, val_loss: 0.7468, val_mse: 109.2743, lr: 0.000087, 3.8s
2024-10-30 18:55:23 | unimol_tools/tasks/trainer.py | 210 | INFO | Uni-Mol Tools | Epoch [9/50] train_loss: 0.1710, val_loss: 0.5659, val_mse: 94.1846, lr: 0.000085, 3.6s
2024-10-30 18:55:27 | unimol_tools/tasks/trainer.py | 210 | INFO | Uni-Mol Tools | Epoch [10/50] train_loss: 0.1688, val_loss: 0.7746, val_mse: 91.8199, lr: 0.000082, 3.7s
2024-10-30 18:55:31 | unimol_tools/tasks/trainer.py | 210 | INFO | Uni-Mol Tools | Epoch [11/50] train_loss: 0.0936, val_loss: 0.3859, val_mse: 61.0519, lr: 0.000080, 3.6s
2024-10-30 18:55:34 | unimol_tools/tasks/trainer.py | 210 | INFO | Uni-Mol Tools | Epoch [12/50] train_loss: 0.1667, val_loss: 0.4498, val_mse: 57.2472, lr: 0.000078, 2.7s
2024-10-30 18:55:38 | unimol_tools/tasks/trainer.py | 210 | INFO | Uni-Mol Tools | Epoch [13/50] train_loss: 0.1029, val_loss: 0.3451, val_mse: 49.2831, lr: 0.000076, 3.2s
2024-10-30 18:55:42 | unimol_tools/tasks/trainer.py | 210 | INFO | Uni-Mol Tools | Epoch [14/50] train_loss: 0.1205, val_loss: 0.3221, val_mse: 53.8555, lr: 0.000074, 3.1s
2024-10-30 18:55:45 | unimol_tools/tasks/trainer.py | 210 | INFO | Uni-Mol Tools | Epoch [15/50] train_loss: 0.0900, val_loss: 0.3122, val_mse: 45.2409, lr: 0.000072, 3.8s
2024-10-30 18:55:49 | unimol_tools/tasks/trainer.py | 210 | INFO | Uni-Mol Tools | Epoch [16/50] train_loss: 0.0885, val_loss: 0.3731, val_mse: 55.0695, lr: 0.000070, 3.0s
2024-10-30 18:55:53 | unimol_tools/tasks/trainer.py | 210 | INFO | Uni-Mol Tools | Epoch [17/50] train_loss: 0.0703, val_loss: 0.3027, val_mse: 49.0239, lr: 0.000068, 3.7s
2024-10-30 18:55:56 | unimol_tools/tasks/trainer.py | 210 | INFO | Uni-Mol Tools | Epoch [18/50] train_loss: 0.0661, val_loss: 0.2141, val_mse: 35.3969, lr: 0.000066, 3.6s
2024-10-30 18:56:00 | unimol_tools/tasks/trainer.py | 210 | INFO | Uni-Mol Tools | Epoch [19/50] train_loss: 0.0671, val_loss: 0.2914, val_mse: 45.1734, lr: 0.000064, 2.9s
2024-10-30 18:56:04 | unimol_tools/tasks/trainer.py | 210 | INFO | Uni-Mol Tools | Epoch [20/50] train_loss: 0.0992, val_loss: 0.3765, val_mse: 61.0506, lr: 0.000062, 3.7s
2024-10-30 18:56:07 | unimol_tools/tasks/trainer.py | 210 | INFO | Uni-Mol Tools | Epoch [21/50] train_loss: 0.0824, val_loss: 0.3653, val_mse: 55.4449, lr: 0.000060, 3.8s
2024-10-30 18:56:11 | unimol_tools/tasks/trainer.py | 210 | INFO | Uni-Mol Tools | Epoch [22/50] train_loss: 0.0734, val_loss: 0.3564, val_mse: 54.5551, lr: 0.000058, 3.6s
2024-10-30 18:56:15 | unimol_tools/tasks/trainer.py | 210 | INFO | Uni-Mol Tools | Epoch [23/50] train_loss: 0.0522, val_loss: 0.2424, val_mse: 37.1741, lr: 0.000056, 3.4s
2024-10-30 18:56:18 | unimol_tools/tasks/trainer.py | 210 | INFO | Uni-Mol Tools | Epoch [24/50] train_loss: 0.0642, val_loss: 0.3501, val_mse: 52.2948, lr: 0.000054, 3.5s
2024-10-30 18:56:18 | unimol_tools/utils/metrics.py | 234 | WARNING | Uni-Mol Tools | Early stopping at epoch: 24
2024-10-30 18:56:18 | unimol_tools/tasks/trainer.py | 300 | INFO | Uni-Mol Tools | load model success!
2024-10-30 18:56:18 | unimol_tools/models/nnmodel.py | 168 | INFO | Uni-Mol Tools | fold 2, result {'mse': 35.39689, 'mae': 2.1902883, 'pearsonr': 0.9049354599510534, 'spearmanr': 0.9653358665910773, 'r2': 0.8100617887132153}
2024-10-30 18:56:19 | unimol_tools/models/unimol.py | 120 | INFO | Uni-Mol Tools | Loading pretrained weights from /opt/conda/lib/python3.8/site-packages/unimol_tools/weights/mol_pre_all_h_220816.pt
2024-10-30 18:56:22 | unimol_tools/tasks/trainer.py | 210 | INFO | Uni-Mol Tools | Epoch [1/50] train_loss: 1.1102, val_loss: 1.0928, val_mse: 169.5346, lr: 0.000067, 3.1s
2024-10-30 18:56:26 | unimol_tools/tasks/trainer.py | 210 | INFO | Uni-Mol Tools | Epoch [2/50] train_loss: 0.7322, val_loss: 0.5821, val_mse: 75.8321, lr: 0.000099, 4.1s
2024-10-30 18:56:30 | unimol_tools/tasks/trainer.py | 210 | INFO | Uni-Mol Tools | Epoch [3/50] train_loss: 0.4103, val_loss: 0.2882, val_mse: 24.1046, lr: 0.000097, 2.6s
2024-10-30 18:56:34 | unimol_tools/tasks/trainer.py | 210 | INFO | Uni-Mol Tools | Epoch [4/50] train_loss: 0.3610, val_loss: 0.4471, val_mse: 62.5234, lr: 0.000095, 3.4s
2024-10-30 18:56:38 | unimol_tools/tasks/trainer.py | 210 | INFO | Uni-Mol Tools | Epoch [5/50] train_loss: 0.3122, val_loss: 0.0974, val_mse: 16.5930, lr: 0.000093, 4.0s
2024-10-30 18:56:41 | unimol_tools/tasks/trainer.py | 210 | INFO | Uni-Mol Tools | Epoch [6/50] train_loss: 0.2876, val_loss: 0.1348, val_mse: 18.3795, lr: 0.000091, 3.1s
2024-10-30 18:56:45 | unimol_tools/tasks/trainer.py | 210 | INFO | Uni-Mol Tools | Epoch [7/50] train_loss: 0.1551, val_loss: 0.1095, val_mse: 17.6756, lr: 0.000089, 3.8s
2024-10-30 18:56:49 | unimol_tools/tasks/trainer.py | 210 | INFO | Uni-Mol Tools | Epoch [8/50] train_loss: 0.1825, val_loss: 0.0912, val_mse: 12.6152, lr: 0.000087, 3.7s
2024-10-30 18:56:52 | unimol_tools/tasks/trainer.py | 210 | INFO | Uni-Mol Tools | Epoch [9/50] train_loss: 0.1239, val_loss: 0.1088, val_mse: 12.3743, lr: 0.000085, 2.9s
2024-10-30 18:56:56 | unimol_tools/tasks/trainer.py | 210 | INFO | Uni-Mol Tools | Epoch [10/50] train_loss: 0.1348, val_loss: 0.1088, val_mse: 18.1514, lr: 0.000082, 3.3s
2024-10-30 18:57:00 | unimol_tools/tasks/trainer.py | 210 | INFO | Uni-Mol Tools | Epoch [11/50] train_loss: 0.1108, val_loss: 0.0867, val_mse: 14.5609, lr: 0.000080, 4.0s
2024-10-30 18:57:04 | unimol_tools/tasks/trainer.py | 210 | INFO | Uni-Mol Tools | Epoch [12/50] train_loss: 0.0990, val_loss: 0.0689, val_mse: 11.2599, lr: 0.000078, 3.8s
2024-10-30 18:57:07 | unimol_tools/tasks/trainer.py | 210 | INFO | Uni-Mol Tools | Epoch [13/50] train_loss: 0.1191, val_loss: 0.0660, val_mse: 10.8328, lr: 0.000076, 2.9s
2024-10-30 18:57:11 | unimol_tools/tasks/trainer.py | 210 | INFO | Uni-Mol Tools | Epoch [14/50] train_loss: 0.1102, val_loss: 0.0655, val_mse: 8.6121, lr: 0.000074, 3.1s
2024-10-30 18:57:15 | unimol_tools/tasks/trainer.py | 210 | INFO | Uni-Mol Tools | Epoch [15/50] train_loss: 0.0885, val_loss: 0.0369, val_mse: 5.3566, lr: 0.000072, 2.9s
2024-10-30 18:57:18 | unimol_tools/tasks/trainer.py | 210 | INFO | Uni-Mol Tools | Epoch [16/50] train_loss: 0.0731, val_loss: 0.0671, val_mse: 5.3301, lr: 0.000070, 3.2s
2024-10-30 18:57:22 | unimol_tools/tasks/trainer.py | 210 | INFO | Uni-Mol Tools | Epoch [17/50] train_loss: 0.0827, val_loss: 0.0428, val_mse: 4.8061, lr: 0.000068, 3.1s
2024-10-30 18:57:26 | unimol_tools/tasks/trainer.py | 210 | INFO | Uni-Mol Tools | Epoch [18/50] train_loss: 0.1269, val_loss: 0.0520, val_mse: 5.7840, lr: 0.000066, 3.3s
2024-10-30 18:57:30 | unimol_tools/tasks/trainer.py | 210 | INFO | Uni-Mol Tools | Epoch [19/50] train_loss: 0.0665, val_loss: 0.1237, val_mse: 9.5863, lr: 0.000064, 3.8s
2024-10-30 18:57:34 | unimol_tools/tasks/trainer.py | 210 | INFO | Uni-Mol Tools | Epoch [20/50] train_loss: 0.0646, val_loss: 0.0681, val_mse: 11.3682, lr: 0.000062, 3.7s
2024-10-30 18:57:37 | unimol_tools/tasks/trainer.py | 210 | INFO | Uni-Mol Tools | Epoch [21/50] train_loss: 0.0614, val_loss: 0.0397, val_mse: 5.8202, lr: 0.000060, 3.9s
2024-10-30 18:57:41 | unimol_tools/tasks/trainer.py | 210 | INFO | Uni-Mol Tools | Epoch [22/50] train_loss: 0.0479, val_loss: 0.0772, val_mse: 12.5656, lr: 0.000058, 3.6s
2024-10-30 18:57:45 | unimol_tools/tasks/trainer.py | 210 | INFO | Uni-Mol Tools | Epoch [23/50] train_loss: 0.0527, val_loss: 0.0369, val_mse: 5.8652, lr: 0.000056, 3.7s
2024-10-30 18:57:45 | unimol_tools/utils/metrics.py | 234 | WARNING | Uni-Mol Tools | Early stopping at epoch: 23
2024-10-30 18:57:45 | unimol_tools/tasks/trainer.py | 300 | INFO | Uni-Mol Tools | load model success!
2024-10-30 18:57:45 | unimol_tools/models/nnmodel.py | 168 | INFO | Uni-Mol Tools | fold 3, result {'mse': 4.8060517, 'mae': 1.4244763, 'pearsonr': 0.9723111374896243, 'spearmanr': 0.9507378636770714, 'r2': 0.9439251424712938}
2024-10-30 18:57:45 | unimol_tools/models/unimol.py | 120 | INFO | Uni-Mol Tools | Loading pretrained weights from /opt/conda/lib/python3.8/site-packages/unimol_tools/weights/mol_pre_all_h_220816.pt
2024-10-30 18:57:49 | unimol_tools/tasks/trainer.py | 210 | INFO | Uni-Mol Tools | Epoch [1/50] train_loss: 1.0867, val_loss: 0.4868, val_mse: 77.0654, lr: 0.000067, 3.0s
2024-10-30 18:57:53 | unimol_tools/tasks/trainer.py | 210 | INFO | Uni-Mol Tools | Epoch [2/50] train_loss: 0.5930, val_loss: 0.2804, val_mse: 47.0637, lr: 0.000099, 3.9s
2024-10-30 18:57:57 | unimol_tools/tasks/trainer.py | 210 | INFO | Uni-Mol Tools | Epoch [3/50] train_loss: 0.4688, val_loss: 0.4465, val_mse: 63.6389, lr: 0.000097, 3.0s
2024-10-30 18:58:01 | unimol_tools/tasks/trainer.py | 210 | INFO | Uni-Mol Tools | Epoch [4/50] train_loss: 0.4450, val_loss: 0.2010, val_mse: 33.6601, lr: 0.000095, 4.0s
2024-10-30 18:58:04 | unimol_tools/tasks/trainer.py | 210 | INFO | Uni-Mol Tools | Epoch [5/50] train_loss: 0.3592, val_loss: 0.0907, val_mse: 15.6597, lr: 0.000093, 3.2s
2024-10-30 18:58:08 | unimol_tools/tasks/trainer.py | 210 | INFO | Uni-Mol Tools | Epoch [6/50] train_loss: 0.2285, val_loss: 0.2289, val_mse: 39.2793, lr: 0.000091, 3.3s
2024-10-30 18:58:12 | unimol_tools/tasks/trainer.py | 210 | INFO | Uni-Mol Tools | Epoch [7/50] train_loss: 0.1830, val_loss: 0.1165, val_mse: 20.1408, lr: 0.000089, 4.0s
2024-10-30 18:58:16 | unimol_tools/tasks/trainer.py | 210 | INFO | Uni-Mol Tools | Epoch [8/50] train_loss: 0.1834, val_loss: 0.1995, val_mse: 31.4781, lr: 0.000087, 3.7s
2024-10-30 18:58:20 | unimol_tools/tasks/trainer.py | 210 | INFO | Uni-Mol Tools | Epoch [9/50] train_loss: 0.1941, val_loss: 0.0849, val_mse: 14.5418, lr: 0.000085, 3.7s
2024-10-30 18:58:23 | unimol_tools/tasks/trainer.py | 210 | INFO | Uni-Mol Tools | Epoch [10/50] train_loss: 0.1608, val_loss: 0.0916, val_mse: 15.8454, lr: 0.000082, 3.1s
2024-10-30 18:58:27 | unimol_tools/tasks/trainer.py | 210 | INFO | Uni-Mol Tools | Epoch [11/50] train_loss: 0.1031, val_loss: 0.0851, val_mse: 14.2308, lr: 0.000080, 3.9s
2024-10-30 18:58:31 | unimol_tools/tasks/trainer.py | 210 | INFO | Uni-Mol Tools | Epoch [12/50] train_loss: 0.0967, val_loss: 0.1789, val_mse: 30.6435, lr: 0.000078, 3.1s
2024-10-30 18:58:35 | unimol_tools/tasks/trainer.py | 210 | INFO | Uni-Mol Tools | Epoch [13/50] train_loss: 0.1327, val_loss: 0.0779, val_mse: 13.3681, lr: 0.000076, 3.8s
2024-10-30 18:58:39 | unimol_tools/tasks/trainer.py | 210 | INFO | Uni-Mol Tools | Epoch [14/50] train_loss: 0.1012, val_loss: 0.0784, val_mse: 12.6557, lr: 0.000074, 3.2s
2024-10-30 18:58:43 | unimol_tools/tasks/trainer.py | 210 | INFO | Uni-Mol Tools | Epoch [15/50] train_loss: 0.0883, val_loss: 0.0820, val_mse: 14.1586, lr: 0.000072, 3.2s
2024-10-30 18:58:46 | unimol_tools/tasks/trainer.py | 210 | INFO | Uni-Mol Tools | Epoch [16/50] train_loss: 0.0831, val_loss: 0.0750, val_mse: 12.9639, lr: 0.000070, 3.8s
2024-10-30 18:58:50 | unimol_tools/tasks/trainer.py | 210 | INFO | Uni-Mol Tools | Epoch [17/50] train_loss: 0.0784, val_loss: 0.0893, val_mse: 15.4357, lr: 0.000068, 3.6s
2024-10-30 18:58:54 | unimol_tools/tasks/trainer.py | 210 | INFO | Uni-Mol Tools | Epoch [18/50] train_loss: 0.0966, val_loss: 0.1489, val_mse: 25.0170, lr: 0.000066, 3.9s
2024-10-30 18:58:57 | unimol_tools/tasks/trainer.py | 210 | INFO | Uni-Mol Tools | Epoch [19/50] train_loss: 0.0875, val_loss: 0.0713, val_mse: 11.8988, lr: 0.000064, 3.5s
2024-10-30 18:59:01 | unimol_tools/tasks/trainer.py | 210 | INFO | Uni-Mol Tools | Epoch [20/50] train_loss: 0.0640, val_loss: 0.0701, val_mse: 11.8955, lr: 0.000062, 3.0s
2024-10-30 18:59:05 | unimol_tools/tasks/trainer.py | 210 | INFO | Uni-Mol Tools | Epoch [21/50] train_loss: 0.0618, val_loss: 0.0617, val_mse: 10.3834, lr: 0.000060, 3.1s
2024-10-30 18:59:09 | unimol_tools/tasks/trainer.py | 210 | INFO | Uni-Mol Tools | Epoch [22/50] train_loss: 0.0661, val_loss: 0.0701, val_mse: 12.1243, lr: 0.000058, 3.2s
2024-10-30 18:59:12 | unimol_tools/tasks/trainer.py | 210 | INFO | Uni-Mol Tools | Epoch [23/50] train_loss: 0.1084, val_loss: 0.0835, val_mse: 14.0836, lr: 0.000056, 3.7s
2024-10-30 18:59:16 | unimol_tools/tasks/trainer.py | 210 | INFO | Uni-Mol Tools | Epoch [24/50] train_loss: 0.0817, val_loss: 0.0698, val_mse: 12.0134, lr: 0.000054, 3.8s
2024-10-30 18:59:20 | unimol_tools/tasks/trainer.py | 210 | INFO | Uni-Mol Tools | Epoch [25/50] train_loss: 0.0541, val_loss: 0.0799, val_mse: 13.8078, lr: 0.000052, 3.7s
2024-10-30 18:59:24 | unimol_tools/tasks/trainer.py | 210 | INFO | Uni-Mol Tools | Epoch [26/50] train_loss: 0.0595, val_loss: 0.0728, val_mse: 12.5811, lr: 0.000049, 3.8s
2024-10-30 18:59:28 | unimol_tools/tasks/trainer.py | 210 | INFO | Uni-Mol Tools | Epoch [27/50] train_loss: 0.0445, val_loss: 0.0722, val_mse: 12.3370, lr: 0.000047, 3.9s
2024-10-30 18:59:28 | unimol_tools/utils/metrics.py | 234 | WARNING | Uni-Mol Tools | Early stopping at epoch: 27
2024-10-30 18:59:28 | unimol_tools/tasks/trainer.py | 300 | INFO | Uni-Mol Tools | load model success!
2024-10-30 18:59:28 | unimol_tools/models/nnmodel.py | 168 | INFO | Uni-Mol Tools | fold 4, result {'mse': 10.38339, 'mae': 1.5399003, 'pearsonr': 0.9623616014328419, 'spearmanr': 0.9253816719408297, 'r2': 0.9153632278229847}
2024-10-30 18:59:28 | unimol_tools/models/nnmodel.py | 183 | INFO | Uni-Mol Tools | Uni-Mol metrics score: 
{'mse': 14.757585689501116, 'mae': 1.9814213305232533, 'pearsonr': 0.9498405407741434, 'spearmanr': 0.9300546922647814, 'r2': 0.9014728617730194}
2024-10-30 18:59:28 | unimol_tools/models/nnmodel.py | 184 | INFO | Uni-Mol Tools | Uni-Mol & Metric result saved!
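The scores in the log above (mse, mae, pearsonr, r2) follow the standard regression definitions and can be reproduced from raw predictions with NumPy alone; a minimal sketch with illustrative arrays (not data from this run — spearmanr, the rank-based analogue, would additionally need `scipy.stats.spearmanr`):

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    # mean squared error and mean absolute error of the residuals
    err = y_pred - y_true
    mse = float(np.mean(err ** 2))
    mae = float(np.mean(np.abs(err)))
    # Pearson correlation between truth and prediction
    pearsonr = float(np.corrcoef(y_true, y_pred)[0, 1])
    # coefficient of determination R^2 = 1 - SS_res / SS_tot
    ss_res = np.sum(err ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    r2 = float(1.0 - ss_res / ss_tot)
    return {"mse": mse, "mae": mae, "pearsonr": pearsonr, "r2": r2}

# illustrative values only
y_true = np.array([2.0, 4.0, 6.0, 8.0])
y_pred = np.array([2.1, 3.9, 6.2, 7.8])
m = regression_metrics(y_true, y_pred)
```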

Step4: Fine-tuning hyperparameters (skipped)

  • Hyperparameter tuning was covered in a previous case study and is not repeated here; see the BBBP case: Open In Bohrium
  • As an exercise, try adding a code cell to fine-tune the hyperparameters
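As a starting point for that exercise, one way to organize a sweep is to enumerate hyperparameter combinations with `itertools.product`. The candidate values below are illustrative assumptions, not recommendations from this tutorial, and the commented `MolTrain` call is an assumption based on the training step — verify the exact keyword names against the unimol_tools documentation:

```python
from itertools import product

# candidate hyperparameters to sweep (illustrative values)
grid = {
    "learning_rate": [1e-4, 5e-5],
    "batch_size": [16, 32],
    "epochs": [50, 100],
}

# Cartesian product of all candidate values -> list of parameter dicts
combos = [dict(zip(grid, values)) for values in product(*grid.values())]

# For each combination, rebuild the trainer and refit, keeping the best
# validation score, e.g. (signature assumed, check the unimol_tools docs):
# for params in combos:
#     clf = MolTrain(task='regression', save_path='./data/eps_train', **params)
#     clf.fit(train_data)
```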

Step5: Reading molecular conformers to predict the dielectric constant

The test set uses the same data format as the training set: a sequence of atom types plus a sequence of atomic coordinates

[8]
# Load conformers for prediction
import pickle
import numpy as np
from unimol_tools import MolPredict

with open('./data/eps_data_test.pkl', 'rb') as f:  # 'rb' = read binary, required for pickle
    eps_test = pickle.load(f)  # load the pkl file as a dict

custom_data = {
    'atoms': eps_test["atoms"].to_list(),
    'coordinates': eps_test["coord"].to_list(),
}

clf = MolPredict(load_model='./data/eps_train')
predict = clf.predict(custom_data)
2024-10-30 19:00:25 | unimol_tools/models/unimol.py | 120 | INFO | Uni-Mol Tools | Loading pretrained weights from /opt/conda/lib/python3.8/site-packages/unimol_tools/weights/mol_pre_all_h_220816.pt
2024-10-30 19:00:25 | unimol_tools/models/nnmodel.py | 206 | INFO | Uni-Mol Tools | start predict NNModel:unimolv1
2024-10-30 19:00:25 | unimol_tools/tasks/trainer.py | 300 | INFO | Uni-Mol Tools | load model success!
2024-10-30 19:00:26 | unimol_tools/tasks/trainer.py | 300 | INFO | Uni-Mol Tools | load model success!
2024-10-30 19:00:26 | unimol_tools/tasks/trainer.py | 300 | INFO | Uni-Mol Tools | load model success!
2024-10-30 19:00:27 | unimol_tools/tasks/trainer.py | 300 | INFO | Uni-Mol Tools | load model success!
2024-10-30 19:00:27 | unimol_tools/tasks/trainer.py | 300 | INFO | Uni-Mol Tools | load model success!
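Before calling `predict` on your own data, it can help to sanity-check that the `atoms` and `coordinates` lists line up: one (x, y, z) triple per atom, one entry per molecule. `check_conformers` is a hypothetical helper, and the demo water molecule is illustrative rather than taken from the dataset:

```python
import numpy as np

def check_conformers(data):
    """Verify each molecule has exactly one (x, y, z) coordinate per atom."""
    atoms, coords = data["atoms"], data["coordinates"]
    assert len(atoms) == len(coords), "expected one coordinate array per molecule"
    for i, (a, c) in enumerate(zip(atoms, coords)):
        c = np.asarray(c)
        assert c.shape == (len(a), 3), f"molecule {i}: {len(a)} atoms vs coords {c.shape}"
    return len(atoms)

# dummy example: a single water molecule (illustrative geometry)
demo = {
    "atoms": [["O", "H", "H"]],
    "coordinates": [[[0.0, 0.0, 0.0], [0.96, 0.0, 0.0], [-0.24, 0.93, 0.0]]],
}
n = check_conformers(demo)  # passes: 3 atoms, 3 coordinate triples
```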

The experimental dielectric constants for the test set are in the file https://dp-public.oss-cn-beijing.aliyuncs.com/community/courses/eps_test.csv

[9]
!wget -P ./data/ https://dp-public.oss-cn-beijing.aliyuncs.com/community/courses/eps_test.csv
--2024-10-30 19:00:32--  https://dp-public.oss-cn-beijing.aliyuncs.com/community/courses/eps_test.csv
Resolving ga.dp.tech (ga.dp.tech)... 10.255.254.7, 10.255.254.37, 10.255.254.18
Connecting to ga.dp.tech (ga.dp.tech)|10.255.254.7|:8118... connected.
Proxy request sent, awaiting response... 200 OK
Length: 2190 (2.1K) [text/csv]
Saving to: ‘./data/eps_test.csv’

eps_test.csv        100%[===================>]   2.14K  --.-KB/s    in 0s      

2024-10-30 19:00:32 (401 MB/s) - ‘./data/eps_test.csv’ saved [2190/2190]

[10]
# Visualize the training result by plotting experimental vs. predicted values on the test set
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np

test_set = pd.read_csv("./data/eps_test.csv")  # read the experimental data file
test_eps = test_set["eps"].to_numpy()  # extract the eps values

# Use the combined range of predictions and experiments to set the axis limits
xmin = min(predict.flatten().min(), test_eps.min())
xmax = max(predict.flatten().max(), test_eps.max())
ymin, ymax = xmin, xmax

plt.style.use('seaborn-darkgrid')  # apply the plot style before creating the figure
plt.figure(figsize=(8, 8))  # set the figure size
plt.xlim(xmin, xmax)  # set the x-axis range
plt.ylim(ymin, ymax)  # set the y-axis range
plt.xlabel(r'Predicted $\epsilon$', fontsize=14)  # x axis: predicted eps
plt.ylabel(r'Experimental $\epsilon$', fontsize=14)  # y axis: experimental eps
plt.title(r'Experimental vs Predicted $\epsilon$', fontsize=16)
plt.scatter(predict.flatten(), test_eps, color='blue', alpha=0.6)  # scatter of predicted vs. experimental values
x = np.linspace(*plt.xlim())
plt.plot(x, x, color='red', linestyle='--', linewidth=2)  # y = x reference line: points near it have small errors

plt.show()

  • For molecules the model has never seen, most predictions track the experimental dielectric constants well.
  • Three outliers appear (points whose features differ significantly from the training data), indicating the model predicts these molecules poorly.
    When the test data contains unusual structures, the model may predict those points badly; in such cases the test data may need further cleaning and preprocessing.
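A simple way to flag such outliers programmatically is by residual size: mark points whose residual deviates from the mean residual by more than k standard deviations. `flag_outliers` is a hypothetical helper, shown here on synthetic data rather than this notebook's `predict`/`test_eps` arrays:

```python
import numpy as np

def flag_outliers(y_true, y_pred, k=2.0):
    """Return indices whose residual deviates from the mean by > k std devs."""
    resid = y_pred - y_true
    thresh = k * resid.std()
    return np.where(np.abs(resid - resid.mean()) > thresh)[0]

# synthetic data: perfect predictions except the last point
y_true = np.linspace(5.0, 50.0, 10)
y_pred = y_true.copy()
y_pred[-1] = 90.0  # one strongly deviating prediction
idx = flag_outliers(y_true, y_pred)  # flags only the last index
```

The flagged molecules can then be inspected individually, e.g. to check whether their structures fall outside the chemical space of the training set.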