![](https://bohrium.oss-cn-zhangjiakou.aliyuncs.com/article/27834/08523e7b6f484d2d89b7a5eb272e4566/5f5e67f6-d548-4834-a70f-155aea7a93c9.png?x-oss-process=image/resize,w_100,m_lfit)
![](https://cdn1.deepmd.net/bohrium/web/static/images/level-v2-1.png?x-oss-process=image/resize,w_50,m_lfit)
©️ Copyright 2024 @ Authors
日期:2024-06-17
共享协议:本作品采用知识共享署名-非商业性使用-相同方式共享 4.0 国际许可协议进行许可。
快速开始:点击上方的 开始连接 按钮,选择 deepmd-kit:2024Q1-d23cf3e 镜像及 c2_m4_cpu 节点配置,稍等片刻即可运行。
通过数值模拟方法研究了聚乙烯(PE,聚合度30,单链体系)材料在高温无氧环境下的热解降解反应机理。采用ReaxFF力场,通过LAMMPS平台在无氧环境下模拟了 PE的热解降解反应。使用PCA技术对模拟数据进行降维分析,识别出关键物质在热解降解过程中的变化趋势和重要中间体。同时聚类分析显示部分中间产物之间的相似性,为进一步的反应机制研究提供理论支持。
Principal Component Analysis(PCA)
PCA是一种常用的线性降维技术,其主要思想是通过线性变换将原始特征转换成一组新的正交特征,称为主成分,其中每个主成分都是原始特征的线性组合。PCA的步骤包括计算数据的协方差矩阵、计算特征值和特征向量、选择主成分并投影数据。在化学物质的PCA分析中,这些主成分代表了数据中的主要方差方向,即数据中最显著的变化方向。比如在本实验对大量化学物质的处理中,主成分为所有物质按其特征向量绝对值的大小排列形成的主成分。
我们已经为您准备了运行 LAMMPS 计算所需的lmp_control文件,param.qeq文件,ffield.reax.cho力场文件,data.CHO文件,input.files文件,初始数据,并将其放置在文件夹 pca 中。你可以在左侧点击数据集查看相应文件:
当前路径为:/personal/bohr/lammps-qepa/v4/pca
让我们来查看下载的 pca文件夹:
/personal/bohr/lammps-qepa/v4/pca/ ├── bond.CHO ├── data.CHO ├── ffield.reax.cho ├── input.pe.files ├── lmp_control ├── param.qeq ├── pe.csv └── pe.txt 0 directories, 8 files
在 lammps-qepa 文件夹下有lmp_control文件,param.qeq文件,ffield.reax.cho力场文件,data.CHO文件,input.files文件,清洗前的pe.txt文件,清洗后的pe.csv文件
其中lmp_control 是 LAMMPS(Large-scale Atomic/Molecular Massively Parallel Simulator)计算中用于控制模拟参数和设置的输入文件。它通常用于与 ReaxFF势能函数文件配合使用,以设置模拟的各种参数,如温度,压力,时间步长,模拟时间等。本实验的该文件是通过LAMMPS官方测试用例获得。 param.qeq 是用于描述与电荷平衡(QEq,Charge Equilibration)算法相关的参数。其中有计算体系中的原子电荷分布,以更准确地模拟分子间和分子内的相互作用。这些参数可以帮助计算模拟过程中电荷转移和电荷分布,对模拟涉及复杂反应机制和电荷转移过程的体系例如氧化还原反应、催化反应等特别重要。本实验的该文件亦是通过LAMMPS官方测试用例获得。 ffield.reax.CHO是ReaxFF势能函数,它可以描述C/H/O/N等多种元素之间的相互作用。ReaxFF中包含超过用于描述键能、角势能、电子转移、三体相互作用等多种相互作用的30个参数。该文件已经被广泛应用于热化学反应和界面反应等领域。本实验的该文件是通过文献搜集进行选择获得。 input.files则是本实验的核心,input.files 通常指的是一个包含运行 LAMMPS模拟所需的多个输入文件列表的文件。在某些模拟框架或脚本中,特别是在一些自动化工具或高级使用场景中,这个文件用来统一管理和组织多个需要一起使用的输入文件。本实验将通过修改此文件来获得需要运算的参数。
我们需要对该文件夹中的pe.csv进行分析。
2. 数据清洗
在对pe.txt数据处理之前,需要进行数据处理,将数据规整化,需要进行缺失值补充等,同时需要建立数据字典。
首先,在species.out中(本实验为了方便,统一命名为pe.txt),我们会发现出现了C 和H 标反的错误,可以通过如下代码进行互换
接下来进行数据清洗,同时建立数据字典。
An error occurred: list index out of range Timestep No_Moles No_Specs H242C120 H189C94 H4C2 H49C24 H92C45 H89C45 \ 0 10000 1 1 1 0 0 0 0 0 1 20000 3 3 0 1 1 1 0 0 2 30000 7 6 0 0 2 0 1 1 3 40000 12 8 0 0 4 0 0 0 4 50000 15 9 0 0 7 0 0 0 .. ... ... ... ... ... ... ... ... ... 195 1960000 69 12 0 0 28 0 0 0 196 1970000 69 12 0 0 28 0 0 0 197 1980000 69 12 0 0 28 0 0 0 198 1990000 68 12 0 0 28 0 0 0 199 2000000 68 12 0 0 28 0 0 0 H8C4 ... H14C8 H3C2 H8C5 H4C H2C H6C4 H5C4 H7C5 H6C5 H5C 0 0 ... 0 0 0 0 0 0 0 0 0 0 1 0 ... 0 0 0 0 0 0 0 0 0 0 2 1 ... 0 0 0 0 0 0 0 0 0 0 3 0 ... 0 0 0 0 0 0 0 0 0 0 4 0 ... 0 0 0 0 0 0 0 0 0 0 .. ... ... ... ... ... .. .. ... ... ... ... .. 195 1 ... 0 5 0 3 0 2 0 0 1 0 196 1 ... 0 5 0 3 0 2 0 0 1 0 197 1 ... 0 5 0 3 0 2 0 0 1 0 198 1 ... 0 5 0 3 0 2 0 0 1 0 199 1 ... 0 5 0 3 0 2 0 0 1 0 [200 rows x 56 columns]
由此,我们得到了PE的规整数据文件,pe.csv,可以进入PCA环节。
3. 数据分析
接下来,我们会对pe.csv进行PCA降维和Kmeans聚类分析。
Looking in indexes: https://pypi.tuna.tsinghua.edu.cn/simple Requirement already satisfied: scikit-learn in /opt/mamba/lib/python3.10/site-packages (1.5.0) Requirement already satisfied: numpy>=1.19.5 in /opt/mamba/lib/python3.10/site-packages (from scikit-learn) (1.26.4) Requirement already satisfied: scipy>=1.6.0 in /opt/mamba/lib/python3.10/site-packages (from scikit-learn) (1.11.4) Requirement already satisfied: joblib>=1.2.0 in /opt/mamba/lib/python3.10/site-packages (from scikit-learn) (1.4.2) Requirement already satisfied: threadpoolctl>=3.1.0 in /opt/mamba/lib/python3.10/site-packages (from scikit-learn) (3.5.0) WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv Looking in indexes: https://pypi.tuna.tsinghua.edu.cn/simple Requirement already satisfied: matplotlib in /opt/mamba/lib/python3.10/site-packages (3.9.0) Requirement already satisfied: contourpy>=1.0.1 in /opt/mamba/lib/python3.10/site-packages (from matplotlib) (1.2.1) Requirement already satisfied: cycler>=0.10 in /opt/mamba/lib/python3.10/site-packages (from matplotlib) (0.12.1) Requirement already satisfied: fonttools>=4.22.0 in /opt/mamba/lib/python3.10/site-packages (from matplotlib) (4.53.0) Requirement already satisfied: kiwisolver>=1.3.1 in /opt/mamba/lib/python3.10/site-packages (from matplotlib) (1.4.5) Requirement already satisfied: numpy>=1.23 in /opt/mamba/lib/python3.10/site-packages (from matplotlib) (1.26.4) Requirement already satisfied: packaging>=20.0 in /opt/mamba/lib/python3.10/site-packages (from matplotlib) (23.2) Requirement already satisfied: pillow>=8 in /opt/mamba/lib/python3.10/site-packages (from matplotlib) (10.3.0) Requirement already satisfied: pyparsing>=2.3.1 in /opt/mamba/lib/python3.10/site-packages (from matplotlib) (3.1.2) Requirement already satisfied: python-dateutil>=2.7 in /opt/mamba/lib/python3.10/site-packages (from matplotlib) (2.8.2) Requirement already satisfied: six>=1.5 in /opt/mamba/lib/python3.10/site-packages (from python-dateutil>=2.7->matplotlib) (1.16.0) WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv Columns in the DataFrame: Index(['Timestep', 'No_Moles', 'No_Specs', 'H242C120', 'H189C94', 'H4C2', 'H49C24', 'H92C45', 'H89C45', 'H8C4', 'H44C22', 'H', 'H19C9', 'H73C36', 'H5C3', 'H16C8', 'H67C34', 'H15C7', 'H55C28', 'H5C2', 'H11C5', 'H43C21', 'H13C7', 'H7C3', 'H69C34', 'H10C5', 'H59C29', 'H6C3', 'H38C19', 'H29C15', 'H9C4', 'H3C', 'H6C2', 'H25C13', 'H34C17', 'H26C13', 'H9C5', 'H30C15', 'H11C6', 'H18C9', 'H10C4', 'H7C4', 'H4C3', 'H2C2', 'H15C8', 'H2', 'H14C8', 'H3C2', 'H8C5', 'H4C', 'H2C', 'H6C4', 'H5C4', 'H7C5', 'H6C5', 'H5C'], dtype='object') Column Time does not exist in DataFrame Dropping column: Timestep Dropping column: No_Moles Dropping column: No_Specs Columns after dropping specified columns: Index(['H242C120', 'H189C94', 'H4C2', 'H49C24', 'H92C45', 'H89C45', 'H8C4', 'H44C22', 'H', 'H19C9', 'H73C36', 'H5C3', 'H16C8', 'H67C34', 'H15C7', 'H55C28', 'H5C2', 'H11C5', 'H43C21', 'H13C7', 'H7C3', 'H69C34', 'H10C5', 'H59C29', 'H6C3', 'H38C19', 'H29C15', 'H9C4', 'H3C', 'H6C2', 'H25C13', 'H34C17', 'H26C13', 'H9C5', 'H30C15', 'H11C6', 'H18C9', 'H10C4', 'H7C4', 'H4C3', 'H2C2', 'H15C8', 'H2', 'H14C8', 'H3C2', 'H8C5', 'H4C', 'H2C', 'H6C4', 'H5C4', 'H7C5', 'H6C5', 'H5C'], dtype='object') Principal Component 1: [-0.01360365 -0.01527127 0.09468784 -0.01527127 -0.01926378 -0.01926378 -0.14348788 -0.05876032 0.053678 -0.02499621 -0.04752367 -0.08011324 -0.0558229 -0.02499621 -0.04710345 -0.02726246 -0.11497388 -0.02964005 -0.04150012 -0.08514119 -0.06644979 -0.02890233 -0.12139768 -0.03678431 -0.22408498 -0.07204899 -0.10996609 -0.19883239 0.11574646 -0.06417193 -0.05227374 -0.03768472 -0.10140824 -0.20337462 -0.1147425 -0.11645706 -0.04013328 -0.22498154 -0.072616 0.30427574 0.2691073 -0.05556565 0.32071994 -0.16348593 0.2930887 -0.09559369 0.30186379 0.13926064 0.31351538 0.06627044 0.02215976 0.17463813 0.02869012] Principal Component 2: [-4.54994720e-02 -5.44399225e-02 3.24256036e-01 -5.44399225e-02 -8.05108092e-02 -8.05108092e-02 2.56692841e-01 -2.87389361e-01 -1.16099232e-01 -1.44353286e-01 -2.46550290e-01 1.90659109e-01 -2.79928043e-01 -1.44353286e-01 -9.96689139e-02 -1.24441570e-01 1.64646010e-01 -1.56091436e-01 -2.03327549e-01 -2.41970321e-01 -2.68370885e-02 -1.30733746e-01 -1.51229454e-01 -7.46464272e-02 1.71878552e-01 -1.45829373e-01 -8.63830192e-02 -4.76818695e-02 2.77168873e-01 -1.29829722e-01 -1.08953560e-01 -2.98137197e-02 -6.81109683e-02 5.55139025e-02 -5.67168719e-02 -2.12631362e-02 -3.01305610e-02 1.68727101e-01 1.11645593e-01 -2.03389274e-02 4.32810343e-03 3.02055407e-02 8.84757424e-04 1.56591547e-01 -2.00546033e-02 1.45801072e-01 -3.72144237e-02 1.90582323e-02 -3.21480292e-02 -3.11333977e-04 -8.69317284e-03 -3.27043738e-02 -7.42709255e-03] Chemical Formulas for Selected Substances: Principal Component 1 : ['H242C120', 'H189C94', 'H4C2', 'H49C24', 'H92C45', 'H89C45', 'H8C4', 'H44C22', 'H', 'H19C9', 'H73C36', 'H5C3', 'H16C8', 'H67C34', 'H15C7', 'H55C28', 'H5C2', 'H11C5', 'H43C21', 'H13C7', 'H7C3', 'H69C34', 'H10C5', 'H59C29', 'H6C3', 'H38C19', 'H29C15', 'H9C4', 'H3C', 'H6C2', 'H25C13', 'H34C17', 'H26C13', 'H9C5', 'H30C15', 'H11C6', 'H18C9', 'H10C4', 'H7C4', 'H4C3', 'H2C2', 'H15C8', 'H2', 'H14C8', 'H3C2', 'H8C5', 'H4C', 'H2C', 'H6C4', 'H5C4', 'H7C5', 'H6C5', 'H5C'] Principal Component 2 : ['H242C120', 'H189C94', 'H4C2', 'H49C24', 'H92C45', 'H89C45', 'H8C4', 'H44C22', 'H', 'H19C9', 'H73C36', 'H5C3', 'H16C8', 'H67C34', 'H15C7', 'H55C28', 'H5C2', 'H11C5', 'H43C21', 'H13C7', 'H7C3', 'H69C34', 'H10C5', 'H59C29', 'H6C3', 'H38C19', 'H29C15', 'H9C4', 'H3C', 'H6C2', 'H25C13', 'H34C17', 'H26C13', 'H9C5', 'H30C15', 'H11C6', 'H18C9', 'H10C4', 'H7C4', 'H4C3', 'H2C2', 'H15C8', 'H2', 'H14C8', 'H3C2', 'H8C5', 'H4C', 'H2C', 'H6C4', 'H5C4', 'H7C5', 'H6C5', 'H5C'] Explained Variance Ratio: [0.17819562 0.12522012 0.09090798 0.06825519 0.05077275 0.04599622 0.04262152 0.04199834 0.03961585 0.03526824 0.02835994 0.02737296 0.02371402 0.02214529 0.02105633 0.02000413 0.01865851 0.01794748 0.01469968 0.01314307 0.012208 0.01014002 0.00922751] Cluster 1 top 5 substances: Cluster 2 top 5 substances: Cluster 3 top 5 substances: H49C24 Cluster 4 top 5 substances: H6C2 H25C13 H34C17 H26C13 H9C5 Cluster 5 top 5 substances: H242C120 Cluster 6 top 5 substances: H89C45 H8C4
以上是本实验的完整代码,由此可以得到的图如下 同时代码内会输出一些内容,通过这个我们可以判断PC1和PC2主成分中各物质对应的特征向量,以及聚类分析得到的簇内的物质
代码的输出如下,首先是综合了聚类分析的PCA图像
其中标出了对实验影响最大的五种物质
接下来是聚类分析环节: 其中的红叉是聚类分析产生的簇,我们通过设置使他产生了6个簇.默认情况下,KMeans会随机选择初始聚类中心,这可能导致每次运行时聚类结果不同。本实验给出了随机种子,实验者可以自行修改后进行分析. 代码中可以看到一个簇中的不同物质分别是什么(以距离排序),如果没有代表为空簇.
接下来,我们需要研究图内PC1,PC2的物质浓度比例来帮助我们分析化学反应中物质的变化。这可以通过热加载矩阵来更直观的体现,生成热加载矩阵的代码如下
Looking in indexes: https://pypi.tuna.tsinghua.edu.cn/simple Requirement already satisfied: seaborn in /opt/mamba/lib/python3.10/site-packages (0.13.2) Requirement already satisfied: numpy!=1.24.0,>=1.20 in /opt/mamba/lib/python3.10/site-packages (from seaborn) (1.26.4) Requirement already satisfied: pandas>=1.2 in /opt/mamba/lib/python3.10/site-packages (from seaborn) (2.1.4) Requirement already satisfied: matplotlib!=3.6.1,>=3.4 in /opt/mamba/lib/python3.10/site-packages (from seaborn) (3.9.0) Requirement already satisfied: contourpy>=1.0.1 in /opt/mamba/lib/python3.10/site-packages (from matplotlib!=3.6.1,>=3.4->seaborn) (1.2.1) Requirement already satisfied: cycler>=0.10 in /opt/mamba/lib/python3.10/site-packages (from matplotlib!=3.6.1,>=3.4->seaborn) (0.12.1) Requirement already satisfied: fonttools>=4.22.0 in /opt/mamba/lib/python3.10/site-packages (from matplotlib!=3.6.1,>=3.4->seaborn) (4.53.0) Requirement already satisfied: kiwisolver>=1.3.1 in /opt/mamba/lib/python3.10/site-packages (from matplotlib!=3.6.1,>=3.4->seaborn) (1.4.5) Requirement already satisfied: packaging>=20.0 in /opt/mamba/lib/python3.10/site-packages (from matplotlib!=3.6.1,>=3.4->seaborn) (23.2) Requirement already satisfied: pillow>=8 in /opt/mamba/lib/python3.10/site-packages (from matplotlib!=3.6.1,>=3.4->seaborn) (10.3.0) Requirement already satisfied: pyparsing>=2.3.1 in /opt/mamba/lib/python3.10/site-packages (from matplotlib!=3.6.1,>=3.4->seaborn) (3.1.2) Requirement already satisfied: python-dateutil>=2.7 in /opt/mamba/lib/python3.10/site-packages (from matplotlib!=3.6.1,>=3.4->seaborn) (2.8.2) Requirement already satisfied: pytz>=2020.1 in /opt/mamba/lib/python3.10/site-packages (from pandas>=1.2->seaborn) (2023.3.post1) Requirement already satisfied: tzdata>=2022.1 in /opt/mamba/lib/python3.10/site-packages (from pandas>=1.2->seaborn) (2023.3) Requirement already satisfied: six>=1.5 in /opt/mamba/lib/python3.10/site-packages (from python-dateutil>=2.7->matplotlib!=3.6.1,>=3.4->seaborn) (1.16.0) WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
最终得到的图像如下
想读懂热加载矩阵图,我们需要知道,图中的颜色深度代表特征向量的大小,蓝色代表载荷为负,红色代表载荷为正.我们可以通过如下几个切入点选择其中的物质进行后续分析
- PC1和PC2载荷都较大的物质
- PC1和PC2载荷正负相反的物质
- 仅在PC1或PC2载荷较大的物质
4. Uniform Manifold Approximation and Projection(UMAP)
在PCA分析中,每个主成分都是由原始特征的线性组合构成的,得到的主成分的排列顺序是根据它们所包含的方差大小来确定的,即方差较大的主成分排在前面,方差较小的主成分排在后面。这意味着排在前面的主成分包含了数据中最显著的变化信息,而排在后面的主成分包含的变化信息逐渐减少。所以在主成分PC1,PC2中,越靠前的化学物质代表其包含的变化信息多,在本项目中的具体体现为越靠前的物质在Origin中的时间-浓度图像的变化幅度越大。本研究选取的主要研究方法为PCA降维分析,辅以UMAP作为辅助研究方法,主要对PCA结果产生的特征向量中的化学物质进行分析,同时对存在建在关联的物质进行聚类分析,同时通过UMAP方法的结果与PCA是否符合来检验PCA过程的结果。
UMAP是一种非线性降维技术,旨在将高维数据映射到低维空间,同时保持数据点之间的局部距离关系。与PCA等线性降维方法不同,UMAP能够捕捉到数据中的非线性结构和复杂关系,因此在处理高维数据集时通常具有更好的效果。UMAP的基本思想是通过优化损失函数,将高维数据映射到低维流形空间,以最大程度地保留数据点之间的局部结构和全局结构。UMAP的计算过程包括局部距离计算、高维空间中的随机游走、优化损失函数等步骤。
Looking in indexes: https://pypi.tuna.tsinghua.edu.cn/simple Requirement already satisfied: scikit-learn in /opt/mamba/lib/python3.10/site-packages (1.5.0) Requirement already satisfied: numpy>=1.19.5 in /opt/mamba/lib/python3.10/site-packages (from scikit-learn) (1.26.4) Requirement already satisfied: scipy>=1.6.0 in /opt/mamba/lib/python3.10/site-packages (from scikit-learn) (1.11.4) Requirement already satisfied: joblib>=1.2.0 in /opt/mamba/lib/python3.10/site-packages (from scikit-learn) (1.4.2) Requirement already satisfied: threadpoolctl>=3.1.0 in /opt/mamba/lib/python3.10/site-packages (from scikit-learn) (3.5.0) WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv Looking in indexes: https://pypi.tuna.tsinghua.edu.cn/simple Requirement already satisfied: matplotlib in /opt/mamba/lib/python3.10/site-packages (3.9.0) Requirement already satisfied: umap-learn in /opt/mamba/lib/python3.10/site-packages (0.5.6) Requirement already satisfied: contourpy>=1.0.1 in /opt/mamba/lib/python3.10/site-packages (from matplotlib) (1.2.1) Requirement already satisfied: cycler>=0.10 in /opt/mamba/lib/python3.10/site-packages (from matplotlib) (0.12.1) Requirement already satisfied: fonttools>=4.22.0 in /opt/mamba/lib/python3.10/site-packages (from matplotlib) (4.53.0) Requirement already satisfied: kiwisolver>=1.3.1 in /opt/mamba/lib/python3.10/site-packages (from matplotlib) (1.4.5) Requirement already satisfied: numpy>=1.23 in /opt/mamba/lib/python3.10/site-packages (from matplotlib) (1.26.4) Requirement already satisfied: packaging>=20.0 in /opt/mamba/lib/python3.10/site-packages (from matplotlib) (23.2) Requirement already satisfied: pillow>=8 in /opt/mamba/lib/python3.10/site-packages (from matplotlib) (10.3.0) Requirement already satisfied: pyparsing>=2.3.1 in /opt/mamba/lib/python3.10/site-packages (from matplotlib) (3.1.2) Requirement already satisfied: python-dateutil>=2.7 in /opt/mamba/lib/python3.10/site-packages (from matplotlib) (2.8.2) Requirement already satisfied: scipy>=1.3.1 in /opt/mamba/lib/python3.10/site-packages (from umap-learn) (1.11.4) Requirement already satisfied: scikit-learn>=0.22 in /opt/mamba/lib/python3.10/site-packages (from umap-learn) (1.5.0) Requirement already satisfied: numba>=0.51.2 in /opt/mamba/lib/python3.10/site-packages (from umap-learn) (0.60.0) Requirement already satisfied: pynndescent>=0.5 in /opt/mamba/lib/python3.10/site-packages (from umap-learn) (0.5.12) Requirement already satisfied: tqdm in /opt/mamba/lib/python3.10/site-packages (from umap-learn) (4.64.1) Requirement already satisfied: llvmlite<0.44,>=0.43.0dev0 in /opt/mamba/lib/python3.10/site-packages (from numba>=0.51.2->umap-learn) (0.43.0) Requirement already satisfied: joblib>=0.11 in /opt/mamba/lib/python3.10/site-packages (from pynndescent>=0.5->umap-learn) (1.4.2) Requirement already satisfied: six>=1.5 in /opt/mamba/lib/python3.10/site-packages (from python-dateutil>=2.7->matplotlib) (1.16.0) Requirement already satisfied: threadpoolctl>=3.1.0 in /opt/mamba/lib/python3.10/site-packages (from scikit-learn>=0.22->umap-learn) (3.5.0) WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv Columns in the DataFrame: Index(['Timestep', 'No_Moles', 'No_Specs', 'H242C120', 'H189C94', 'H4C2', 'H49C24', 'H92C45', 'H89C45', 'H8C4', 'H44C22', 'H', 'H19C9', 'H73C36', 'H5C3', 'H16C8', 'H67C34', 'H15C7', 'H55C28', 'H5C2', 'H11C5', 'H43C21', 'H13C7', 'H7C3', 'H69C34', 'H10C5', 'H59C29', 'H6C3', 'H38C19', 'H29C15', 'H9C4', 'H3C', 'H6C2', 'H25C13', 'H34C17', 'H26C13', 'H9C5', 'H30C15', 'H11C6', 'H18C9', 'H10C4', 'H7C4', 'H4C3', 'H2C2', 'H15C8', 'H2', 'H14C8', 'H3C2', 'H8C5', 'H4C', 'H2C', 'H6C4', 'H5C4', 'H7C5', 'H6C5', 'H5C'], dtype='object') Column Time does not exist in DataFrame Dropping column: Timestep Dropping column: No_Moles Dropping column: No_Specs Columns after dropping specified columns: Index(['H242C120', 'H189C94', 'H4C2', 'H49C24', 'H92C45', 'H89C45', 'H8C4', 'H44C22', 'H', 'H19C9', 'H73C36', 'H5C3', 'H16C8', 'H67C34', 'H15C7', 'H55C28', 'H5C2', 'H11C5', 'H43C21', 'H13C7', 'H7C3', 'H69C34', 'H10C5', 'H59C29', 'H6C3', 'H38C19', 'H29C15', 'H9C4', 'H3C', 'H6C2', 'H25C13', 'H34C17', 'H26C13', 'H9C5', 'H30C15', 'H11C6', 'H18C9', 'H10C4', 'H7C4', 'H4C3', 'H2C2', 'H15C8', 'H2', 'H14C8', 'H3C2', 'H8C5', 'H4C', 'H2C', 'H6C4', 'H5C4', 'H7C5', 'H6C5', 'H5C'], dtype='object') /opt/mamba/lib/python3.10/site-packages/umap/umap_.py:1945: UserWarning: n_jobs value 1 overridden to 1 by setting random_state. Use no seed for parallelism. warn(f"n_jobs value {self.n_jobs} overridden to 1 by setting random_state. Use no seed for parallelism.")
![](https://cdn1.deepmd.net/static/img/d7d9741bda38a158-957c-4877-942f-4bf6f81fcc63.png?x-oss-process=image/resize,w_100,m_lfit)
![](https://cdn1.deepmd.net/bohrium/web/static/images/level-v2-1.png?x-oss-process=image/resize,w_50,m_lfit)
![](https://bohrium.oss-cn-zhangjiakou.aliyuncs.com/article/19853/dabc29e7838a482f83f6a5e307b34047/6b071a3e-d2f2-4c40-be53-513156ca9c97.png?x-oss-process=image/resize,w_100,m_lfit)
![](https://cdn1.deepmd.net/bohrium/web/static/images/level-v2-2.png?x-oss-process=image/resize,w_50,m_lfit)