A Brief Introduction to STEM Images | Machine Learning for Materials Image Characterization
Machine Learning
AI4S
STEM
hongyanhui
Published 2023-07-24
Recommended image: Basic Image:bohrium-notebook:2023-03-26
Recommended machine: c12_m46_1 * NVIDIA GPU B
stem-sample(v1)

©️ Copyright 2023 @ Authors
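Author: Hong Yanhui (洪燕辉)
Date: 2023-07-11
License: This work is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Quick start: click the 开始连接 (Start Connection) button above, then select the bohrium-notebook:2023-03-26 image and any machine configuration to begin.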

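Objectives

Gain a basic understanding of STEM images and learn how to generate simulated STEM images with defect labels.

After completing this tutorial, you will be able to:

  • Understand STEM images and how they are formed
  • Read and preprocess STEM images
  • Generate simulated STEM images with defect labels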

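Background

1. Getting to know STEM images

1.1. What is a STEM image?

Materials research routinely relies on experimental characterization. Common techniques include:

  • Scanning electron microscopy (SEM)
  • Transmission electron microscopy (TEM)
  • High-resolution transmission electron microscopy (HRTEM)
  • Scanning transmission electron microscopy (STEM)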

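A scanning electron microscope (SEM) produces an image of a sample's surface by scanning it with a focused electron beam. A transmission electron microscope (TEM) projects an accelerated, focused electron beam onto a very thin sample. The electrons collide with atoms in the sample and change direction, producing scattering over a range of solid angles. The scattering angle depends on the density and thickness of the sample, so the resulting image shows regions of different brightness.

A scanning transmission electron microscope (STEM) scans a focused, high-energy electron beam across a thin sample and records how the intensity of the transmitted electrons varies, which reveals the internal structure of the material. It combines the scanning principle of SEM with the transmission principle of TEM, so it can provide information about both the surface and the internal structure of a sample.

A STEM image is a high-resolution image acquired with a scanning transmission electron microscope. STEM images can show atomic-scale structure and composition distribution, and are widely used in materials science, biology, chemistry, and other fields.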

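1.2. How STEM images are formed

When the scanning electron beam interacts with a thin sample, some of the electrons pass through it. These transmitted electrons can also be used for imaging, and the resulting image is the scanning transmission (STEM) image. The image is formed as follows:

  1. Beam generation: the electron gun of the STEM produces a beam of fast electrons. The gun is usually a field emission gun (FEG) or a thermionic gun (for example, a tungsten filament). Magnetic lenses then focus the beam.

  2. Beam scanning: scan coils deflect the focused beam so that it rasters across the sample line by line. During the scan, the beam interacts with the sample and generates several kinds of signal.

  3. Signal generation and detection: the interaction produces signals such as transmitted electrons, scattered electrons, and X-rays, each of which can be picked up by a suitable detector. For example, a transmission detector collects transmitted electrons and yields information about the internal structure of the sample, while a scattering detector collects scattered electrons and yields information about surface morphology.

  4. Image processing and display: the detected signals are amplified, processed, and digitized, and software converts them into an image. Because STEM captures surface and internal information at the same time, it can produce several types of image, such as transmission images and scattering images.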
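
The scan-and-detect steps above can be illustrated with a toy model. This is a minimal sketch and not a physical simulation: the detector_signal function is hypothetical and simply returns a Gaussian-shaped intensity around each atom, and the atom positions are made up. It only shows that a STEM image is assembled one probe position at a time.

```python
import numpy as np

def detector_signal(x, y, atom_xy, width=0.5):
    """Hypothetical detector response: sum of Gaussians centred on each atom."""
    d2 = (atom_xy[:, 0] - x) ** 2 + (atom_xy[:, 1] - y) ** 2
    return float(np.exp(-d2 / (2 * width ** 2)).sum())

def raster_scan(atom_xy, extent, n_pixels):
    """Move the probe line by line and record one intensity per position."""
    xs = np.linspace(0.0, extent, n_pixels)
    ys = np.linspace(0.0, extent, n_pixels)
    image = np.zeros((n_pixels, n_pixels))
    for row, y in enumerate(ys):          # slow scan direction
        for col, x in enumerate(xs):      # fast scan direction
            image[row, col] = detector_signal(x, y, atom_xy)
    return image

# Two made-up atom positions (arbitrary length units)
atoms_xy = np.array([[2.0, 2.0], [6.0, 5.0]])
toy_image = raster_scan(atoms_xy, extent=8.0, n_pixels=33)
```

The result is an ordinary two-dimensional array: bright where the probe sat on an atom, dark elsewhere.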

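STEM imaging modes include annular bright field (ABF), annular dark field (ADF), and high-angle annular dark field (HAADF). Because each mode collects scattered electrons over a different angular range, images in several modes can be acquired from the same position in a single experiment, each reflecting different information about the material.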

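Annular bright field (ABF)

In STEM, the axial bright-field detector sits at the center of the illumination cone of the transmitted beam and collects electrons deflected by angles θ1 < 10 mrad, mainly transmitted electrons plus some scattered electrons. Bright-field imaging can produce phase-contrast images and lattice images, has higher resolution than dark-field imaging, and is typically used to complement ADF results. ABF contrast is proportional to Z^(1/3), where Z is the atomic number, which makes it more sensitive to light elements.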
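
To get a feel for what a Z^(1/3) dependence means for MoS2, the sketch below compares sulfur and molybdenum. It only evaluates the scaling law quoted above. The function name is illustrative, and no instrument effects are modeled.

```python
def abf_relative_contrast(z):
    """Relative ABF contrast under the Z^(1/3) scaling quoted in the text."""
    return z ** (1.0 / 3.0)

Z_S, Z_MO = 16, 42  # atomic numbers of sulfur and molybdenum

ratio_abf = abf_relative_contrast(Z_MO) / abf_relative_contrast(Z_S)
print(f"Mo/S contrast ratio under Z^(1/3) scaling: {ratio_abf:.2f}")
```

Under this scaling the contrast grows slowly with Z, so light atoms such as S are not swamped by heavier neighbors.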

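Annular dark field (ADF)

An annular dark-field detector covering θ2 (10 mrad < θ2 < 50 mrad) receives mainly Bragg-scattered electrons. Under the same imaging conditions, ADF images are less affected by aberrations than ABF images, so their contrast is better.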

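High-angle annular dark field (HAADF)

In annular dark-field mode, a HAADF detector extends the collection range to θ3 > 50 mrad, where the signal consists mainly of incoherent electrons scattered to high angles. These high-angle electrons arise from scattering of the incident beam by the inner-shell 1s-state electrons of the sample atoms, and involve properties of the nucleus, including its level structure and chemical environment. A HAADF image is an incoherent high-resolution image whose incoherence does not depend on sample thickness. The imaging process suppresses diffraction signals and averages out most interference effects, so the image shows only the total signal intensity collected by the detector. As the probe position changes, HAADF contrast therefore reflects only changes in chemical composition across the sample and does not vary noticeably with sample thickness or microscope focus.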
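
The three angular ranges above can be summarized in a small helper. This is a sketch under one stated assumption: the text uses strict inequalities, so angles of exactly 10 or 50 mrad are assigned here to the higher band. The function name is illustrative.

```python
def stem_mode(theta_mrad):
    """Map a collection angle in mrad to the imaging mode described in the text."""
    if theta_mrad < 0:
        raise ValueError("angle must be non-negative")
    if theta_mrad < 10:
        return "ABF"
    if theta_mrad < 50:
        return "ADF"     # assumption: exactly 10 mrad counts as ADF
    return "HAADF"       # assumption: exactly 50 mrad counts as HAADF

for angle in (5, 30, 80):
    print(angle, "mrad ->", stem_mode(angle))
```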

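2. Machine learning for conventional STEM image analysis

This article uses the detection of sulfur-atom defects in STEM images of molybdenum disulfide (MoS2) as an example of how machine learning can support conventional STEM image analysis.

2.1. Scenario

Molybdenum disulfide (MoS2) is a two-dimensional material with distinctive physical, chemical, and optical properties, which makes it promising for applications such as photovoltaics, optoelectronic devices, sensors, and catalysts. However, sulfur-atom defects in the MoS2 lattice affect its performance and functionality, so detecting them is important.

Because STEM images provide detailed information about a material's lattice structure and defects, they can be used to analyze sulfur-atom defects in MoS2.

Conventional analysis of atomic defects in MoS2 STEM images is a tedious manual process that relies on hand analysis by domain experts. It is slow, low-throughput, and cannot be integrated into automated production workflows.

Machine learning has made significant progress in many fields, including image processing and computer vision. Since a STEM image is essentially a two-dimensional image, machine learning is a natural candidate for analyzing atomic defects in MoS2 STEM images.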

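2.2. Pain points

Machine learning usually needs large amounts of labeled data, but STEM images are very expensive to collect, and their quality depends on the instrument, the environment, and the operator's experience. Labeling is also extremely difficult and time-consuming, depends on the annotator's experience, and is prone to human bias and error. As a result, labeled data is very scarce.

2.3. Solution

STEM image simulation uses a computer to model the STEM imaging process and predict the images that an experiment might produce. Simulation helps researchers understand and interpret experimental data, optimize experimental conditions, and predict the structure and properties of new materials.

We can therefore use STEM image simulation to generate MoS2 STEM images with defect labels and use them as training data for machine learning.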

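Practice

The sections above introduced STEM images and the idea of using machine learning for conventional STEM image analysis. The hands-on part below focuses on data preparation: reading and preprocessing STEM images, and generating labeled simulated images.

1. Reading STEM data

STEM data is stored in several formats, of which tif/tiff is the most common. The tifffile package reads a tif/tiff STEM image into a NumPy array.

Install the tifffile package:
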
[1]
!pip install tifffile

Read a TIFF file with tifffile's imread function:

[2]
import tifffile
import numpy as np
import matplotlib.pyplot as plt

# Read a TIFF file into a NumPy array with tifffile's imread function
def tiff_read(file_path):
    tiff_data = tifffile.imread(file_path)
    return tiff_data

Next, we read a STEM image in TIFF format with tifffile, display it, and print its data type, shape, minimum and maximum values, and a few sample values to get a feel for the data:

[3]
tiff_file_path = '/bohr/stem-sample-dkdg/v1/tiff_mos2_sample.tiff'
tiff_data = tiff_read(tiff_file_path)

# Display the STEM image
plt.imshow(tiff_data)

# Print basic information
print('data type:', tiff_data.dtype)
print('shape:', tiff_data.shape)
print('min and max value:', np.min(tiff_data), np.max(tiff_data))
print('sample value:', tiff_data[0:5,0:5])
data type: float32
shape: (630, 361)
min and max value: -0.9979306 0.58522505
sample value: [[-0.6208849  -0.6104048  -0.7464335  -0.52881366 -0.8202625 ]
 [-0.7649823  -0.72672826 -0.8509661  -0.7813429  -0.82904   ]
 [-0.8495891  -0.6903203  -0.9000491  -0.8359004  -0.88514423]
 [-0.82774454 -0.9027067  -0.8497661  -0.9080221  -0.533194  ]
 [-0.87056684 -0.8587523  -0.9317956  -0.88318145 -0.9316725 ]]

As the output shows, a STEM image in tif/tiff format is essentially a two-dimensional array whose values represent the image contrast.

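Next, we introduce a special format commonly used for STEM images: DM3/DM4.

DM4 (DigitalMicrograph 4) is a file format for storing and processing electron microscope images. It is the format used by the Digital Micrograph software developed by Gatan. A DM4 file contains the raw image data together with associated metadata such as imaging parameters, instrument information, and processing history.

The Python package ncempy can read DM4 files.

Install the ncempy package:
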
[4]
!pip install ncempy
Successfully installed ncempy-1.11.0 numpy-1.22.4

Read a dm4 file with ncempy's dmReader function:

[5]
from ncempy.io import dm

# Read a DM file; the result is a dict holding the image data and its metadata
def ncempy_read_dm(file_path):
    dm_data = dm.dmReader(file_path)
    return dm_data
/opt/conda/lib/python3.8/site-packages/scipy/__init__.py:146: UserWarning: A NumPy version >=1.16.5 and <1.23.0 is required for this version of SciPy (detected version 1.23.5
  warnings.warn(f"A NumPy version >={np_minversion} and <{np_maxversion}"

Let's first look at what the loaded dm4 file contains:

[6]
dm_file_path = '/bohr/stem-sample-dkdg/v1/dm4_mos2_sample.dm4'
dm_data = ncempy_read_dm(dm_file_path)
print(dm_data)
{'filename': 'dm4_mos2_sample.dm4', 'data': array([[ 967.23012576, 1160.32553416, 1254.4977494 , ...,  828.72440134,
        1379.42367174, 1222.40307484],
       [1035.2746772 ,  992.19132126, 1145.64883321, ...,  708.68511017,
        1658.27359284, 1136.90055675],
       [1249.38488477,  950.68401951, 1121.51586711, ..., 1064.30334371,
        1704.27083883, 1395.04531185],
       ...,
       [ 654.58232617,  438.44134418,  646.50415879, ...,  463.79631569,
         799.58801562,  747.4419807 ],
       [ 682.56308941,  233.6421492 ,  277.14343299, ...,  403.48537378,
         857.12368648,  708.42996477],
       [ 847.07290631,  902.94458202,  529.18860425, ...,  850.83904919,
         739.67700392,  740.07716117]]),
 'pixelUnit': ['nm', 'nm'],
 'pixelSize': [0.013118, 0.013118],
 'coords': [array([ 0.4722,  0.4853,  0.4984, ..., 13.3934, 13.4065, 13.4197]),
            array([ 0.2755,  0.2886,  0.3017, ..., 13.3935, 13.4066, 13.4197])]}
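
As the output shows, the loaded dm4 file contains the following fields:

  • data: a two-dimensional array holding the image content, with a data type such as uint32 or float32
  • pixelUnit: the unit of the physical distance represented by a pixel along each axis, e.g. 'nm'
  • pixelSize: the physical distance represented by each pixel along each axis, in units of pixelUnit
  • coords: a list of one-dimensional arrays, one per image axis, giving the physical coordinates of the values in data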
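
The printed coords values look evenly spaced, which suggests they can be rebuilt from a starting coordinate and pixelSize. The sketch below checks this against the first few printed values. It assumes even spacing and takes the origin from the first printed coordinate. The helper name axis_coords is illustrative and is not part of ncempy.

```python
import numpy as np

def axis_coords(origin, pixel_size, n_pixels):
    """Physical coordinate of every pixel along one axis, assuming even spacing."""
    return origin + pixel_size * np.arange(n_pixels)

# Values taken from the printed output above (units: nm)
pixel_size = 0.013118
origin = 0.4722

first_coords = axis_coords(origin, pixel_size, 5)
print(np.round(first_coords, 4))
```

Rounded to four decimals, these five values agree with the start of the first printed coords array (0.4722, 0.4853, 0.4984, 0.5116, 0.5247).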

Next, let's see what the dm4 image looks like:

[7]
dm_image_data = dm_data['data']
plt.imshow(dm_image_data)

2. Normalizing STEM images

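A TIFF STEM image carries only the two-dimensional image data, whereas a DM4 STEM image also contains a lot of metadata. One important parameter is the pixel size: the actual physical size represented by each pixel in the image.

Pixel size is usually given in nanometers (nm) or picometers (pm) and describes the resolution and scale of the image. For example, if a DM4 file has a pixel size of 2 nm, each pixel represents a physical distance of 2 nm.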
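
As a quick worked example, the physical field of view is the number of pixels times the pixel size. The sketch below compares the same 256-pixel-wide image at two pixel sizes. The 0.02 nm value is hypothetical, and 2 nm is the value used in the example above.

```python
def field_of_view(n_pixels, pixel_size):
    """Physical length covered by n_pixels at the given pixel size."""
    return n_pixels * pixel_size

# The same 256-pixel-wide image at two pixel sizes (nm)
fov_fine = field_of_view(256, 0.02)    # hypothetical atomic-resolution setting
fov_coarse = field_of_view(256, 2.0)   # the 2 nm example from the text

print(f"fine: {fov_fine:.2f} nm, coarse: {fov_coarse:.0f} nm")
```

Two images with identical array shapes can therefore cover areas that differ by orders of magnitude.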

The figure below shows STEM images of the same 256*256 size at different pixel sizes:

[Figure: STEM images of size 256*256 at different pixel sizes]

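Pixel size therefore has to be normalized before STEM images are used for machine learning.

In addition, neither the TIFF nor the DM4 format prescribes the data type or value range of the image data, so both vary widely between STEM images. The pixel values also need to be normalized first.

For reasons of space, how to perform this normalization in detail is beyond the scope of this article.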
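
Although a full treatment is out of scope, the two ideas can be sketched in a few lines. This is only one possible approach and not the author's method. It resamples an image to a target pixel size with nearest-neighbor indexing (a simple NumPy-only stand-in for a proper interpolation routine) and then applies min-max scaling. All function names and the synthetic input are assumptions made for illustration.

```python
import numpy as np

def rescale_factor(pixel_size, target_pixel_size):
    """Zoom factor that brings an image to the target pixel size."""
    return pixel_size / target_pixel_size

def resample_nearest(image, factor):
    """Nearest-neighbour resampling with NumPy only (a simple stand-in)."""
    rows = max(1, int(round(image.shape[0] * factor)))
    cols = max(1, int(round(image.shape[1] * factor)))
    row_idx = np.minimum((np.arange(rows) / factor).astype(int), image.shape[0] - 1)
    col_idx = np.minimum((np.arange(cols) / factor).astype(int), image.shape[1] - 1)
    return image[np.ix_(row_idx, col_idx)]

def min_max_normalize(image):
    """Map intensities to [0, 1]; a constant image maps to all zeros."""
    image = image.astype(np.float64)
    span = image.max() - image.min()
    if span == 0:
        return np.zeros_like(image)
    return (image - image.min()) / span

# Synthetic stand-in for a STEM image with an arbitrary value range
rng = np.random.default_rng(0)
raw = rng.uniform(200.0, 1700.0, size=(64, 64))

factor = rescale_factor(pixel_size=0.02, target_pixel_size=0.01)  # nm per pixel
resampled = resample_nearest(raw, factor)
normalized = min_max_normalize(resampled)
```

After this step every image shares the same physical scale per pixel and the same value range, whatever its source format.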

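
3. Generating simulated images

This section uses abTEM as an example to show how to generate simulated MoS2 STEM images labeled with sulfur vacancies.

Install the abtem package:
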
[8]
!pip install abtem
Successfully installed abtem-1.0.0b34 pyfftw-0.13.1
[9]
from abtem import __version__
print('current version:', __version__)

import matplotlib.pyplot as plt
from ase.io import read, write
from ase.build import mx2

from abtem import *
from abtem.structures import orthogonalize_cell
import os
import random

current version: 1.0.0beta34

First, we define a function that visualizes the atomic model, so that we can see how each step changes it:

[10]
# Visualize the atomic model from the top and from two sides
def show_atoms_top_and_side_view(atoms):
    fig, (ax1, ax2, ax3) = plt.subplots(1, 3, figsize=(12,4))

    show_atoms(atoms, ax=ax1, title='Top view')
    show_atoms(atoms, ax=ax2, plane='xz', title='Side view')
    show_atoms(atoms, ax=ax3, plane='yz', title='Side view')

Creating the atomic model

abTEM uses the Atomic Simulation Environment (ASE) to create atomic models.

We use the ASE mx2 function to create the structural model of MoS2.

ase.build.mx2(formula='MoS2', kind='2H', a=3.18, thickness=3.19, size=(1, 1, 1), vacuum=None)

Create three-layer 2D materials with hexagonal structure.
For metal dichalcogenites, etc.
The kind argument accepts ‘2H’, which gives a mirror plane symmetry and ‘1T’, which gives an inversion symmetry.

[11]
atoms = mx2(formula='MoS2', kind='2H', a=3.18, thickness=3.19, size=(1, 1, 1), vacuum=None)
atoms = orthogonalize_cell(atoms)

repetitions = (6, 4, 1)
atoms *= repetitions

# Add 1 Å of vacuum on each side along the z axis
atoms.center(vacuum=1, axis=2)

show_atoms_top_and_side_view(atoms)

# The generated atomic model can be saved with the following code
# os.makedirs('data', exist_ok=True)
# write('data/mos2.cif', atoms)
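
The size of the resulting supercell can be worked out by hand. The sketch below does the arithmetic without ASE. It assumes that the orthogonalized 2H-MoS2 cell contains 2 Mo and 4 S atoms and measures a by a*sqrt(3) in the plane. The variable names are illustrative.

```python
import math

a = 3.18                 # in-plane lattice constant in Å, as passed to mx2
thickness = 3.19         # distance between the two S layers in Å, as passed to mx2
repetitions = (6, 4, 1)
vacuum = 1.0             # Å added on each side along z

# Assumption: the orthogonalized 2H-MoS2 cell holds 2 Mo and 4 S atoms
atoms_per_cell = {"Mo": 2, "S": 4}
n_cells = repetitions[0] * repetitions[1] * repetitions[2]

n_mo = atoms_per_cell["Mo"] * n_cells
n_s = atoms_per_cell["S"] * n_cells

extent_x = a * repetitions[0]                  # Å
extent_y = a * math.sqrt(3) * repetitions[1]   # Å
extent_z = thickness + 2 * vacuum              # Å

print(n_mo, n_s, round(extent_x, 2), round(extent_y, 2), round(extent_z, 2))
```

Under these assumptions the model holds 144 atoms, 96 of them sulfur, in a box of roughly 19.08 Å by 22.03 Å by 5.19 Å.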
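
Random sulfur vacancies

To simulate sulfur vacancies, we randomly delete some sulfur atoms from the generated MoS2 model. The defect_ratio variable sets the upper bound on the fraction of S atoms that are removed.
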
[12]
# Atomic number at each site
atoms_ele_num = atoms.arrays['numbers']
# Coordinates of each atom
atoms_positions = atoms.arrays['positions']

# The atomic number of sulfur is 16
S = 16

# Indices of the S atoms
S_index = np.where(atoms_ele_num == S)[0]

# Coordinates of all S atoms, NumPy array of shape [n, 3]
S_positions = atoms_positions[S_index]

# Upper bound on the fraction of S atoms to remove
defect_ratio = 0.3
S_cnt = len(S_index)

# Randomly choose the S atoms to delete
S_delete_cnt = random.randint(1, int(S_cnt*defect_ratio))
S_delete_index = S_index[random.sample(range(S_cnt), S_delete_cnt)]

print('deleted S index',S_delete_index)

# Delete the selected S atoms
atoms_ele_num_modified = np.delete(atoms_ele_num,S_delete_index)
atoms_positions_modified = np.delete(atoms_positions,S_delete_index,axis=0)

# Build the new model
atoms_modified = atoms.copy()
atoms_modified.arrays['numbers'] = atoms_ele_num_modified
atoms_modified.arrays['positions'] = atoms_positions_modified

# Show the new atomic model
show_atoms_top_and_side_view(atoms_modified)

deleted S index [110  55]
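
When generating a whole training set, it helps if each sample can be reproduced. The sketch below shows one way to make the random choice repeatable with a seeded NumPy generator. It is a variation on the cell above and does not replace it. The function name pick_vacancies and the stand-in index array are hypothetical.

```python
import numpy as np

def pick_vacancies(candidate_index, max_ratio, seed=None):
    """Choose between 1 and max_ratio * N distinct indices to delete."""
    rng = np.random.default_rng(seed)
    max_count = max(1, int(len(candidate_index) * max_ratio))
    count = int(rng.integers(1, max_count + 1))   # upper bound is exclusive
    return rng.choice(candidate_index, size=count, replace=False)

# Stand-in for S_index: 96 hypothetical sulfur indices
demo_S_index = np.arange(48, 144)
demo_deleted = pick_vacancies(demo_S_index, max_ratio=0.3, seed=42)
demo_repeat = pick_vacancies(demo_S_index, max_ratio=0.3, seed=42)
```

Calling the function twice with the same seed returns the same vacancy set, so an image and its label can be regenerated later from the seed alone.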

Generating sulfur-vacancy labels

Next, we generate the labels for the missing S atoms:

[13]
# z coordinates of the deleted S atoms
deleted_z_axis = np.array([atoms_positions[del_idx][2] for del_idx in S_delete_index])

# Mean z coordinate of all S atoms: atoms above it belong to the upper S layer, atoms below it to the lower S layer
mean_z_axis = np.array([z_axis[2] for z_axis in S_positions]).mean()

# Deleted S atoms in the upper / lower layer
upper_indexs = S_delete_index[np.where(deleted_z_axis > mean_z_axis)[0]]
lower_indexs = S_delete_index[np.where(deleted_z_axis < mean_z_axis)[0]]

print('s delete indexs', S_delete_index)
print('delete position', atoms_positions[S_delete_index])
print('upper_indexs',upper_indexs)
print('lower_indexs', lower_indexs)

# Find the sites where both the upper and the lower S atom are missing
double_indexs = []

for upper_index in upper_indexs:
    x0 = atoms_positions[upper_index][0]
    y0 = atoms_positions[upper_index][1]
    for lower_index in lower_indexs:
        x1 = atoms_positions[lower_index][0]
        y1 = atoms_positions[lower_index][1]
        if (abs(x0-x1)<0.1 and abs(y0-y1)<0.1):
            double_indexs.append(upper_index)
            double_indexs.append(lower_index)

# Sites with both S atoms missing
atoms_double = atoms.copy()

ele_double = atoms_ele_num[double_indexs]
positions_double = atoms_positions[double_indexs]
atoms_double.arrays['numbers'] = ele_double
atoms_double.arrays['positions'] = positions_double

# Sites with only the upper S atom missing
atoms_upper = atoms.copy()
upper_indexs = [i for i in upper_indexs if i not in double_indexs]

ele_upper = atoms_ele_num[upper_indexs]
positions_upper = atoms_positions[upper_indexs]
atoms_upper.arrays['numbers'] = ele_upper
atoms_upper.arrays['positions'] = positions_upper

# Sites with only the lower S atom missing
atoms_lower = atoms.copy()
lower_indexs = [i for i in lower_indexs if i not in double_indexs]

ele_lower = atoms_ele_num[lower_indexs]
positions_lower = atoms_positions[lower_indexs]
atoms_lower.arrays['numbers'] = ele_lower
atoms_lower.arrays['positions'] = positions_lower

print('upper_indexs',upper_indexs)
print('lower_indexs', lower_indexs)
print('double_index', double_indexs)

# Plot the positions of the missing S atoms
fig,axes=plt.subplots(figsize=(7,8))
plt.scatter(positions_upper[:, 0], positions_upper[:, 1], marker='o',label='upper S atoms')
plt.scatter(positions_lower[:, 0], positions_lower[:, 1], marker='>',label='lower S atoms')
plt.scatter(positions_double[:, 0],positions_double[:, 1], c='r', marker='*',label='double_S atoms')
axes.set_xlabel('angstrom',fontsize=15)
axes.set_ylabel('angstrom',fontsize=15)
axes.legend()
s delete indexs [110  55]
delete position [[14.31       11.93383006  1.        ]
 [ 7.95        6.4259085   4.19      ]]
upper_indexs [55]
lower_indexs [110]
upper_indexs [55]
lower_indexs [110]
double_index []
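
The atomic models below show, in order, the model with S vacancies, the missing lower-layer S atoms, the missing upper-layer S atoms, and the sites where both S atoms are missing:
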
[14]
show_atoms_top_and_side_view(atoms_modified)
show_atoms_top_and_side_view(atoms_lower)
show_atoms_top_and_side_view(atoms_upper)
show_atoms_top_and_side_view(atoms_double)
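
For training an image model, the vacancy positions usually need to be expressed in pixel coordinates. The sketch below turns the two deleted positions printed above into a binary mask. It rests on several assumptions that the notebook does not state: the image covers the whole supercell, x maps to columns and y to rows, and each vacancy is marked by a small square. The function name vacancy_mask is hypothetical.

```python
import numpy as np

def vacancy_mask(positions_xy, extent_xy, shape, radius_px=1):
    """Binary mask with a small square marked around each vacancy site."""
    mask = np.zeros(shape, dtype=np.uint8)
    for x, y in positions_xy:
        # Assumption: x runs along columns and y along rows of the image
        col = int(round(x / extent_xy[0] * (shape[1] - 1)))
        row = int(round(y / extent_xy[1] * (shape[0] - 1)))
        r0, r1 = max(0, row - radius_px), min(shape[0], row + radius_px + 1)
        c0, c1 = max(0, col - radius_px), min(shape[1], col + radius_px + 1)
        mask[r0:r1, c0:c1] = 1
    return mask

# The two deleted positions printed above (x, y in Å) and the supercell size
deleted_xy = np.array([[14.31, 11.93383006], [7.95, 6.4259085]])
cell_xy = (19.08, 22.0317)   # 6 * 3.18 Å and 4 * 3.18 * sqrt(3) Å
label = vacancy_mask(deleted_xy, cell_xy, shape=(256, 256))
```

Separate masks for upper, lower, and double vacancies could be built the same way from positions_upper, positions_lower, and positions_double.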


Generating the simulated image

Next, we generate the simulated image of the model with S vacancies:

[15]
# simulate temperature effects
sigmas = {'Mo': .1, 'S': .1} # standard deviations of thermal vibrations
num_configs = 30 # number of frozen phonon configurations

frozen_phonons = FrozenPhonons(atoms_modified, num_configs=num_configs, sigmas=sigmas)
potential = Potential(frozen_phonons,
                      gpts=256,
                      slice_thickness=1,
                      parametrization='kirkland',
                      projection='infinite')
potential = potential.build()

probe = SMatrix(energy=80e3,
                semiangle_cutoff=20,
                expansion_cutoff=20,
                rolloff=0.1,
                defocus=40,
                Cs=3e5)

detector = FlexibleAnnularDetector()

##### Define scan region #####

end = (potential.extent[0] / repetitions[0], potential.extent[1] / repetitions[1])

gridscan = GridScan(start=[0, 0], end=end, sampling=probe.ctf.nyquist_sampling * .9)

measurement = probe.scan(gridscan, detector, potential)
OMP: Info #276: omp_set_nested routine deprecated, please use omp_set_max_active_levels instead.
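
Two of the probe settings above have a simple physical reading. The sketch below computes the relativistic electron wavelength at 80 keV and the probe step implied by the usual Nyquist criterion, λ/(4α), for a 20 mrad convergence semi-angle. It is a standalone calculation. Whether abTEM's nyquist_sampling uses exactly this expression is an assumption here.

```python
import math

H = 6.62607015e-34       # Planck constant, J s
M0 = 9.1093837015e-31    # electron rest mass, kg
E = 1.602176634e-19      # elementary charge, C
C = 299792458.0          # speed of light, m/s

def electron_wavelength_angstrom(energy_ev):
    """Relativistic de Broglie wavelength of an electron, in Å."""
    kinetic = energy_ev * E
    momentum = math.sqrt(2 * M0 * kinetic * (1 + kinetic / (2 * M0 * C ** 2)))
    return H / momentum * 1e10

def nyquist_sampling_angstrom(energy_ev, semiangle_mrad):
    """Largest probe step, in Å, under the assumed criterion lambda / (4 * alpha)."""
    return electron_wavelength_angstrom(energy_ev) / (4 * semiangle_mrad * 1e-3)

wavelength_80kv = electron_wavelength_angstrom(80e3)
step_80kv = nyquist_sampling_angstrom(80e3, 20)
print(f"wavelength: {wavelength_80kv:.4f} Å, Nyquist step: {step_80kv:.3f} Å")
```

The wavelength comes out near 0.042 Å and the step near 0.52 Å, so multiplying the sampling by 0.9 in the scan definition scans slightly finer than this limit.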
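
The simulated measurement can be interpolated to different sampling values, which changes the pixel size of the resulting image. The cell below shows the results for two sampling values:
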
[16]
haadf_measurement = measurement.integrate(50, 150)

interp_haadf_measurement_sampling_005 = haadf_measurement.tile((5,5)).interpolate(.05)
interp_haadf_measurement_sampling_01 = haadf_measurement.tile((5,5)).interpolate(.1)

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10,4))

interp_haadf_measurement_sampling_005.show(ax=ax1)
interp_haadf_measurement_sampling_01.show(ax=ax2)

plt.figure()
plt.subplot(1,2,1)
plt.imshow(interp_haadf_measurement_sampling_005.array[0:128,0:128])
plt.subplot(1,2,2)
plt.imshow(interp_haadf_measurement_sampling_01.array[0:128,0:128])
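
The call measurement.integrate(50, 150) sums the detector signal between 50 and 150 mrad, which corresponds to the HAADF range described earlier. The sketch below illustrates the idea on synthetic data. It is a conceptual stand-in and not abTEM's implementation: the radial bin layout and the function name integrate_annulus are assumptions.

```python
import numpy as np

def integrate_annulus(radial_signal, bin_width_mrad, inner_mrad, outer_mrad):
    """Sum the radial bins whose centres fall inside [inner, outer) mrad."""
    centres = (np.arange(radial_signal.shape[-1]) + 0.5) * bin_width_mrad
    selected = (centres >= inner_mrad) & (centres < outer_mrad)
    return radial_signal[..., selected].sum(axis=-1)

# Synthetic scan: 4 x 4 probe positions, 200 radial bins of 1 mrad each
rng = np.random.default_rng(1)
radial = rng.random((4, 4, 200))

haadf_demo = integrate_annulus(radial, 1.0, 50, 150)   # HAADF-like range
adf_demo = integrate_annulus(radial, 1.0, 10, 50)      # ADF-like range
```

Because the full angular distribution is kept for every probe position, images for several detector ranges can be extracted from a single simulated scan.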