High-Throughput Computational Screening of Metal-Organic Frameworks for CH4/H2 Separation by Synergizing Machine Learning and Molecular Simulation
Wang Shihui, Xue Xiaoyu, Cheng Min, Chen Shaochen, Liu Chong, Zhou Li, Bi Kexin, Ji Xu
DOI: https://doi.org/10.6023/a22010031
2022-01-01
Acta Chimica Sinica
Abstract: In this work, a hierarchical screening strategy that combines machine learning (ML) and molecular simulation is proposed to identify the optimal adsorbents for CH4/H2 separation from 134185 hypothetical metal-organic frameworks (MOFs). In the initial screening, MOFs with inappropriate pore size and/or volumetric surface area were removed from the database, leaving 62278 MOFs. Of these, 10% were randomly chosen, and grand canonical Monte Carlo (GCMC) simulations were performed to calculate the adsorption behavior of the CH4/H2 mixture in them under vacuum swing adsorption (VSA) and pressure swing adsorption (PSA) conditions. The structural and chemical descriptors of the selected MOFs and their corresponding adsorbent performance scores (APS) were then used to develop random forest (RF) models for the VSA and PSA processes. Compared with other ML algorithms, including support vector machine, k-nearest neighbor, decision tree, and artificial neural network, the RF model exhibits the best predictive power. Combining structural and chemical descriptors and applying the preliminary screening step both improve the accuracy of the RF model. The model was therefore used to predict the APS values of the remaining 90% of the MOFs in the next stage of screening, and the top 1000 candidates were selected according to the predictions. GCMC simulations were subsequently carried out on these top candidates to refine the predictions, yielding the ten MOFs with the best CH4/H2 separation performance under VSA and PSA conditions, respectively. The high performance of the optimal MOFs was verified by comparison with well-studied MOFs from the literature. Finally, the feature importance of the descriptors was interpreted using Shapley Additive Explanations. The results indicate that the developed model can potentially be transferred between the two operating conditions because the dominant descriptors are consistent, which also provides an efficient route for rapidly screening promising MOF adsorbents for CH4/H2 separation under different operating scenarios.
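The screening funnel described in the abstract (geometric pre-screen, simulation on a random 10% subset, surrogate prediction for the rest, re-simulation of the top candidates) can be illustrated with a minimal Python sketch. Everything below is an assumption made for illustration: the descriptor names, the threshold values, the toy `expensive_score` function standing in for GCMC-derived APS, and the k-nearest-neighbor average used as a simple stand-in surrogate in place of the paper's random forest model.

```python
import random

def prescreen(mofs, min_pore, min_area):
    """Stage 1: drop structures that fail the geometric thresholds (hypothetical values)."""
    return [m for m in mofs if m["pore"] >= min_pore and m["area"] >= min_area]

def expensive_score(m):
    """Toy placeholder for a simulation-derived performance score; not a real APS."""
    return m["area"] / (1.0 + abs(m["pore"] - 6.0))

def knn_predict(train, query, k=5):
    """Stand-in surrogate (not a random forest): mean score of the k nearest
    training structures in raw descriptor space. Real use would scale descriptors."""
    dists = sorted(
        ((m["pore"] - query["pore"]) ** 2 + (m["area"] - query["area"]) ** 2, m["score"])
        for m in train
    )
    nearest = dists[:k]
    return sum(score for _, score in nearest) / len(nearest)

def hierarchical_screen(mofs, min_pore, min_area,
                        train_frac=0.1, top_n=10, final_n=3, seed=0):
    rng = random.Random(seed)
    kept = prescreen(mofs, min_pore, min_area)
    rng.shuffle(kept)
    n_train = max(1, int(len(kept) * train_frac))
    # Stage 2: run the expensive calculation only on a random subset.
    train = [dict(m, score=expensive_score(m)) for m in kept[:n_train]]
    rest = kept[n_train:]
    # Stage 3: rank the remaining structures with the cheap surrogate.
    ranked = sorted(rest, key=lambda m: knn_predict(train, m), reverse=True)
    shortlist = ranked[:top_n]
    # Stage 4: refine the shortlist with the expensive calculation.
    return sorted(shortlist, key=expensive_score, reverse=True)[:final_n]

# Synthetic structures with two made-up descriptors.
data_rng = random.Random(42)
mofs = [
    {"id": i, "pore": data_rng.uniform(2.0, 20.0), "area": data_rng.uniform(0.0, 3000.0)}
    for i in range(2000)
]
best = hierarchical_screen(mofs, min_pore=3.8, min_area=500.0)
print([m["id"] for m in best])
```

The point of the layout is that the expensive calculation runs only on the training subset and the final shortlist, while the cheap surrogate ranks everything in between.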