Accurate Property Prediction with Interpretable Machine Learning Model for Small Datasets Via Transformed Atom Vector

Xinyu Chen, Shuaihua Lu, Xinyang Wan, Qian Chen, Qionghua Zhou, Jinlan Wang
DOI: https://doi.org/10.1103/physrevmaterials.6.123803
IF: 3.98
2022-01-01
Physical Review Materials
Abstract: Machine learning techniques can greatly accelerate material discovery, but high-dimensional representations often cause overfitting and lead to poor model performance. Building a structure-property relationship with a low-dimensional representation remains an open challenge, especially for diverse structures within small datasets. To address this issue, a low-dimensional representation named the transformed atom vector (TAV) is proposed, which is a crystal-graph-based descriptor. As an example, we apply it to two-dimensional materials and predict the band gap at the Heyd-Scuseria-Ernzerhof level with only 500 samples and acceptable accuracy. Moreover, the TAV representation retains interpretability, based on which a property-oriented search method through element substitution is developed. This work provides a universal low-dimensional representation containing rich material information, as well as an intuitive interpretation approach for material design, which improves the feasibility and interpretability of machine learning models for small datasets and helps realize accurate yet meaningful property prediction at a lower cost.