Data-Quality-Navigated Machine Learning Strategy with Chemical Intuition to Improve Generalization
Songran Yang, Ming Sun, Chaojie Shi, Yiran Liu, Yanzhi Guo, Yijing Liu, Zhiyun Lu, Yan Huang, Xuemei Pu
DOI: https://doi.org/10.1021/acs.jctc.4c00969
2024-11-28
Journal of Chemical Theory and Computation
Abstract: Generalizing to real-world data has been one of the most difficult challenges in applying machine learning (ML) in practice. Most ML work has focused on improving algorithms and feature representations. However, data quality, the foundation of ML, has been largely overlooked, which has also left the field without methods for data evaluation and processing. Motivated by this challenge and need, we selected an important but difficult reorganization energy (RE) prediction task as a...
chemistry, physical; physics, atomic, molecular & chemical