Combination of explainable machine learning and conceptual density functional theory: applications for the study of key solvation mechanisms

I-Ting Ho, Milena Matysik, Liliana Montano Herrera, Jiyoung Yang, Ralph Joachim Guderlei, Michael Laussegger, Bernhard Schrantz, Regine Hammer, Ramón Alain Miranda-Quintana, Jens Smiatek
DOI: https://doi.org/10.1039/d2cp04428e
2022-11-30
Abstract: We present explainable machine learning approaches for the accurate prediction and understanding of solvation free energies, enthalpies, and entropies for different salts in various protic and aprotic solvents. As key input features, we use fundamental contributions from the conceptual density functional theory (DFT) of solutions. The models with the highest prediction accuracy on the experimental validation data set are decision tree-based approaches such as extreme gradient boosting and extra trees, which highlight the non-linear influence of feature values on target predictions. A detailed assessment of feature importance, based on Gini importance criteria, Shapley Additive Explanations (SHAP), and permutation and reduction approaches, underlines the prominent role of anion and cation solvation effects in combination with fundamental electronic properties of the solvents. These results are reasonably consistent with previous assumptions and provide a solid rationale for more recent theoretical approaches.
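To make the permutation approach to feature importance mentioned in the abstract concrete, the following is a minimal, dependency-free sketch and not the authors' implementation. It grows a small ensemble of extremely randomized regression trees (a simplified stand-in for the extra trees models named above) on purely synthetic data and then measures how much the held-out error rises when each feature column is shuffled. The feature names, the target function, and all hyperparameters are hypothetical choices made only for illustration; none of them come from the paper.

```python
import random

# Hypothetical feature names; the data below are synthetic, not from the paper.
FEATURES = ["cation_term", "anion_term", "noise"]

def mean(values):
    return sum(values) / len(values)

def sse(values):
    """Sum of squared deviations from the mean."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values)

def build_tree(X, y, rng, min_leaf=5, depth=0, max_depth=8):
    """Grow one randomized tree: one random threshold per feature,
    keep the candidate split with the lowest squared error."""
    if depth >= max_depth or len(y) < 2 * min_leaf or max(y) == min(y):
        return mean(y)  # leaf node stores the mean target
    best = None
    for j in range(len(X[0])):
        col = [row[j] for row in X]
        lo, hi = min(col), max(col)
        if lo == hi:
            continue
        t = rng.uniform(lo, hi)  # random threshold, as in extra trees
        left = [i for i, v in enumerate(col) if v <= t]
        right = [i for i, v in enumerate(col) if v > t]
        if len(left) < min_leaf or len(right) < min_leaf:
            continue
        score = sse([y[i] for i in left]) + sse([y[i] for i in right])
        if best is None or score < best[0]:
            best = (score, j, t, left, right)
    if best is None:
        return mean(y)
    _, j, t, left, right = best
    children = [
        build_tree([X[i] for i in idx], [y[i] for i in idx],
                   rng, min_leaf, depth + 1, max_depth)
        for idx in (left, right)
    ]
    return (j, t, children[0], children[1])

def predict(forest, x):
    """Average the leaf values reached in every tree."""
    total = 0.0
    for node in forest:
        while isinstance(node, tuple):
            j, t, low_child, high_child = node
            node = low_child if x[j] <= t else high_child
        total += node
    return total / len(forest)

def mse(forest, X, y):
    return mean([(predict(forest, x) - t) ** 2 for x, t in zip(X, y)])

def permutation_importance(forest, X, y, rng, repeats=5):
    """Mean increase in MSE after shuffling one feature column at a time."""
    base = mse(forest, X, y)
    result = []
    for j in range(len(X[0])):
        increases = []
        for _ in range(repeats):
            col = [row[j] for row in X]
            rng.shuffle(col)
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, col)]
            increases.append(mse(forest, X_perm, y) - base)
        result.append(mean(increases))
    return result

rng = random.Random(0)

def make_data(n):
    """Synthetic non-linear target; the third feature has no influence."""
    X = [[rng.uniform(-1, 1) for _ in FEATURES] for _ in range(n)]
    y = [3 * a * a + 2 * b + rng.gauss(0, 0.1) for a, b, _ in X]
    return X, y

X_train, y_train = make_data(200)
X_test, y_test = make_data(100)
forest = [build_tree(X_train, y_train, rng) for _ in range(30)]
importances = permutation_importance(forest, X_test, y_test, rng)

for name, value in zip(FEATURES, importances):
    print(f"{name}: {value:.3f}")
```

Because the score is computed on held-out data, a feature the target does not depend on should receive an importance close to zero, while the two informative features should stand out. Gini-type importances and SHAP values rank features by different criteria and would normally be obtained from an established library rather than from a hand-written ensemble like this one.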