Prediction of Critical Micelle Concentration (cmc) of Surfactants Based on Structural Differentiation Using Machine Learning

Jiaying Chen, Linxi Hou, Jing Nan, Bangqing Ni, Wei Dai, Xin Ge
DOI: https://doi.org/10.1016/j.colsurfa.2024.135276
IF: 5.518
2024-01-01
Colloids and Surfaces A: Physicochemical and Engineering Aspects
Abstract: Measuring the critical micelle concentration (CMC) of surfactants is important for understanding their interfacial properties. However, traditional methods suffer from lengthy testing durations, low experimental accuracy, and complex theoretical calculations. Herein, a method for predicting CMC is developed using machine learning (ML) based on the structural differentiation of surfactants. A quantitative structure-property relationship (QSPR) model that automatically classifies and identifies surfactants based on differences in their head groups was established from a diverse CMC dataset of 779 surfactants. Each surfactant molecule is described quantitatively with molecular descriptors, which are used to train five ML models built on linear regression and tree-based algorithms. Based on an evaluation of model accuracy, the model automatically selects light gradient boosting machine (LGBM) and gradient boosting decision tree (GBDT) as the optimal algorithms for ionic and nonionic surfactants, respectively. The overall prediction accuracy of the model reached R² = 0.944. On the same surfactant data, the model significantly outperforms the graph convolutional neural network (GCN) model in prediction accuracy. In addition, principal component analysis (PCA) highlighted disparities in feature distribution among different types of surfactants, illustrating the model's accuracy and stability based on structural variability and molecular descriptors. This work not only provides valuable insights into the relationship between surfactant molecular structure and CMC but also advances future surfactant design and screening.
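The per-class model selection described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the two candidate models (a mean baseline and a one-variable least-squares line) are simple stand-ins for the five algorithms in the paper, the single descriptor (tail carbon count) is hypothetical, and the data are synthetic values chosen only to make the example run. The sketch shows the idea of scoring each candidate by R² on held-out data and keeping the best one for each surfactant class.

```python
from statistics import mean


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def fit_mean(xs, ys):
    """Baseline stand-in: always predict the training mean."""
    m = mean(ys)
    return lambda x: m


def fit_line(xs, ys):
    """Stand-in regressor: ordinary least squares on one descriptor."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return lambda x: slope * x + intercept


# Hypothetical candidate pool; the paper uses five algorithms instead.
CANDIDATES = {"mean_baseline": fit_mean, "linear": fit_line}


def select_best_per_class(data):
    """Return {class: (best model name, held-out R^2)}."""
    best = {}
    for cls, split in data.items():
        x_test, y_test = split["test"]
        scores = {}
        for name, fit in CANDIDATES.items():
            model = fit(*split["train"])
            scores[name] = r2_score(y_test, [model(x) for x in x_test])
        winner = max(scores, key=scores.get)
        best[cls] = (winner, scores[winner])
    return best


# Synthetic data (assumption): descriptor = tail carbon count,
# target = log10(CMC). Values are illustrative, not measurements.
TOY_DATA = {
    "ionic": {
        "train": ([8, 10, 12, 14], [-0.9, -1.5, -2.1, -2.7]),
        "test": ([9, 13], [-1.2, -2.4]),
    },
    "nonionic": {
        "train": ([8, 10, 12, 14], [-2.0, -3.0, -4.0, -5.0]),
        "test": ([9, 13], [-2.5, -4.5]),
    },
}

if __name__ == "__main__":
    for cls, (name, score) in select_best_per_class(TOY_DATA).items():
        print(f"{cls}: best={name}, R2={score:.3f}")
```

Selecting a model separately for each class lets different head-group types end up with different algorithms, which is the behavior the abstract reports for ionic and nonionic surfactants. In practice the candidate pool would hold real regressors trained on many molecular descriptors, and scores would typically come from cross-validation rather than a single split.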