A machine learning approach for corrosion small datasets

Totok Sutojo, Supriadi Rustad, Muhamad Akrom, Abdul Syukur, Guruh Fajar Shidik, Hermawan Kresno Dipojono
DOI: https://doi.org/10.1038/s41529-023-00336-7
2023-03-18
npj Materials Degradation
Abstract: In this work, we developed a quantitative structure-activity relationship (QSAR) model using the K-Nearest Neighbor (KNN) algorithm to predict the corrosion inhibition performance of inhibitor compounds. To overcome the small-dataset problem, virtual samples are generated with a Virtual Sample Generation (VSG) method and added to the training set. The generalizability of the proposed KNN + VSG model is verified using six small datasets from the literature and comparing the prediction performance obtained on each. For all six datasets, the proposed model gives the most accurate predictions. Adding virtual samples to the training data helps the algorithm recognize feature-target relationship patterns, and therefore increases the number of quantum chemical parameters correlated with corrosion inhibition efficiency. The proposed method strengthens the prospect of ML for materials design, especially in the case of small datasets.
materials science, multidisciplinary
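
The KNN + VSG idea described in the abstract can be illustrated with a minimal sketch. The abstract does not specify how the virtual samples are generated, so the sketch below substitutes a simple stand-in: linear interpolation between random pairs of real training samples. This is an assumption, not the authors' VSG method. The function names (knn_predict, generate_virtual_samples, rmse), the two descriptors, and the toy target are all hypothetical and chosen only for illustration.

```python
import math
import random


def knn_predict(train_X, train_y, x, k=3):
    """Predict the target of x as the mean target of its k nearest training points."""
    neighbours = sorted(
        (math.dist(row, x), target) for row, target in zip(train_X, train_y)
    )[:k]
    return sum(target for _, target in neighbours) / len(neighbours)


def generate_virtual_samples(X, y, n_virtual, rng):
    """Stand-in for VSG (an assumption): interpolate between random pairs of real samples."""
    virt_X, virt_y = [], []
    for _ in range(n_virtual):
        i, j = rng.sample(range(len(X)), 2)  # two distinct real samples
        t = rng.random()                     # interpolation weight in [0, 1)
        virt_X.append([a + t * (b - a) for a, b in zip(X[i], X[j])])
        virt_y.append(y[i] + t * (y[j] - y[i]))
    return virt_X, virt_y


def rmse(y_true, y_pred):
    """Root-mean-square error between two equally long sequences."""
    return math.sqrt(
        sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true)
    )


rng = random.Random(0)

# Toy data, not from the paper: two made-up descriptors in [0, 1]
# and a made-up inhibition-efficiency target in percent.
real_X = [[rng.random(), rng.random()] for _ in range(12)]
real_y = [60 + 25 * a + 10 * b for a, b in real_X]

train_X, train_y = real_X[:8], real_y[:8]
test_X, test_y = real_X[8:], real_y[8:]

# Augment the small training set with virtual samples.
virt_X, virt_y = generate_virtual_samples(train_X, train_y, n_virtual=40, rng=rng)
aug_X, aug_y = train_X + virt_X, train_y + virt_y

rmse_real = rmse(test_y, [knn_predict(train_X, train_y, x) for x in test_X])
rmse_aug = rmse(test_y, [knn_predict(aug_X, aug_y, x) for x in test_X])
print(f"RMSE, real samples only:      {rmse_real:.2f}")
print(f"RMSE, real + virtual samples: {rmse_aug:.2f}")
```

Comparing the two printed errors shows how augmentation changes KNN predictions on this toy problem. Whether it helps on real corrosion data depends on the dataset and on the actual VSG procedure, which this sketch does not reproduce.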