Nonlinear probabilistic virtual sample generation using Gaussian process latent variable model and fitting for rubber material

Wenlong Chen, Kai Chen
DOI: https://doi.org/10.1016/j.commatsci.2023.112477
IF: 3.572
2023-10-01
Computational Materials Science
Abstract: The development of material informatics has led to an increasingly deep intersection between materials science and machine learning (ML). However, limited data volume significantly hampers the application of ML models in materials science. To address this challenge, we propose VSG-GP, a virtual sample generation algorithm based on the Gaussian process latent variable model (GPLVM) and nonlinear fitting, for rubber materials. The VSG-GP algorithm is versatile, nonlinear, probabilistic, and interpretable. To our knowledge, we are the first to use GPLVM and nonlinear fitting to address sample generation in materials science. Specifically, GPLVM fits the few observed samples to obtain their low-dimensional representation and latent distribution. Furthermore, if the samples contain labels, we use Gaussian process regression (GPR) to fit the labels nonlinearly. We consider the polymerized styrene butadiene rubber (SBR) dataset and employ the algorithm to generate virtual samples for SBR. Our investigation shows that the virtual samples follow the same distribution as the original data. The prediction performance of learning models on the virtual samples indicates the value of the labels generated by GPR. Compared with existing methods, experimental results verify the effectiveness and competitiveness of the proposed algorithm. The accuracies of XGBoost and random forest models on the dataset of original and virtual samples are improved by 39% and 43%, respectively.
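The pipeline described in the abstract (learn a low-dimensional latent representation, sample from the latent distribution, map back to feature space, then fit labels with GPR) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. Several simplifications are assumptions made here for brevity: the latent coordinates come from PCA rather than from an optimized GPLVM, the latent distribution is taken to be a single Gaussian, kernel hyperparameters are fixed instead of learned, and all names (`rbf_kernel`, `gp_posterior_mean`, `generate_virtual_samples`) and the toy data are hypothetical.

```python
import numpy as np


def rbf_kernel(A, B, length_scale=1.0, variance=1.0):
    """Squared-exponential kernel between the rows of A and the rows of B."""
    sq_dist = (
        np.sum(A**2, axis=1)[:, None]
        + np.sum(B**2, axis=1)[None, :]
        - 2.0 * A @ B.T
    )
    return variance * np.exp(-0.5 * np.maximum(sq_dist, 0.0) / length_scale**2)


def gp_posterior_mean(X_train, Y_train, X_new, noise=1e-2, length_scale=1.0):
    """Posterior mean of a zero-mean GP with fixed (not learned) hyperparameters."""
    K = rbf_kernel(X_train, X_train, length_scale) + noise * np.eye(len(X_train))
    K_star = rbf_kernel(X_new, X_train, length_scale)
    return K_star @ np.linalg.solve(K, Y_train)


def generate_virtual_samples(X, y, n_virtual, latent_dim=2, seed=0):
    """Generate virtual (features, labels) pairs from a small labeled dataset."""
    rng = np.random.default_rng(seed)

    # Standardize features so one kernel length scale is reasonable.
    x_mean, x_std = X.mean(axis=0), X.std(axis=0) + 1e-12
    Xs = (X - x_mean) / x_std

    # ASSUMPTION: PCA scores stand in for optimized GPLVM latent coordinates.
    U, S, _ = np.linalg.svd(Xs, full_matrices=False)
    Z = U[:, :latent_dim] * S[:latent_dim]
    Z = Z / (Z.std(axis=0) + 1e-12)

    # ASSUMPTION: the latent distribution is modeled as a single Gaussian.
    z_cov = np.cov(Z, rowvar=False) + 1e-6 * np.eye(latent_dim)
    Z_new = rng.multivariate_normal(Z.mean(axis=0), z_cov, size=n_virtual)

    # Nonlinear GP mapping from latent space back to feature space.
    Xs_new = gp_posterior_mean(Z, Xs, Z_new)
    X_new = Xs_new * x_std + x_mean

    # GPR on the original (features, labels) pairs labels the virtual samples.
    y_mean = y.mean()
    y_new = gp_posterior_mean(Xs, y - y_mean, Xs_new) + y_mean
    return X_new, y_new


# Toy data standing in for a small materials dataset (hypothetical values).
data_rng = np.random.default_rng(42)
X = data_rng.normal(size=(20, 5))
y = np.sin(X[:, 0]) + 0.1 * data_rng.normal(size=20)

X_virtual, y_virtual = generate_virtual_samples(X, y, n_virtual=50)
print(X_virtual.shape, y_virtual.shape)
```

In practice, a library that implements GPLVM with learned hyperparameters would replace the PCA step, and the virtual samples would be checked against the original data distribution before being used to train downstream models.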
materials science, multidisciplinary