A Novel Scalable Semi-supervised GMM and Its Application for Multimode Process Quality Prediction with Big Data

Le Yao, Zhiqiang Ge, Weiming Shao, Zhihuan Song
DOI: https://doi.org/10.1109/ddcls.2018.8516121
2018-01-01
Abstract: In this paper, a novel variational inference semi-supervised GMM (VI-S²GMM) is first proposed for predictive modeling of multimode processes with semi-supervised data. Because every labeled and unlabeled sample is involved in each iteration of parameter updating, the computation becomes intractable for high-dimensional, large-scale datasets. To tackle this problem, a scalable stochastic variational inference semi-supervised GMM (SVI-S²GMM) is further proposed for massive semi-supervised data. By using stochastic gradient optimization to maximize the evidence lower bound (ELBO), the VI-based algorithm becomes scalable. In SVI-S²GMM, only one sample or a mini-batch of samples is randomly selected to update the parameters in each iteration, which is more efficient than VI-S²GMM. In this way, a large amount of unlabeled process data can be used in modeling, which benefits parameter identification. The SVI-S²GMM is then applied to the prediction of quality-related key performance indices (KPIs). Two modeling cases with large-scale semi-supervised datasets demonstrate the feasibility and effectiveness of the proposed algorithms.
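The mini-batch idea described in the abstract can be illustrated with a minimal sketch. This is not the paper's SVI-S²GMM: it is a simplified, unsupervised, one-dimensional stand-in that uses stepwise (stochastic) EM on point estimates rather than variational posteriors, and it omits the labeled-data and quality-prediction parts entirely. The function names, the quantile initialization, and the step-size schedule rho_t = (t + tau)^(-kappa) are assumptions chosen for illustration. It shows only the shared mechanism: each iteration touches a small random batch and blends its statistics into running estimates with a decaying step size.

```python
import numpy as np


def step_size(t, tau=1.0, kappa=0.7):
    """Decaying Robbins-Monro step size (assumed schedule, kappa in (0.5, 1])."""
    return (t + tau) ** (-kappa)


def responsibilities(x, weights, means, variances):
    """Posterior component probabilities for 1-D samples x, shape (B, K)."""
    log_p = (
        np.log(weights)
        - 0.5 * np.log(2.0 * np.pi * variances)
        - 0.5 * (x[:, None] - means) ** 2 / variances
    )
    log_p -= log_p.max(axis=1, keepdims=True)  # stabilize before exponentiating
    r = np.exp(log_p)
    return r / r.sum(axis=1, keepdims=True)


def fit_minibatch_gmm(x, k=2, batch_size=64, n_iters=500, seed=0):
    """Fit a 1-D GMM using one random mini-batch per iteration (illustrative)."""
    rng = np.random.default_rng(seed)
    n = x.shape[0]

    # Hypothetical initialization: spread the means over data quantiles.
    means = np.quantile(x, (np.arange(k) + 0.5) / k)
    variances = np.full(k, x.var())
    weights = np.full(k, 1.0 / k)

    # Running per-sample averages of the sufficient statistics.
    s0 = weights.copy()
    s1 = weights * means
    s2 = weights * (variances + means ** 2)

    for t in range(n_iters):
        xb = x[rng.integers(0, n, size=batch_size)]  # random mini-batch
        r = responsibilities(xb, weights, means, variances)

        # Mini-batch estimates of the same statistics.
        b0 = r.mean(axis=0)
        b1 = (r * xb[:, None]).mean(axis=0)
        b2 = (r * xb[:, None] ** 2).mean(axis=0)

        # Blend the noisy batch estimate into the running estimate.
        rho = step_size(t)
        s0 = (1.0 - rho) * s0 + rho * b0
        s1 = (1.0 - rho) * s1 + rho * b1
        s2 = (1.0 - rho) * s2 + rho * b2

        # Recover parameters from the running statistics.
        s0 = np.maximum(s0, 1e-12)
        weights = s0 / s0.sum()
        means = s1 / s0
        variances = np.maximum(s2 / s0 - means ** 2, 1e-6)

    return weights, means, variances
```

The cost of each iteration depends on `batch_size` rather than on the total number of samples, which is the property the abstract attributes to the stochastic variant.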