A Novel Similarity Metric with Application to Big Process Data Analytics

Zijian Guo, Chao Shang, Hao Ye
DOI: https://doi.org/10.1016/j.conengprac.2021.104843
IF: 4.057
2021-01-01
Control Engineering Practice
Abstract: Establishing a quantitative similarity between different datasets has gained prevalence and significance in many applications of process control. In industrial practice, process data are usually multi-dimensional, nonlinearly correlated, and subject to unknown time-varying distributions, which poses an immense challenge for reasonably evaluating similarity. To address this issue, a novel similarity metric based on a deep autoencoder (DAE) and the Wasserstein distance is proposed in this paper. Specifically, the DAE is first used to capture the nonlinear relationships embedded in multivariate process data, and the reconstruction error acts as an indicator of the discrepancy between two datasets. The similarity is then characterized by evaluating the gap between the reconstruction error distributions using the Wasserstein distance. The proposed similarity metric is widely applicable to data analytics tasks including pattern matching, fault diagnosis, and mode classification. Both simulated data and industrial data collected from a real iron-making process are used in comprehensive case studies. The results show that the proposed similarity metric not only offers better rationality and sensitivity than generic similarity metrics, but also effectively improves the accuracy of fault diagnosis and mode classification based on big process data.
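The two-step idea in the abstract (reconstruction error as an indicator, then a Wasserstein distance between error distributions) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a linear autoencoder fitted by SVD (equivalent to PCA) as a simple stand-in for the deep autoencoder, assumes the model is fitted on a reference dataset only, and assumes equal sample sizes so that the 1-D Wasserstein-1 distance reduces to the mean absolute difference of sorted samples. All function names and the synthetic data are hypothetical.

```python
import numpy as np


def fit_linear_autoencoder(X, n_latent):
    """Fit a linear autoencoder via SVD (a PCA stand-in, NOT the paper's DAE)."""
    mean = X.mean(axis=0)
    # Rows of vt are principal directions; the first n_latent rows act as
    # tied encoder/decoder weights.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_latent]


def reconstruction_errors(X, model):
    """Squared reconstruction error of each sample under the fitted model."""
    mean, components = model
    centered = X - mean
    reconstructed = centered @ components.T @ components
    return np.sum((centered - reconstructed) ** 2, axis=1)


def wasserstein_1d(a, b):
    """Wasserstein-1 distance between two equal-size 1-D empirical samples."""
    a, b = np.sort(a), np.sort(b)
    if a.shape != b.shape:
        raise ValueError("this sketch assumes equal sample sizes")
    return float(np.mean(np.abs(a - b)))


def dissimilarity(reference, candidate, n_latent=1):
    """Gap between reconstruction-error distributions (smaller = more similar)."""
    model = fit_linear_autoencoder(reference, n_latent)
    return wasserstein_1d(
        reconstruction_errors(reference, model),
        reconstruction_errors(candidate, model),
    )


# Hypothetical synthetic data: three variables driven by one latent factor.
rng = np.random.default_rng(0)


def sample(direction, n=500):
    latent = rng.normal(size=(n, 1))
    return latent @ np.array([direction]) + 0.05 * rng.normal(size=(n, 3))


reference = sample([1.0, 2.0, -1.0])
same_mode = sample([1.0, 2.0, -1.0])      # same correlation structure
shifted_mode = sample([1.0, -2.0, 1.0])   # changed correlation structure

d_same = dissimilarity(reference, same_mode)
d_shifted = dissimilarity(reference, shifted_mode)
```

In this toy setting, `d_shifted` comes out much larger than `d_same`, because data with a changed correlation structure are reconstructed poorly by a model fitted on the reference data. Replacing the linear model with a nonlinear deep autoencoder would be needed to capture the nonlinear relationships the abstract describes.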