Latent Subspace Clustering based on Deep Neural Networks

Yanjun Ma, Chao Shang, Fan Yang, Dexian Huang
2014-01-01
Abstract: Clustering approaches are widely used in the process control community for unsupervised classification, which supports further analysis, modeling, and optimization. Process data generally involve far more dimensions than needed; this phenomenon, known as “data rich but information poor”, hinders reasonable classification. It is therefore desirable to use latent variable models such as principal component analysis (PCA) to reduce the dimension of the data. Traditional clustering models, however, are built directly on the raw data and make no allowance for a latent subspace, which can cause inaccuracy in unsupervised data classification. In recent years, deep neural networks (DNN) have proved effective for developing latent variable models, an approach termed “deep learning”. In this paper, we propose a novel clustering approach that combines a DNN with the traditional K-means method. The DNN describes the latent subspace within the data, and K-means performs clustering in the derived latent subspace. The proposed method has better generalization performance owing to its strong nonlinear representation ability, and it is especially suited to high-dimensional data with significant correlations. The efficacy of the proposed method is demonstrated on two benchmark data sets in comparison with traditional clustering approaches.
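The two-stage idea in the abstract (learn a latent subspace, then cluster inside it) can be illustrated with a minimal sketch. This is not the authors' implementation: a single-hidden-layer tanh autoencoder trained by plain gradient descent stands in for the DNN, and the function names (train_autoencoder, kmeans, latent_kmeans) and all hyperparameters are assumptions chosen for illustration.

```python
import numpy as np


def train_autoencoder(X, n_latent=2, lr=0.01, epochs=300, seed=0):
    """Fit a tiny tanh autoencoder (assumed stand-in for the DNN).

    Returns a function that maps data to its latent codes.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W1 = rng.normal(0.0, 0.1, (d, n_latent))
    b1 = np.zeros(n_latent)
    W2 = rng.normal(0.0, 0.1, (n_latent, d))
    b2 = np.zeros(d)
    for _ in range(epochs):
        Z = np.tanh(X @ W1 + b1)            # encoder: latent codes
        X_hat = Z @ W2 + b2                 # linear decoder
        err = (X_hat - X) / n               # gradient of mean squared error
        dZ = (err @ W2.T) * (1.0 - Z ** 2)  # backpropagate through tanh
        W2 -= lr * (Z.T @ err)
        b2 -= lr * err.sum(axis=0)
        W1 -= lr * (X.T @ dZ)
        b1 -= lr * dZ.sum(axis=0)
    return lambda A: np.tanh(A @ W1 + b1)


def kmeans(Z, k, n_iter=100, seed=0):
    """Plain Lloyd's K-means with random initial centers."""
    rng = np.random.default_rng(seed)
    centers = Z[rng.choice(len(Z), size=k, replace=False)].copy()
    labels = np.full(len(Z), -1)
    for _ in range(n_iter):
        dist = ((Z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        new_labels = dist.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break                           # assignments stopped changing
        labels = new_labels
        for j in range(k):
            members = Z[labels == j]
            if len(members) > 0:            # keep old center if cluster is empty
                centers[j] = members.mean(axis=0)
    return labels, centers


def latent_kmeans(X, k, n_latent=2):
    """Standardize, encode into the latent subspace, then cluster there."""
    X = (X - X.mean(axis=0)) / (X.std(axis=0) + 1e-8)
    encode = train_autoencoder(X, n_latent=n_latent)
    return kmeans(encode(X), k)
```

Calling latent_kmeans(X, k=2) on an array X of shape (samples, features) returns the cluster labels and the cluster centers in latent coordinates. Replacing the encoder with a deeper network, or with PCA as a linear baseline, changes only the first stage; the clustering step is unchanged.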