A Dimensionality Reduction and Reconstruction Method for Data with Multiple Connected Components
Yuqin Yao, Yang Gao, Zhiguo Long, Hua Meng, Michael Sioutis
DOI: https://doi.org/10.1109/BDAI56143.2022.9862787
2022-01-01
Abstract: In the literature on dimensionality reduction, including Spectral Clustering and Laplacian Eigenmaps, one of the core ideas is to reconstruct data based on similarities between data points, which makes the choice of similarity matrix a key factor in the performance of a dimensionality reduction model. Traditional methods for constructing similarity matrices from data distribution characteristics, such as K-nearest neighbor, ε-neighbor, and Gaussian kernel, have been extensively studied. However, these methods usually consider only one level of the data when measuring the similarity between data points, which can seriously impair data reconstruction when the data have a hierarchical, multi-group structure. Specifically, such methods characterize only the similarity between data points within a group and ignore the similarity between different groups. To overcome this deficiency, this paper proposes a hierarchical construction of the similarity matrix, introducing strong, weak, intra-cluster, and inter-cluster similarities to describe relations across multiple levels. The proposed method adapts better to complex data with multiple connected components, and its effectiveness is verified in a series of experiments on synthetic and real-world datasets.
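To make the limitation described in the abstract concrete, the following is a minimal sketch of the three traditional constructions it names (K-nearest neighbor, ε-neighborhood, Gaussian kernel), applied to a toy dataset with two well-separated groups. This is not the paper's proposed method. The function names, the toy data, and the parameter values (`k=2`, `eps=1.5`, `sigma=1.0`) are assumptions chosen purely for illustration.

```python
import numpy as np

def pairwise_sq_dists(X):
    # Squared Euclidean distance between every pair of rows of X.
    diff = X[:, None, :] - X[None, :, :]
    return np.sum(diff ** 2, axis=-1)

def knn_similarity(X, k):
    # Symmetric 0/1 graph: i ~ j if either point is among the other's k nearest.
    d2 = pairwise_sq_dists(X)
    np.fill_diagonal(d2, np.inf)          # a point is not its own neighbor
    idx = np.argsort(d2, axis=1)[:, :k]   # indices of the k nearest points
    W = np.zeros(d2.shape)
    rows = np.repeat(np.arange(X.shape[0]), k)
    W[rows, idx.ravel()] = 1.0
    return np.maximum(W, W.T)

def epsilon_similarity(X, eps):
    # 0/1 graph: i ~ j if the Euclidean distance is at most eps.
    W = (pairwise_sq_dists(X) <= eps ** 2).astype(float)
    np.fill_diagonal(W, 0.0)
    return W

def gaussian_similarity(X, sigma):
    # Dense weights exp(-||x_i - x_j||^2 / (2 sigma^2)), zero on the diagonal.
    W = np.exp(-pairwise_sq_dists(X) / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    return W

def count_components(W):
    # Number of connected components of the graph with edges where W > 0.
    n = W.shape[0]
    seen = np.zeros(n, dtype=bool)
    count = 0
    for start in range(n):
        if seen[start]:
            continue
        count += 1
        seen[start] = True
        stack = [start]
        while stack:
            u = stack.pop()
            for v in np.nonzero(W[u] > 0)[0]:
                if not seen[v]:
                    seen[v] = True
                    stack.append(v)
    return count

# Toy data (assumed): two unit squares far apart, i.e. two groups.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1],
              [10, 10], [10, 11], [11, 10], [11, 11]], dtype=float)

W_knn = knn_similarity(X, k=2)
W_eps = epsilon_similarity(X, eps=1.5)
W_gauss = gaussian_similarity(X, sigma=1.0)

print("kNN components:", count_components(W_knn))
print("eps components:", count_components(W_eps))
print("largest Gaussian weight between groups:", W_gauss[:4, 4:].max())
```

In this toy setting the K-nearest-neighbor and ε-neighborhood graphs contain no edges between the two groups, and the Gaussian kernel assigns them weights that are numerically negligible, so none of the three matrices carries usable information about how the groups relate to each other.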