Abstract: This paper addresses the clustering of data in the hyperdimensional computing (HDC) domain. Prior work proposed an HDC-based clustering framework, referred to as HDCluster. However, the performance of HDCluster is not robust: it degrades because the cluster hypervectors are chosen at random during the initialization step. To overcome this bottleneck, we assign the initial cluster hypervectors by exploring the similarity of the encoded data, referred to as \textit{query} hypervectors. Intra-cluster hypervectors have a higher similarity than inter-cluster hypervectors. Harnessing the similarity results among query hypervectors, this paper proposes four HDC-based clustering algorithms: similarity-based k-means, equal bin-width histogram, equal bin-height histogram, and similarity-based affinity propagation. Experimental results illustrate that: (i) Compared to the existing HDCluster, the proposed HDC-based clustering algorithms achieve better accuracy, more robust performance, fewer iterations, and less execution time. Similarity-based affinity propagation outperforms the other three HDC-based clustering algorithms on eight datasets by 2% to 38% in clustering accuracy. (ii) Even for one-pass clustering, i.e., without any iterative update of the cluster hypervectors, the proposed algorithms provide more robust clustering accuracy than HDCluster. (iii) Five of the eight datasets achieve higher or comparable accuracy when projected onto the hyperdimensional space. Traditional clustering is more desirable than HDC when the number of clusters, $k$, is large.
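The idea of seeding cluster hypervectors from query similarity, rather than at random, can be illustrated with a minimal sketch. It is not one of the four algorithms named in the abstract: it substitutes a simple farthest-first heuristic to show how pairwise similarity alone can separate the initial cluster hypervectors. The bipolar encoding, the synthetic noisy-prototype data, and every function name (`random_hv`, `flip_noise`, `similarity`, `similarity_seeded_init`, `one_pass_assign`) are assumptions made for illustration.

```python
import random

def random_hv(dim, rng):
    """Random bipolar hypervector with components in {-1, +1} (assumed encoding)."""
    return [rng.choice((-1, 1)) for _ in range(dim)]

def flip_noise(hv, rate, rng):
    """Copy of hv with each component sign-flipped with probability `rate`."""
    return [-x if rng.random() < rate else x for x in hv]

def similarity(a, b):
    """Dot product divided by dimension; equals cosine similarity for bipolar vectors."""
    return sum(x * y for x, y in zip(a, b)) / len(a)

def similarity_seeded_init(queries, k):
    """Pick k query indices as initial cluster hypervectors (farthest-first stand-in)."""
    n = len(queries)
    sim = [[similarity(queries[i], queries[j]) for j in range(n)] for i in range(n)]
    # First seed: the query most similar, on average, to all queries.
    seeds = [max(range(n), key=lambda i: sum(sim[i]))]
    while len(seeds) < k:
        # Next seed: the query whose best match among the current seeds is weakest.
        candidates = [i for i in range(n) if i not in seeds]
        seeds.append(min(candidates, key=lambda i: max(sim[i][s] for s in seeds)))
    return seeds

def one_pass_assign(queries, cluster_hvs):
    """Assign each query to its most similar cluster hypervector, with no iterative update."""
    return [
        max(range(len(cluster_hvs)), key=lambda c: similarity(q, cluster_hvs[c]))
        for q in queries
    ]

# Synthetic stand-in for encoded data: 3 prototypes, 10 noisy copies of each.
rng = random.Random(0)
prototypes = [random_hv(1000, rng) for _ in range(3)]
true_labels = [c for c in range(3) for _ in range(10)]
queries = [flip_noise(prototypes[c], 0.1, rng) for c in true_labels]

seeds = similarity_seeded_init(queries, k=3)
labels = one_pass_assign(queries, [queries[s] for s in seeds])
```

On this toy data, queries derived from the same prototype have a similarity near 0.64, while queries from different prototypes have a similarity near 0, so each seed is drawn from a different group. This gap is the intra-cluster versus inter-cluster contrast that the abstract relies on.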

Principal component analysis based clustering for high-dimension, low-sample-size data

A Report on Multilinear PCA Plus Multilinear LDA to Deal with Tensorial Data: Visual Classification As an Example

Hierarchical disjoint principal component analysis

Enhanced Locality Sensitive Clustering in High Dimensional Space

Principal component analysis and clustering on manifolds

Improved Algorithms for High-Dimensional Robust PCA

Robust PCA for High Dimensional Data based on Characteristic Transformation

Semi-supervised Hierarchical Clustering Analysis for High Dimensional Data

K-means clustering via principal component analysis

Limitations of Clustering with PCA and Correlated Noise

Clustering of high-dimensional observations

Degree-heterogeneous Latent Class Analysis for High-dimensional Discrete Data

Robust Principal Component Analysis via Discriminant Sample Weight Learning

Diagonally-Dominant Principal Component Analysis

Dynamic Principal Component Analysis in High Dimensions

Principal Ellipsoid Analysis (PEA): Efficient non-linear dimension reduction & clustering

Robust Classification of High-Dimensional Data using Data-Adaptive Energy Distance

A Covariance-Free Iterative Principal Component Analysis for High Dimensional and Large Scale Data

Dynamic Principal Subspaces in High Dimensions

Sparse principal component analysis via regularized low rank matrix approximation

Robust Clustering using Hyperdimensional Computing