Abstract: This paper addresses the clustering of data in the hyperdimensional computing (HDC) domain. Prior work proposed an HDC-based clustering framework, referred to as HDCluster. However, the performance of HDCluster is not robust: it degrades because the cluster hypervectors are chosen at random during the initialization step. To overcome this bottleneck, we assign the initial cluster hypervectors by exploring the similarity of the encoded data, referred to as \textit{query} hypervectors. Intra-cluster hypervectors have a higher similarity than inter-cluster hypervectors. Harnessing the similarity results among query hypervectors, this paper proposes four HDC-based clustering algorithms: similarity-based k-means, equal bin-width histogram, equal bin-height histogram, and similarity-based affinity propagation. Experimental results illustrate that: (i) Compared to the existing HDCluster, the proposed HDC-based clustering algorithms achieve better accuracy, more robust performance, fewer iterations, and less execution time. Similarity-based affinity propagation outperforms the other three HDC-based clustering algorithms on eight datasets by 2% to 38% in clustering accuracy. (ii) Even for one-pass clustering, i.e., without any iterative update of the cluster hypervectors, the proposed algorithms provide more robust clustering accuracy than HDCluster. (iii) On five of the eight datasets, projecting the data onto the hyperdimensional space yields higher or comparable accuracy. Traditional clustering is more desirable than HDC when the number of clusters, $k$, is large.
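The idea of seeding cluster hypervectors from query similarity, rather than at random, can be illustrated with a small sketch. The code below is not any of the four algorithms named in the abstract. It assumes bipolar (+1/-1) hypervectors, cosine similarity, a farthest-first seeding heuristic (each new seed is the query least similar to the seeds already chosen), and bundling by element-wise summation. All function names and the toy data are hypothetical.

```python
import random


def random_hypervector(dim, rng):
    """Draw a random bipolar (+1/-1) hypervector."""
    return [rng.choice((-1, 1)) for _ in range(dim)]


def noisy_copy(hv, flip_fraction, rng):
    """Flip each component with probability flip_fraction (toy stand-in for encoded data)."""
    return [-x if rng.random() < flip_fraction else x for x in hv]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sum(x * x for x in a) ** 0.5
    norm_b = sum(y * y for y in b) ** 0.5
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def pick_initial_centers(queries, k):
    """Assumed heuristic: seed with query 0, then repeatedly add the query
    whose highest similarity to the already chosen seeds is lowest."""
    chosen = [0]
    while len(chosen) < k:
        candidates = [i for i in range(len(queries)) if i not in chosen]
        nxt = min(
            candidates,
            key=lambda i: max(cosine(queries[i], queries[c]) for c in chosen),
        )
        chosen.append(nxt)
    return [list(queries[i]) for i in chosen]


def assign(queries, centers):
    """Label each query with the index of its most similar cluster hypervector."""
    return [
        max(range(len(centers)), key=lambda c: cosine(q, centers[c]))
        for q in queries
    ]


def update_centers(queries, labels, centers):
    """Bundle (element-wise sum) the members of each cluster; keep empty clusters unchanged."""
    updated = []
    for c, old in enumerate(centers):
        members = [q for q, lab in zip(queries, labels) if lab == c]
        updated.append([sum(col) for col in zip(*members)] if members else old)
    return updated


def cluster(queries, k, iterations=2):
    """Similarity-seeded clustering; iterations=0 gives a one-pass assignment."""
    centers = pick_initial_centers(queries, k)
    labels = assign(queries, centers)
    for _ in range(iterations):
        centers = update_centers(queries, labels, centers)
        labels = assign(queries, centers)
    return labels


def demo():
    """Toy data: 3 random prototypes, 10 noisy copies of each, in generation order."""
    rng = random.Random(0)
    prototypes = [random_hypervector(2000, rng) for _ in range(3)]
    queries = [noisy_copy(p, 0.1, rng) for p in prototypes for _ in range(10)]
    return cluster(queries, 3)


if __name__ == "__main__":
    print(demo())
```

In this toy setting, queries derived from the same prototype are far more similar to each other than to queries from other prototypes, so the seeds tend to land in different groups and each group should end up with a single label. Calling `cluster(queries, k, iterations=0)` corresponds to a one-pass assignment with no update of the cluster hypervectors.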

Hypercube-Based High-Dimensional Index Using Co-clustering

Accelerating Exact Nearest Neighbor Search in High Dimensional Euclidean Space Via Block Vectors

A High Dimensional Index Based on Relative Distance Hashing Method

Composite Distance Transformation for Indexing and K-Nearest-neighbor Searching in High-Dimensional Spaces

Enhanced Locality Sensitive Clustering in High Dimensional Space

Indexing High-Dimensional Data in Dual Distance Spaces

A Clustered Dwarf Structure to Speed Up Queries on Data Cubes

Novel High-Dimensional Indexing Structure Based on Dual-Distance Metric

An Adaptive And Efficient Dimensionality Reduction Algorithm For High-Dimensional Indexing

Ipoc: A Polar Coordinate Based Indexing Method For Nearest Neighbor Search In High Dimensional Space

Indexing high-dimensional data in dual distance spaces: a symmetrical encoding approach

PHC: A Rapid Parallel Hierarchical Cubing Algorithm on High Dimensional OLAP

Efficient index-based KNN join processing for high-dimensional data

High-dimensional hierarchical OLAP: A prefix-index hierarchical cubing approach

A Similarity Indexing Algorithm Based on Each Dimension Clustering

LDC: Enabling Search By Partial Distance In A Hyper-Dimensional Space

Robust Clustering using Hyperdimensional Computing

SRS: solving c-approximate nearest neighbor queries in high dimensional euclidean space with a tiny index

Preserving-Ignoring Transformation Based Index for Approximate k Nearest Neighbor Search

Efficient Approximate Algorithms for the Closest Pair Problem in High Dimensional Spaces