Abstract: Clustering is an important part of machine learning and has received sustained attention. Most current clustering methods partition data objects using the Euclidean metric, which is simple and effective. However, as data become higher-dimensional and their representations more diverse, the spatial structure of real-world data grows increasingly complex. Classical clustering methods then face several challenges: insufficient clustering effectiveness, sensitivity to parameters, and unstable results. To address these problems, this paper designs a non-Euclidean metric and builds a multi-granularity, staged clustering method on it. First, the paper uses the order relationship within each feature of the data to construct a similarity measure between objects from the perspective of positive and negative granularity, which improves the clustering algorithm's ability to handle data with complex spatial structure. Second, the paper designs an attenuation-diffusion pattern that applies a divide-and-conquer strategy according to the distribution characteristics of data objects in different patterns, and uses a heuristic to cluster the data in stages, from local to global. Third, building on these two components, the paper proposes a clustering method based on multi-positive-negative granularity and the attenuation-diffusion pattern, which copes effectively with the challenges that complex spatial structure poses for clustering methods. Finally, the effectiveness and robustness of the proposed method are compared with those of advanced clustering methods on real-world UCI data sets. The experimental results show that the proposed method has a clear advantage in clustering results on data with complex spatial structure. In addition, in the two directions of non-Euclidean metrics and multi-granularity clustering, the proposed method offers a new perspective on designing clustering methods for such data.
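The abstract does not define the proposed metric, so the following is only a minimal sketch of the general idea behind its first step: comparing objects by where their values fall in the ordering of each feature rather than by raw Euclidean distance. The function names `rank_transform` and `rank_dissimilarity` are hypothetical, and the mean absolute rank difference used here is an assumed stand-in, not the paper's positive-negative granularity measure.

```python
import numpy as np

def rank_transform(X):
    """Replace each value by its 0-based rank within its feature (column).

    Assumption: ties are broken by row position; the paper may treat them differently.
    """
    X = np.asarray(X, dtype=float)
    # Applying argsort twice turns values into their ranks along each column.
    order = np.argsort(X, axis=0, kind="stable")
    return np.argsort(order, axis=0, kind="stable")

def rank_dissimilarity(X):
    """Pairwise mean absolute rank difference over features, scaled to [0, 1].

    This is an illustrative order-based (non-Euclidean) dissimilarity,
    not the metric proposed in the paper.
    """
    R = rank_transform(X).astype(float)
    n = R.shape[0]
    if n < 2:
        return np.zeros((n, n))
    diff = np.abs(R[:, None, :] - R[None, :, :])  # shape (n, n, d)
    return diff.mean(axis=2) / (n - 1)

# Three objects; the third is an extreme outlier in the first feature.
X = [[1.0, 100.0],
     [2.0, 200.0],
     [1000.0, 300.0]]
D = rank_dissimilarity(X)
```

Because only the ordering matters, the outlier in the first feature sits one rank step from its neighbor, just as the first two objects do, whereas a Euclidean distance would be dominated by the raw value 1000.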

Research on a Text Data Preprocessing Method Suitable for Clustering Algorithm

Research on Psychology Data Clustering Algorithm Based on CUDA

A Combined Data Preprocessing Method Based on K-means Clustering and Singular Spectrum Analysis

Comparison of Spectral Clustering, K-clustering and Hierarchical Clustering on E-Nose Datasets: Application to the Recognition of Material Freshness, Adulteration Levels and Pretreatment Approaches for Tomato Juices

A Linguistic Feature Based Text Clustering Method

Solutions to General Clustering Algorithmic Issues

A Statistical Information-Based Clustering Approach in Distance Space

A Text Clustering Algorithm to Detect Basic Level Categories in Texts

A New Text Clustering Method Using Hidden Markov Model

A Novel Text Clustering Algorithm Based on Inner Product Space Model of Semantic

Enhancing Web Text Clustering Accuracy and Efficiency With a Maximum Entropy Function Model: Overcoming High-Dimensional and Directional Challenges

Algorithm and Experiment Research of Textual Document Clustering Based on Improved K-means

A Semantic approach for effective document clustering using WordNet

An end-to-end Neural Network Framework for Text Clustering

An Evaluation on Feature Selection for Text Clustering

Text clustering based on pre-trained models and autoencoders

Research on K-Value Selection Method of K-Means Clustering Algorithm

Enhancing Multi-Layer Perceptron Performance with K-Means Clustering

A Clustering Method Based on Multi-Positive–negative Granularity and Attenuation-Diffusion Pattern

A Clustering Algorithm for Multi-Modal Heterogeneous Big Data With Abnormal Data

TCUAP: A Novel Approach of Text Clustering Using Asymmetric Proximity