Abstract: Kernel clustering can capture the inherent nonlinear structure of data. However, its high computational complexity and the unknown representation of the kernel space make it impractical for data clustering in distributed peer-to-peer (P2P) networks. To address this issue, we propose a new series of random feature-based collaborative kernel clustering algorithms in this article. In the most basic algorithm, each node in a distributed P2P network first maps its data into a low-dimensional random feature space that approximates the given kernel, using the random Fourier feature mapping method. Then, each node independently searches for clusters using its local data and the collaborative knowledge from its neighbor nodes, and the distributed clustering is performed among all network nodes until a global consensus is reached, i.e., all nodes have the same cluster centers. In addition, an improved version assigns feature weights, which are optimized by the maximum-entropy technique to extract the features that are important for cluster identification. Furthermore, to relieve the impact of different kernel functions and their parameters on the clustering results, a combination of multiple kernels rather than a single kernel is adopted for the low-dimensional approximation, and optimized weights are assigned to guide the choice of kernels and their parameters while discovering significant features at the same time. Experiments on synthetic and real-world datasets show that the proposed methods achieve results similar to, and in some cases better than, those of traditional kernel clustering methods on various performance metrics, including the average classification rate, the average normalized mutual information, and the average adjusted Rand index. More importantly, the low-dimensional random features that approximate the kernels and the distributed clustering mechanism adopted in these methods greatly lower the time complexity.
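
The random Fourier feature mapping mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a Gaussian (RBF) kernel of the form exp(-gamma * ||x - y||^2), and the function name `random_fourier_features`, the parameter names, and the toy data are all hypothetical choices made for illustration. The sketch only covers the mapping step on a single node; the collaborative clustering, feature weighting, and multiple-kernel combination are not shown.

```python
import numpy as np

def random_fourier_features(X, n_features, gamma, rng):
    """Map X of shape (n_samples, d) into a random feature space whose
    inner products approximate the RBF kernel exp(-gamma * ||x - y||^2).
    (Assumed kernel form; the abstract does not fix a specific kernel.)"""
    d = X.shape[1]
    # Frequencies sampled from the kernel's spectral density, N(0, 2*gamma*I).
    W = rng.normal(scale=np.sqrt(2.0 * gamma), size=(d, n_features))
    # Random phase offsets, uniform on [0, 2*pi).
    b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
    return np.sqrt(2.0 / n_features) * np.cos(X @ W + b)

# Toy data standing in for one node's local dataset (hypothetical).
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
gamma = 0.5

Z = random_fourier_features(X, n_features=2000, gamma=gamma, rng=rng)

# Compare the exact kernel matrix with its random-feature approximation.
sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
K_exact = np.exp(-gamma * sq_dists)
K_approx = Z @ Z.T
max_abs_error = np.abs(K_exact - K_approx).max()
print(f"max absolute kernel approximation error: {max_abs_error:.3f}")
```

Once the data are mapped this way, an ordinary linear clustering step on `Z` behaves approximately like kernel clustering on `X`, without ever forming the full kernel matrix. Here "low-dimensional" is relative to the implicit kernel space; the number of random features trades approximation accuracy against cost.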

Frequent term based peer-to-peer text clustering

Keyword Extraction Based Peer Clustering

Modeling and Performance Analysis of Unstructured P2P Network

Transfer Collaborative Fuzzy Clustering in Distributed Peer-to-Peer Networks

Random Feature-Based Collaborative Kernel Fuzzy Clustering for Distributed Peer-to-Peer Networks

A distributed approach to node clustering in decentralized peer-to-peer networks

Clustering Text Data Streams

An Effective Approach Based On Rough Set And Topic Cluster To Build Peer Communities

Text clustering based on term weights automatic partition

An Efficient Architecture for Information Retrieval in P2P Context Using Hypergraph

Distributed Information Theoretic Clustering

P2P Data Dissemination for Real-Time Streaming Using Load-Balanced Clustering Infrastructure in MANETs With Large-Scale Stable Hosts

Dynamic Clustering-Based Query Answering in Peer-to-Peer Systems

Efficient Information Retrieval in Mobile Peer-to-peer Networks

Constraint-Driven Type-2 Fuzzy C-Means Clustering and Step-wise Gossip for Fusion Transmission in Distributed Networks

Concept chain based text clustering

A Peer-to-Peer DHT Algorithm Based on Small-World Network

An Efficient Clustering Algorithm for Small Text Documents

Searching Techniques in P2P Instant Messaging System.

Popularity Based Network Statistical Analysis in Peer-to-peer Application

Towards Effective Clustered Federated Learning: A Peer-to-peer Framework with Adaptive Neighbor Matching