DRSCDM: A Novel Density-Related Clustering for Complex High-Dimensional Data Streams

Dongdong Li, Yihan Fan, Zhe Wang
DOI: https://doi.org/10.1109/tcsvt.2024.3435383
IF: 5.859
2024-01-01
IEEE Transactions on Circuits and Systems for Video Technology
Abstract: The proliferation of high-dimensional complex data in various fields such as multimedia, social media, and sensor networks has led to an increasing demand for real-time clustering algorithms. This article presents a novel two-stage approach for complex data streams. In the online stage, angular margins are introduced to constrain the mapping of input data, enhancing the directional characteristics of the resulting data representation. In the offline stage, we propose a unique clustering approach grounded in angular density to uncover spatial relationships within the data. This approach utilizes two distinct strategies for angular density clustering. Neighbor Selection based on Angular Relations defines the angular density, which significantly enhances the algorithm’s discriminative ability. The Density-Priority Cluster Selection strategy determines the generation of clusters, ensuring the reliability of clustering. We also introduce a novel data expiration mechanism that optimizes computational costs and memory usage by discarding data objects from stable clusters. Experimental evaluations on four diverse datasets, including speaker diarization and video face clustering tasks, demonstrate the superior performance of our proposed method over state-of-the-art online clustering techniques. Furthermore, our method achieves comparable performance to offline clustering methods, highlighting its effectiveness and efficiency in real-time clustering applications. The source code for the proposed algorithms is accessible at https://github.com/sssssuda/DRSCDM.
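To make the idea of angular density concrete, here is a minimal illustrative sketch in Python. It is not the authors' implementation (see the repository linked in the abstract for that). It assumes, purely for illustration, that the angular density of a point is the number of other points within a fixed angle of it, and that clusters are formed greedily by visiting points in order of decreasing density. The function names and the `max_angle` parameter are hypothetical and do not come from the paper.

```python
import math


def angular_distance(u, v):
    """Angle in radians between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    # Clamp to [-1, 1] to guard against floating-point drift before acos.
    cos = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.acos(cos)


def angular_density(points, max_angle):
    """Assumed definition: count of other points within max_angle radians."""
    return [
        sum(
            1
            for j, q in enumerate(points)
            if j != i and angular_distance(p, q) <= max_angle
        )
        for i, p in enumerate(points)
    ]


def density_priority_clusters(points, max_angle):
    """Greedy sketch: the densest unlabeled point seeds the next cluster
    and absorbs every unlabeled point within max_angle of it."""
    density = angular_density(points, max_angle)
    # sorted() is stable, so ties keep their original index order.
    order = sorted(range(len(points)), key=lambda i: -density[i])
    labels = [-1] * len(points)
    next_label = 0
    for i in order:
        if labels[i] != -1:
            continue
        labels[i] = next_label
        for j, q in enumerate(points):
            if labels[j] == -1 and angular_distance(points[i], q) <= max_angle:
                labels[j] = next_label
        next_label += 1
    return labels


if __name__ == "__main__":
    # Two directional groups: three vectors near the x-axis, two near the y-axis.
    pts = [(1.0, 0.0), (0.99, 0.1), (0.98, -0.1), (0.0, 1.0), (0.1, 0.99)]
    print(angular_density(pts, 0.3))
    print(density_priority_clusters(pts, 0.3))
```

The sketch compares directions only, so vector magnitudes are ignored. That matches the abstract's emphasis on directional characteristics, but the thresholding and greedy assignment here are simplifications chosen for brevity.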
What problem does this paper attempt to address?