KMCT: K-Means Clustering of Trajectories Efficiently in Location-Based Services
Yuanjun Liu, Guanfeng Liu, Qingzhi Ma, Zhixu Li, Shiting Wen, Lei Zhao, An Liu
DOI: https://doi.org/10.1145/3627673.3679848
2024-01-01
Abstract: With the widespread use of GPS devices and the advancement of location-based services, a vast amount of trajectory data has been collected and mined for various applications. Trajectory clustering, which categorizes trajectories into distinct groups, is a fundamental function of trajectory data mining. The challenge is to cluster massive trajectory data efficiently and in a generally applicable way while still producing satisfactory results. Algorithms that cluster raw trajectories are universally applicable, but they face a trade-off between efficiency and clustering quality. Other approaches, such as density-based, road network-based, and deep learning-based algorithms, encounter issues such as high time complexity, loss of trajectory integrity, reliance on road networks, and dependence on the quality of the training data. To tackle these challenges, we first propose KMCT (k-Means Clustering of Trajectories), an efficient algorithm that uses a semantic interpolation transformation to cluster raw trajectories and achieve satisfactory results. Additionally, we introduce DA-KMCT (Density Accelerated k-Means Clustering of Trajectories), which further accelerates the clustering process by exploiting trajectory densities and an optimized centroid selection strategy. Moreover, we present a novel clustering evaluation method called IOD, which efficiently evaluates clustering results on large-scale datasets with linear time complexity. Experimental results on real-world datasets demonstrate that KMCT and DA-KMCT outperform five related methods in terms of clustering quality and time efficiency, and that the proposed IOD evaluation correlates strongly with the Silhouette Coefficient, offering a reliable and efficient alternative for evaluating clustering results.
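To illustrate the general idea of running k-means on raw trajectories, the following is a minimal sketch and not the authors' KMCT implementation. The abstract does not specify the semantic interpolation transformation, so this sketch assumes a plain stand-in: each trajectory is linearly resampled to a fixed number of points evenly spaced by arc length, flattened into a vector, and clustered with standard Lloyd's k-means. The names resample, to_vector, and kmeans, and the parameters m, iters, and seed, are hypothetical.

```python
import math
import random


def resample(traj, m):
    """Linearly interpolate a polyline to m points evenly spaced by arc length.

    Assumed stand-in for the paper's transformation, which the abstract
    does not describe in detail.
    """
    # Cumulative arc length at each original point.
    cum = [0.0]
    for (x0, y0), (x1, y1) in zip(traj, traj[1:]):
        cum.append(cum[-1] + math.hypot(x1 - x0, y1 - y0))
    total = cum[-1]
    if total == 0.0 or m == 1:
        return [traj[0]] * m  # degenerate trajectory: repeat the first point
    out, seg = [], 0
    for i in range(m):
        target = total * i / (m - 1)
        # Advance to the segment that contains the target arc length.
        while seg < len(traj) - 2 and cum[seg + 1] < target:
            seg += 1
        span = cum[seg + 1] - cum[seg]
        t = 0.0 if span == 0.0 else (target - cum[seg]) / span
        (x0, y0), (x1, y1) = traj[seg], traj[seg + 1]
        out.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
    return out


def to_vector(traj, m):
    """Flatten a resampled trajectory into a fixed-length vector of 2*m numbers."""
    return [c for point in resample(traj, m) for c in point]


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, k, iters=50, seed=0):
    """Standard Lloyd's k-means; returns (labels, centroids)."""
    rng = random.Random(seed)
    centroids = [list(v) for v in rng.sample(vectors, k)]
    labels = None
    for _ in range(iters):
        # Assignment step: nearest centroid for every vector.
        new_labels = [
            min(range(k), key=lambda j: sq_dist(v, centroids[j])) for v in vectors
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centroids are too
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for j in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Four toy trajectories with different numbers of points (hypothetical data).
trajectories = [
    [(0, 0), (10, 0)],
    [(0, 1), (3, 1), (10, 1)],
    [(0, 10), (5, 10), (10, 10)],
    [(0, 11), (2, 11), (4, 11), (10, 11)],
]
vectors = [to_vector(t, 5) for t in trajectories]
labels, centroids = kmeans(vectors, k=2)
```

Because every trajectory becomes a vector of the same length, the distance between a trajectory and a centroid costs time linear in m, and the centroid itself is again a sequence of m points that can be read as a representative trajectory. A production variant would likely add a better initialization and handle timestamps, neither of which this sketch attempts.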