Scalable Multiple Kernel K-Means Clustering.

Yihang Lu, Haonan Xin, Rong Wang, Feiping Nie, Xuelong Li
DOI: https://doi.org/10.1145/3511808.3557690
2022-01-01
Abstract: With its simplicity and effectiveness, k-means is immensely popular, but it performs poorly on complex nonlinear datasets. Multiple kernel k-means (MKKM) can describe highly complex, nonlinearly separable data structures. However, its computational cost scales poorly once the data size grows beyond tens of thousands of samples. The current explosion of digital data calls for more scalable clustering methods that support machine learning tasks in an easily accessible form. To address this issue, we propose employing the Nyström scheme for MKKM clustering, termed scalable multiple kernel k-means clustering. It significantly reduces the computational complexity by replacing the original kernel matrix with a low-rank approximation. Analytically and empirically, we demonstrate that our method performs as well as existing state-of-the-art methods at a significantly lower computational cost, which allows the method to scale more effectively to larger clustering tasks.
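The low-rank idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm: it assumes RBF base kernels, fixed (not learned) kernel weights, uniformly sampled landmark points, and a plain Lloyd-style k-means on the resulting features. The names `rbf_kernel`, `nystrom_features`, and `kmeans` are hypothetical helpers introduced only for this illustration.

```python
import numpy as np

def rbf_kernel(X, Y, gamma):
    """RBF kernel matrix between rows of X and rows of Y."""
    sq = (X ** 2).sum(1)[:, None] + (Y ** 2).sum(1)[None, :] - 2 * X @ Y.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def nystrom_features(X, landmarks, gammas, weights):
    """Return F (n x r) such that F @ F.T approximates the combined kernel.

    Assumption: the combined kernel is a fixed weighted sum of RBF kernels;
    a full MKKM method would also optimize the weights.
    """
    # C: kernel between all points and the m landmarks (n x m)
    C = sum(w * rbf_kernel(X, landmarks, g) for w, g in zip(weights, gammas))
    # W: kernel among the landmarks only (m x m)
    W = sum(w * rbf_kernel(landmarks, landmarks, g) for w, g in zip(weights, gammas))
    # Pseudo-inverse square root of W via its eigendecomposition
    vals, vecs = np.linalg.eigh(W)
    keep = vals > 1e-10
    W_inv_sqrt = vecs[:, keep] / np.sqrt(vals[keep])
    # K is approximated by C W^+ C^T = F F^T, without ever forming the n x n matrix
    return C @ W_inv_sqrt

def kmeans(F, k, n_iter=50, seed=0):
    """Plain Lloyd iterations on the rows of F."""
    rng = np.random.default_rng(seed)
    centers = F[rng.choice(len(F), size=k, replace=False)]
    for _ in range(n_iter):
        d = ((F[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        labels = d.argmin(1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = F[labels == j].mean(0)
    return labels
```

With n samples and m landmarks (m much smaller than n), the sketch only builds an n x m and an m x m kernel block, so clustering runs on an n x r feature matrix instead of the full n x n kernel matrix. Calling `kmeans(nystrom_features(X, landmarks, gammas, weights), k)` then yields cluster labels under the assumptions stated above.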