Abstract: To cluster data that are not linearly separable in the original feature space, $k$-means clustering was extended to a kernel version. However, the performance of kernel $k$-means clustering depends largely on the choice of kernel function. To mitigate this problem, multiple kernel learning has been introduced into $k$-means clustering to obtain an optimal kernel combination for clustering.
Despite the success of multiple kernel $k$-means clustering in various scenarios, few existing methods update the combination coefficients based on the diversity of the kernels. As a result, the selected kernels contain high redundancy, which degrades clustering performance and efficiency. We address this problem from the perspective of subset selection in this article. In particular, we first propose an effective strategy to select a diverse subset of the prespecified kernels as representative kernels, and then incorporate the subset selection process into the framework of multiple kernel $k$-means clustering. The representative kernels are indicated by significant combination weights. Because the resulting objective function is nonconvex, we develop an alternating minimization method that optimizes the combination coefficients of the selected kernels and the cluster memberships alternately. Moreover, an efficient optimization method is developed to reduce the time complexity of optimizing the kernel combination weights.
Finally, extensive experiments on benchmark and real-world data sets demonstrate the effectiveness and superiority of our approach in comparison with existing methods.
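
The alternating scheme mentioned in the abstract can be illustrated with a minimal sketch of the plain multiple kernel k-means baseline. This is not the representative-kernel method of the paper, whose objective is not given in the abstract. The sketch assumes the commonly used relaxed formulation in which the combined kernel is K = sum_p w_p^2 K_p, the relaxed cluster indicator H is taken as the top-k eigenvectors of K, and the weights have the closed form w_p proportional to 1 / tr(K_p (I - H H^T)). The function names `rbf_kernel` and `mkkm`, the RBF bandwidths, and the toy data are all hypothetical choices made for illustration.

```python
import numpy as np

def rbf_kernel(X, gamma):
    """Gaussian (RBF) kernel matrix for the rows of X (illustrative helper)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.exp(-gamma * np.maximum(d2, 0.0))

def mkkm(kernels, n_clusters, n_iter=20, eps=1e-12):
    """Baseline multiple kernel k-means by alternating minimization.

    Assumed relaxed objective: min over H, w of
    sum_p w_p^2 * tr(K_p (I - H H^T)), with H^T H = I, w >= 0, sum(w) = 1.
    """
    kernels = np.asarray(kernels, dtype=float)   # shape (m, n, n)
    m = kernels.shape[0]
    w = np.full(m, 1.0 / m)                      # start from uniform weights
    for _ in range(n_iter):
        # Step 1: fix w, update the relaxed cluster indicator H.
        K = np.tensordot(w ** 2, kernels, axes=1)
        _, vecs = np.linalg.eigh(K)              # eigenvalues in ascending order
        H = vecs[:, -n_clusters:]                # top-k eigenvectors
        # Step 2: fix H, update w in closed form.
        a = np.array([np.trace(Kp) - np.trace(H.T @ Kp @ H) for Kp in kernels])
        inv = 1.0 / np.maximum(a, eps)
        w = inv / inv.sum()
    return w, H

# Toy usage: two Gaussian blobs and three RBF kernels of different bandwidths.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.3, (20, 2)), rng.normal(3.0, 0.3, (20, 2))])
kernels = [rbf_kernel(X, g) for g in (0.1, 1.0, 10.0)]
w, H = mkkm(kernels, n_clusters=2)
```

Discrete cluster labels are usually recovered by running ordinary k-means on the rows of H. Note that nothing in this baseline penalizes redundancy among kernels, which is the gap the abstract says the representative-kernel selection is meant to close.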

CPI-model-based analysis of sparse k-means clustering algorithms

A Novel Kernel Possibilistic Fuzzy C-Means Clustering Algorithm For Large Scale Data Sets

Accelerating spherical K-means clustering for large-scale sparse document data

Speeding Up K-Means Clustering in High Dimensions by Pruning Unnecessary Distance Computations

Subspace Clustering by Directly Solving Discriminative K-means

On Simplifying Large-Scale Spatial Vectors: Fast, Memory-Efficient, and Cost-Predictable k-means

K-Means Clustering with KNN and Mean Imputation on CPU Benchmark Compilation Data

Comparative Analysis of Optimization Strategies for K-means Clustering in Big Data Contexts: A Review

Faster K-Means Cluster Estimation

Enabling Highly Efficient K-Means Computations on the SW26010 Many-Core Processor of Sunway TaihuLight

A robust and sparse K-means clustering algorithm

On the Efficiency of K-Means Clustering: Evaluation, Optimization, and Algorithm Selection

Performance analysis of Kmeans with modified initial centroid selection algorithms and developed Kmeans9+ model

K*-Means: An Efficient Clustering Algorithm with Adaptive Decision Boundaries

Computing $k$-means in mixed precision

Bilateral k-Means Algorithm for Fast Co-Clustering

K-MACE and Kernel K-MACE Clustering

r-Reference points based k-means algorithm

Relax and Merge: A Simple Yet Effective Framework for Solving Fair $k$-Means and $k$-sparse Wasserstein Barycenter Problems

Multiple Kernel k-Means Clustering by Selecting Representative Kernels

A Reconfigurable 64-Dimension K-Means Clustering Accelerator with Adaptive Overflow Control