A method for searching for a globally optimal k-partition of higher-dimensional datasets
Kristian Sabo,Rudolf Scitovski,Šime Ungar,Zoran Tomljanović
DOI: https://doi.org/10.1007/s10898-024-01372-6
2024-02-14
Journal of Global Optimization
Abstract:The problem of finding a globally optimal k -partition of a set is a very intricate optimization problem for which in general, except in the case of one-dimensional data, i.e., for data with one feature ( ), there is no method to solve. Only in the one-dimensional case, there are efficient methods based on the fact that the search for a globally optimal k -partition is equivalent to solving a global optimization problem for a symmetric Lipschitz-continuous function using the global optimization algorithm DIRECT . In the present paper, we propose a method for finding a globally optimal k -partition in the general case ( , ), generalizing an idea for solving the Lipschitz global optimization for symmetric functions. To do this, we propose a method that combines a global optimization algorithm with linear constraints and the k -means algorithm. The first of these two algorithms is used only to find a good initial approximation for the k -means algorithm. The method was tested on a number of artificial datasets and on several examples from the UCI Machine Learning Repository, and an application in spectral clustering for linearly non-separable datasets is also demonstrated. Our proposed method proved to be very efficient.
mathematics, applied,operations research & management science