A unified framework for hard and soft clustering with regularized optimal transport

Jean-Frédéric Diebold, Nicolas Papadakis, Arnaud Dessein, Charles-Alban Deledalle
2024-03-08
Abstract: In this paper, we formulate the problem of inferring a Finite Mixture Model from discrete data as an optimal transport problem with entropic regularization of parameter $\lambda\geq 0$. Our method unifies hard and soft clustering, the Expectation-Maximization (EM) algorithm being exactly recovered for $\lambda=1$. The family of clustering algorithms we propose relies on the resolution of nonconvex problems using alternating minimization. We study the convergence properties of our generalized $\lambda$-EM algorithms and show that each step in the minimization process has a closed-form solution when inferring finite mixture models of exponential families. Experiments highlight the benefits of taking a parameter $\lambda>1$ to improve the inference performance and $\lambda\to 0$ for classification.
Machine Learning
What problem does this paper attempt to address?
This paper aims to unify hard and soft clustering within one framework and to improve parameter estimation for the Finite Mixture Model (FMM) through Regularized Optimal Transport (ROT). Specifically, the authors propose a ROT-based method that addresses the following points:

1. **Unifying hard clustering and soft clustering**: Traditionally, hard clustering (such as k-means) and soft clustering (such as the Expectation-Maximization (EM) algorithm) apply to different scenarios. The proposed method handles both within a single framework by adjusting the regularization parameter \(\lambda\).
2. **Improving clustering performance**: The entropy regularization term adds flexibility to the parameter estimation process, with the goal of improving clustering performance, especially when the data distribution is complex or noisy.
3. **Simplifying computational complexity**: The traditional optimal transport problem is computationally expensive on large-scale datasets. The proposed method uses an alternating minimization strategy in which each optimization step has a closed-form solution, which reduces the computational cost.
4. **Extension to exponential family models**: The authors apply the framework to the wider class of exponential family models, not just Gaussian mixture models, which demonstrates its generality.

### Specific problem description

- **Unification of hard clustering and soft clustering**: The behavior is controlled by the regularization parameter \(\lambda\). When \(\lambda = 0\), the algorithm degenerates into hard clustering (such as k-means); when \(\lambda = 1\), it is equivalent to the EM algorithm. Taking \(\lambda>1\) can further improve inference performance, while \(\lambda \to 0\) is better suited to classification tasks.
- **Formulation of the optimal transport problem**: The FMM inference problem is recast as an optimal transport problem with entropy regularization, that is, as the minimization of the following objective function:
  \[ d(p_\upsilon, q_{\omega,\eta})=\inf_{\pi \in \Pi(\upsilon, \omega)}\langle\gamma, \pi\rangle+\lambda H(\pi) \]
  where \(\Pi(\upsilon, \omega)\) is the transport polytope, \(\gamma\) is the cost matrix, and \(H(\pi)\) is the entropy regularization term.
- **Alternating optimization strategy**: A local optimum is approached by alternately updating the transport plan \(\pi\), the weights \(\omega\), and the parameters \(\eta\).

### Experimental verification

The authors evaluate the method through a series of experiments, including an analysis of inference performance on one-dimensional and two-dimensional Gaussian mixture models and of the influence of the regularization parameter \(\lambda\) on the inference results. The experimental results show that an appropriate value of \(\lambda\) can significantly improve the accuracy and robustness of inference.

In conclusion, this paper uses the ROT framework to provide a unified and efficient clustering method that suits multiple application scenarios and can flexibly handle different types of data distributions.
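
The alternating strategy described above can be illustrated with a minimal Python sketch for a one-dimensional Gaussian mixture. It rests on assumptions that this summary does not confirm: the cost is taken to be \(\gamma_{ij} = -\log(\omega_j\, p(x_i \mid \eta_j))\), the data weights \(\upsilon\) are uniform, and \(H\) is read as the negative entropy \(\sum_{ij}\pi_{ij}\log\pi_{ij}\), so that the plan update becomes a row-wise softmax at temperature \(\lambda\). Under those assumptions \(\lambda = 1\) gives the usual EM responsibilities and \(\lambda \to 0\) gives hard assignments. The names `log_gauss`, `plan_update` and `lambda_em` are hypothetical and not taken from the paper.

```python
import numpy as np


def log_gauss(x, mu, var):
    """Log-density of N(mu_j, var_j) at every x_i; returns an (n, k) array."""
    return -0.5 * (np.log(2 * np.pi * var) + (x[:, None] - mu) ** 2 / var)


def plan_update(x, w, mu, var, lam):
    """Plan with pi_ij proportional to (w_j * p(x_i | eta_j))**(1/lam).

    Assumed cost (not from the paper): gamma_ij = -log(w_j * p(x_i | eta_j)).
    Each row sums to the uniform data weight 1/n.
    """
    scores = (np.log(w) + log_gauss(x, mu, var)) / lam
    scores -= scores.max(axis=1, keepdims=True)  # stabilise the exponentials
    pi = np.exp(scores)
    pi /= pi.sum(axis=1, keepdims=True)
    return pi / len(x)


def lambda_em(x, k, lam, n_iter=100, seed=0):
    """Alternate plan, weight and parameter updates for a 1-D Gaussian mixture."""
    rng = np.random.default_rng(seed)
    mu = rng.choice(x, size=k, replace=False)  # initial means drawn from the data
    var = np.full(k, x.var())
    w = np.full(k, 1.0 / k)
    for _ in range(n_iter):
        pi = plan_update(x, w, mu, var, lam)
        # Floor the column masses so an emptied cluster cannot cause a division by zero.
        mass = np.maximum(pi.sum(axis=0), 1e-12)
        w = mass / mass.sum()                      # weights = column marginal of the plan
        mu = pi.T @ x / mass                       # plan-weighted means
        var = (pi * (x[:, None] - mu) ** 2).sum(axis=0) / mass + 1e-6  # small variance floor
    return w, mu, var, pi
```

For instance, `lambda_em(x, k=2, lam=1.0)` runs the EM-like setting on a 1-D array `x`, while calling `plan_update` with a small value such as `lam=1e-3` yields rows that are nearly one-hot, which corresponds to the hard-clustering end of the family.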