A unified framework for hard and soft clustering with regularized optimal transport

Jean-Frédéric Diebold,Nicolas Papadakis,Arnaud Dessein,Charles-Alban Deledalle
2024-03-08
Abstract:In this paper, we formulate the problem of inferring a Finite Mixture Model from discrete data as an optimal transport problem with entropic regularization of parameter $\lambda\geq 0$. Our method unifies hard and soft clustering, the Expectation-Maximization (EM) algorithm being exactly recovered for $\lambda=1$. The family of clustering algorithm we propose rely on the resolution of nonconvex problems using alternating minimization. We study the convergence property of our generalized $\lambda-$EM algorithms and show that each step in the minimization process has a closed form solution when inferring finite mixture models of exponential families. Experiments highlight the benefits of taking a parameter $\lambda>1$ to improve the inference performance and $\lambda\to 0$ for classification.
Machine Learning
What problem does this paper attempt to address?