Gaussian Mixture Models with Rare Events.

Xuetong Li, Jing Zhou, Hansheng Wang
2024-01-01
Journal of Machine Learning Research
Abstract: We study here a Gaussian Mixture Model (GMM) with rare events data. In this case, the commonly used Expectation-Maximization (EM) algorithm exhibits an extremely slow numerical convergence rate. To theoretically understand this phenomenon, we formulate the numerical convergence problem of the EM algorithm with rare events data as a problem about a contraction operator. Theoretical analysis reveals that the spectral radius of the contraction operator in this case could be arbitrarily close to 1 asymptotically. This theoretical finding explains the empirical slow numerical convergence of the EM algorithm with rare events data. To overcome this challenge, a Mixed EM (MEM) algorithm is developed, which utilizes the information provided by partially labeled data. As compared with the standard EM algorithm, the key feature of the MEM algorithm is that it requires additionally labeled data. We find that the MEM algorithm significantly improves the numerical convergence rate as compared with the standard EM algorithm. The finite sample performance of the proposed method is illustrated by both simulation studies and a real-world dataset of Swedish traffic signs.
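The abstract does not spell out the MEM updates, so the following is only a minimal sketch of the general idea of letting partially labeled data inform EM. It is not the authors' MEM algorithm. It assumes a one-dimensional, two-component mixture in which the rare component has weight alpha, and it assumes the labeled points are a random subsample of the same population as the unlabeled ones. The function names `normal_pdf` and `em_partially_labeled` are hypothetical and chosen for illustration.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a univariate normal N(mean, var) evaluated at x."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def em_partially_labeled(unlabeled, labeled, init, n_iter=100):
    """EM for a 1D two-component Gaussian mixture with some known labels.

    unlabeled: list of floats whose component membership is unknown.
    labeled:   list of (x, y) pairs, y = 1 for the rare component, else 0.
    init:      starting values (alpha, mu0, mu1, var0, var1).
    Illustrative sketch only; assumes labeled points are a random subsample.
    """
    alpha, mu0, mu1, var0, var1 = init
    xs = list(unlabeled) + [x for x, _ in labeled]
    # Labeled points keep a fixed 0/1 responsibility in every iteration.
    fixed = [float(y) for _, y in labeled]

    for _ in range(n_iter):
        # E-step: posterior probability of the rare component,
        # computed for the unlabeled points only.
        resp = []
        for x in unlabeled:
            p1 = alpha * normal_pdf(x, mu1, var1)
            p0 = (1.0 - alpha) * normal_pdf(x, mu0, var0)
            resp.append(p1 / (p0 + p1))
        resp += fixed

        # M-step: weighted maximum likelihood updates over all points.
        n1 = sum(resp)
        n0 = len(xs) - n1
        alpha = n1 / len(xs)
        mu1 = sum(r * x for r, x in zip(resp, xs)) / n1
        mu0 = sum((1.0 - r) * x for r, x in zip(resp, xs)) / n0
        var1 = sum(r * (x - mu1) ** 2 for r, x in zip(resp, xs)) / n1
        var0 = sum((1.0 - r) * (x - mu0) ** 2 for r, x in zip(resp, xs)) / n0

    return alpha, mu0, mu1, var0, var1
```

Passing an empty `labeled` list reduces this sketch to ordinary EM on the unlabeled data, and passing an empty `unlabeled` list reduces it to the closed-form estimates from the labels alone, which makes it easy to compare the two extremes on the same data.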