Abstract: Classical clustering algorithms typically either lack an underlying probability framework to make them predictive or focus on parameter estimation rather than defining and minimizing a notion of error. Recent work addresses these issues by developing a probabilistic framework based on the theory of random labeled point processes and characterizing a Bayes clusterer that minimizes the number of misclustered points. The Bayes clusterer is analogous to the Bayes classifier. Whereas determining a Bayes classifier requires full knowledge of the feature-label distribution, deriving a Bayes clusterer requires full knowledge of the point process. When uncertain of the point process, one would like to find a robust clusterer that is optimal over the uncertainty, just as one may find optimal robust classifiers with uncertain feature-label distributions. Herein, we derive an optimal robust clusterer by first finding an effective random point process that incorporates all randomness within its own probabilistic structure and from which a Bayes clusterer can be derived that provides an optimal robust clusterer relative to the uncertainty. This is analogous to the use of effective class-conditional distributions in robust classification. After evaluating the performance of robust clusterers in synthetic mixtures of Gaussians models, we apply the framework to granular imaging, where we make use of the asymptotic granulometric moment theory for granular images to relate robust clustering theory to the application.
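
As a rough illustration of the idea in the abstract, the following Python sketch brute-forces a robust clusterer for a tiny one-dimensional, two-cluster Gaussian mixture whose means are uncertain. It is a toy under stated assumptions, not the paper's derivation: the function names, the equal label priors, the shared known standard deviation, and the representation of the uncertainty as a finite list of equally likely mean pairs are all hypothetical choices made here. The "effective" model is approximated by averaging the labeling likelihood over those mean pairs, and the clusterer picks the partition with the smallest expected fraction of misclustered points.

```python
import itertools
import math


def gauss_pdf(x, mean, sd):
    """Density of a 1-D Gaussian at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def labeling_likelihood(points, labels, means, sd):
    """Joint likelihood of points and labels, assuming equal label priors."""
    p = 1.0
    for x, y in zip(points, labels):
        p *= 0.5 * gauss_pdf(x, means[y], sd)
    return p


def mismatch(a, b):
    """Fraction of misclustered points between two 2-cluster labelings.

    Cluster names are arbitrary, so the count is minimized over the label swap.
    """
    d = sum(u != v for u, v in zip(a, b))
    return min(d, len(a) - d) / len(a)


def robust_bayes_clusterer(points, mean_draws, sd=1.0):
    """Return the labeling with minimal expected mismatch (toy, exhaustive).

    mean_draws is a hypothetical finite stand-in for the uncertainty class:
    a list of equally likely (mean_of_cluster_0, mean_of_cluster_1) pairs.
    """
    labelings = list(itertools.product((0, 1), repeat=len(points)))

    # Effective model: average the likelihood over the uncertain parameters.
    weights = [
        sum(labeling_likelihood(points, lab, m, sd) for m in mean_draws)
        / len(mean_draws)
        for lab in labelings
    ]
    total = sum(weights)
    posterior = [w / total for w in weights]

    # Expected fraction of misclustered points for a candidate labeling.
    def expected_error(candidate):
        return sum(p * mismatch(candidate, ref)
                   for p, ref in zip(posterior, labelings))

    return min(labelings, key=expected_error)


if __name__ == "__main__":
    points = [-2.1, -1.9, 2.0, 2.2]
    mean_draws = [(-2.0, 2.0), (-1.5, 2.5), (-2.5, 1.5)]
    print(robust_bayes_clusterer(points, mean_draws))
```

The exhaustive search over all 2^n labelings is only feasible for a handful of points; it is used here to keep the optimality criterion explicit. With the well-separated points in the usage example, the result is expected to put the first two points in one cluster and the last two in the other.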

Bayesian Clustering with Variable and Transformation Selections

Bayesian mixtures of common factor analyzers: Model, variational inference, and applications

Bayesian approaches to variable selection in mixture models with application to disease clustering

Simultaneous Bayesian Clustering and Model Selection with Mixture of Robust Factor Analyzers

A Bayesian Approach to Clustering Matting Components in Spectral Matting

Bayesian Bi-clustering Methods with Applications in Computational Biology

Unsupervised Joint Alignment and Clustering using Bayesian Nonparametrics

Flexible Variable Selection for Clustering and Classification

Bayesian Nonparametric Graph Clustering

Sparse Bayesian Hierarchical Modeling of High-dimensional Clustering Problems

A Bayesian approach for clustering and exact finite-sample model selection in longitudinal data mixtures

Bayesian clustering of mixed-type data with relevant variable identification

Clustering Multivariate Data using Factor Analytic Bayesian Mixtures with an Unknown Number of Components

Optimal Clustering under Uncertainty

Optimal Bayesian estimators for latent variable cluster models

Revisiting k-means: New Algorithms via Bayesian Nonparametrics

Bayesian Nonparametric Clustering with Feature Selection for Spatially Resolved Transcriptomics Data

A nonparametric variable clustering model

An interpretable Bayesian clustering approach with feature selection for analyzing spatially resolved transcriptomics data

Penalized model-based clustering with cluster-specific diagonal covariance matrices and grouped variables

Bayesian Statistical Inference for Factor Analysis Models with Clustered Data