Abstract: Cluster analysis identifies groups of similar observations without any prior knowledge of the groups. One of the most widely used approaches is model-based clustering, in which the data are clustered by fitting a probabilistic model to them. A mixture of Gaussian distributions is a commonly used model in model-based clustering. Unfortunately, the number of covariance-matrix parameters in these models grows rapidly as the number of variables or components increases. To address this, various classes of parsimonious Gaussian mixture models have been introduced, each imposing constraints on the covariance matrices. However, each of these classes contains so few models that, in practice, it does not allow the study and selection of models with an arbitrary number of parameters between the minimum (one parameter) and the maximum (the unconstrained model). In this paper, we introduce a family of parsimonious Gaussian mixture models that deals with this problem. The family is obtained by identifying appropriate partitions, across clusters, of the variances and of the correlation coefficients between variables. We call these models "the parsimonious Gaussian mixture models with partitioned parameters". The maximum likelihood estimates of the parameters are computed with a generalized Expectation-Conditional Maximization algorithm that employs the Fisher scoring method. The Bayesian information criterion is used to compare models and select the best one, and the steepest ascent method is adapted to search for the best model. Finally, the performance of these models is examined on two real datasets and in a brief simulation study.
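
To make the parameter growth mentioned in the abstract concrete, the following is a minimal sketch that counts the free parameters of an unconstrained Gaussian mixture and evaluates a Bayesian information criterion from them. The function names are hypothetical and not taken from the paper, and the BIC is written in the "smaller is better" convention, which is an assumption; the paper may use the opposite sign.

```python
import math


def covariance_param_count(p: int, G: int) -> int:
    """Free covariance parameters of an unconstrained G-component
    Gaussian mixture on p variables: each symmetric p x p covariance
    matrix contributes p * (p + 1) / 2 parameters."""
    return G * p * (p + 1) // 2


def total_param_count(p: int, G: int) -> int:
    """Total free parameters: (G - 1) mixing proportions,
    G * p mean components, plus the covariance parameters."""
    return (G - 1) + G * p + covariance_param_count(p, G)


def bic(loglik: float, n_params: int, n_obs: int) -> float:
    """BIC as -2 * logL + k * log(n); smaller is better
    (assumed convention, not confirmed by the source)."""
    return -2.0 * loglik + n_params * math.log(n_obs)


if __name__ == "__main__":
    # Covariance parameters grow quadratically in p and linearly in G.
    for p in (2, 10, 50):
        for G in (2, 5):
            print(p, G, covariance_param_count(p, G))
```

For example, with 10 variables and 3 components the unconstrained model already has 165 covariance parameters, which is the kind of growth that motivates constrained (parsimonious) families.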

Clustering with the multivariate normal inverse Gaussian distribution

Infinite mixtures of multivariate normal-inverse Gaussian distributions for clustering of skewed data

A Bayesian approach for clustering skewed data using mixtures of multivariate normal-inverse Gaussian distributions

Clustering of non-Gaussian data by variational Bayes for normal inverse Gaussian mixture models

Variational Bayes Approximations for Clustering via Mixtures of Normal Inverse Gaussian Distributions

Model-based clustering and classification using mixtures of multivariate skewed power exponential distributions

Clustering using skewed multivariate heavy tailed distributions with flexible tail behaviour

Model-based Clustering with Sparse Covariance Matrices

Model-based clustering based on sparse finite Gaussian mixtures

The parsimonious Gaussian mixture models with partitioned parameters and their application in clustering

Estimating the mean and variance of a high-dimensional normal distribution using a mixture prior

A new model for natural groupings in high-dimensional data

A parsimonious family of multivariate Poisson-lognormal distributions for clustering multivariate count data

A Nonparametric Model for Multi-Manifold Clustering with Mixture of Gaussians and Graph Consistency

Finite Mixtures of Multivariate Poisson-Log Normal Factor Analyzers for Clustering Count Data

Clustering Three-Way Data with Outliers

Model-based clustering via skewed matrix-variate cluster-weighted models

Clustering Gaussian Graphical Models

Clustering Multivariate Data using Factor Analytic Bayesian Mixtures with an Unknown Number of Components

Finding Outliers in Gaussian Model-based Clustering

Modelling Skewed and Heavy-tailed Data Using a Normal Weighted Inverse Gaussian Distribution