Abstract: The hidden Markov model (HMM) is widely used in natural language processing, speech recognition, autonomous vehicular systems, and healthcare for tasks such as clustering, pattern recognition, predictive modeling, anomaly detection, and time-series forecasting. However, HMMs can be sensitive to their initial states, which compromises clustering reliability. To address this issue, we integrate an HMM with hybrid distance metric learning and a modified Bayesian Gaussian mixture model (BGMM) to improve clustering performance and robustness. A further challenge in HMM applications is determining the optimal number of hidden states, which we select using a k-fold cross-validation strategy. We apply the resulting Bayesian Gaussian Hidden Markov Mixture Clustering Model (BGH2MCM) to five diverse datasets and categorize the observed data sequences according to their underlying hidden state sequences. This approach outperforms conventional techniques such as K-means, agglomerative clustering, density-based spatial clustering of applications with noise (DBSCAN), and the BGMM. We evaluate the model using the silhouette, Davies–Bouldin, and Calinski–Harabasz scores, accuracy metrics, and computation time. Our results show that the BGH2MCM consistently achieves better clustering quality and computational efficiency: across all datasets, its average computation time is 23% lower than that of agglomerative clustering with HMM, 22% lower than that of DBSCAN with HMM, and 14% lower than that of both K-means with HMM and the BGMM-HMM. This study highlights the potential of the BGH2MCM to improve data mining and knowledge discovery from complex, real-world datasets.
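
As an illustration of the three internal validity indices named in the abstract, the following minimal sketch scores a clustering with scikit-learn. It is not the BGH2MCM itself: the toy data, the helper names `make_toy_data` and `score_clustering`, and the use of a plain `BayesianGaussianMixture` (no HMM, no metric learning) to produce the labels are assumptions made only for this example.

```python
import numpy as np
from sklearn.metrics import (
    calinski_harabasz_score,
    davies_bouldin_score,
    silhouette_score,
)
from sklearn.mixture import BayesianGaussianMixture


def make_toy_data(seed=0):
    """Hypothetical data: three well-separated 2-D Gaussian blobs."""
    rng = np.random.default_rng(seed)
    centers = np.array([[0.0, 0.0], [6.0, 6.0], [-6.0, 6.0]])
    return np.vstack([rng.normal(c, 0.5, size=(100, 2)) for c in centers])


def score_clustering(X, labels):
    """Return the three internal clustering indices for a labelling of X."""
    return {
        # Higher is better, bounded in [-1, 1].
        "silhouette": silhouette_score(X, labels),
        # Lower is better, non-negative.
        "davies_bouldin": davies_bouldin_score(X, labels),
        # Higher is better, unbounded above.
        "calinski_harabasz": calinski_harabasz_score(X, labels),
    }


X = make_toy_data()

# Stand-in clusterer (assumption): a standard Bayesian Gaussian mixture.
# n_components is only an upper bound; unused components get small weights.
bgmm = BayesianGaussianMixture(n_components=5, max_iter=500, random_state=0)
labels = bgmm.fit_predict(X)

scores = score_clustering(X, labels)
for name, value in scores.items():
    print(f"{name}: {value:.3f}")
```

The same `score_clustering` helper could be applied to the labels produced by any of the compared methods, which makes the indices directly comparable across K-means, agglomerative clustering, DBSCAN, and mixture-based models.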

Hidden Markov models with multivariate bounded asymmetric student's t-mixture model emissions

Regime Switching Model Estimation: Spectral Clustering Hidden Markov Model

Semisupervised Robust Modeling of Multimode Industrial Processes for Quality Variable Prediction Based on Student's t Mixture Model

The Infinite Student's t-Mixture for Robust Modeling

The Student's t-Hidden Markov Model with Truncated Stick-Breaking Priors

A Bayesian multilevel hidden Markov model with Poisson-lognormal emissions for intense longitudinal count data

Multimode process data modeling: A Dirichlet process mixture model based Bayesian robust factor analyzer approach

Variational Bayesian Variable Selection for High-Dimensional Hidden Markov Models

Hidden Markov Models Based on Generalized Dirichlet Mixtures for Proportional Data Modeling

Topological Hidden Markov Models

hmmTMB: Hidden Markov models with flexible covariate effects in R

Modelling financial time series based on heavy-tailed market microstructure models with scale mixtures of normal distributions

Mixture of Coupled HMMs for Robust Modeling of Multivariate Healthcare Time Series

Hidden Markov Model Framework Using Independent Component Analysis Mixture Model

Conditional Density Estimation with HMM Based Support Vector Machines

Statistical inference for the nonparametric and semiparametric hidden Markov model via the composite likelihood approach

Enhanced Bayesian Gaussian hidden Markov mixture clustering for improved knowledge discovery

Spatial-based Bayesian Hidden Markov Models with Dirichlet Mixtures for Video Anomaly Detection

A novel finite mixture model based on the generalized scale mixtures of asymmetric generalized normal distributions: properties, estimation methodology and applications

Bayesian Approximations to Hidden Semi-Markov Models for Telemetric Monitoring of Physical Activity

Finite Mixture Models with Student t Distributions: an Applied Example