Efficient Model Selection for Mixtures of Probabilistic PCA via Hierarchical BIC

Jianhua Zhao
DOI: https://doi.org/10.1109/tcyb.2014.2298401
IF: 11.8
2014-01-01
IEEE Transactions on Cybernetics
Abstract: This paper concerns model selection for mixtures of probabilistic principal component analyzers (MPCA). The well-known Bayesian information criterion (BIC) is frequently used for this purpose. However, BIC is found to penalize each analyzer implausibly, using the whole sample size. In this paper, we present a new criterion for MPCA, called hierarchical BIC, in which each analyzer is penalized using only its own effective sample size. Theoretically, hierarchical BIC is a large-sample approximation of the variational Bayesian lower bound, and BIC is a further approximation of hierarchical BIC. To learn hierarchical-BIC-based MPCA, we propose two efficient algorithms: a two-stage and a one-stage variant. The two-stage algorithm integrates model selection with respect to the subspace dimensions into parameter estimation, and the one-stage variant further integrates the selection of the number of mixture components into a single algorithm. Experiments on a number of synthetic and real-world data sets show that: 1) hierarchical BIC is more accurate than BIC and several related competitors and 2) the two proposed algorithms are not only effective but also much more efficient than the classical two-stage procedure commonly used for BIC.
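The contrast between the two penalties can be illustrated with a small sketch. The abstract does not state the formulas, so everything below is an assumption: the free-parameter count is the standard one for a single probabilistic PCA model, the effective sample size of analyzer k is taken to be N times its mixing proportion, and the function names are hypothetical. It is meant only to show why a per-analyzer sample size yields a smaller penalty than the whole sample size, not to reproduce the paper's exact criterion.

```python
import math


def ppca_num_params(d, q):
    """Free parameters of one PPCA analyzer in d dimensions with q latent
    dimensions: mean (d), loading matrix up to rotation (d*q - q*(q-1)/2),
    and isotropic noise variance (1). Standard count, assumed here."""
    return d + d * q - q * (q - 1) // 2 + 1


def bic_penalty(n, d, qs):
    """BIC-style penalty: every parameter is charged log(n), where n is the
    whole sample size. qs lists the latent dimension of each analyzer."""
    k = len(qs)
    total = sum(ppca_num_params(d, q) for q in qs) + (k - 1)  # + mixing proportions
    return 0.5 * total * math.log(n)


def hbic_penalty(n, d, qs, pis):
    """Hierarchical-BIC-style penalty (assumed form): analyzer k is charged
    log(n * pi_k), its effective sample size, while the k - 1 free mixing
    proportions are still charged log(n)."""
    k = len(qs)
    per_analyzer = sum(
        0.5 * ppca_num_params(d, q) * math.log(n * pi)
        for q, pi in zip(qs, pis)
    )
    return per_analyzer + 0.5 * (k - 1) * math.log(n)


if __name__ == "__main__":
    n, d = 1000, 10
    qs, pis = [2, 3], [0.7, 0.3]
    print("BIC penalty: ", bic_penalty(n, d, qs))
    print("HBIC penalty:", hbic_penalty(n, d, qs, pis))
```

Under these assumptions the two penalties coincide for a single analyzer (mixing proportion 1), and the hierarchical penalty is strictly smaller whenever the data are split across several analyzers, because log(N * pi_k) < log(N) for pi_k < 1.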