2004 Special Issue: A Comparative Investigation on Subspace Dimension Determination

Xuelei Hu, Lei Xu
2004-01-01
Abstract: It is well known that constrained Hebbian self-organization on multiple linear neural units leads to the same k-dimensional subspace as that spanned by the first k principal components. The batch PCA algorithm has been widely applied in various fields since the 1930s, and a variety of adaptive algorithms have been proposed in the past two decades. However, most studies assume a known dimension k or determine it heuristically, even though a number of model selection criteria exist in the statistics literature. Recently, criteria have also been obtained under the framework of Bayesian Ying-Yang (BYY) harmony learning. This paper further investigates the BYY criteria in comparison with typical existing criteria, including Akaike's information criterion (AIC), the consistent Akaike's information criterion (CAIC), the Bayesian inference criterion (BIC), and the cross-validation (CV) criterion. The comparison is made via experiments on simulated data sets of different sample sizes, noise variances, data space dimensions, and subspace dimensions, and on two real data sets, one from an air pollution problem and one from sports track records. The experiments show that BIC outperforms AIC, CAIC, and CV, while the BYY criteria are either comparable with or better than BIC. BYY harmony learning is therefore the preferable tool for subspace dimension determination, because the appropriate subspace dimension k can be determined automatically while BYY harmony learning is carried out for the principal subspace. In contrast, selecting k by BIC, AIC, CAIC, or CV is a two-stage procedure: a set of candidate subspaces with different dimensions must first be learned, and the criterion is then applied to choose among them. © 2004 Elsevier Ltd. All rights reserved.
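The two-stage selection described in the abstract can be illustrated with a minimal sketch. It assumes a probabilistic PCA model with isotropic Gaussian noise, which may differ from the exact formulation used in the paper, and it covers only AIC, CAIC, and BIC (not CV or the BYY criteria). The function names `ppca_max_loglik`, `n_free_params`, and `select_dimension` are hypothetical and introduced here for illustration only.

```python
import numpy as np


def ppca_max_loglik(eigvals, n_samples, k):
    """Maximized log-likelihood of a k-dimensional probabilistic PCA model.

    Assumes `eigvals` are the eigenvalues of the ML sample covariance,
    sorted in descending order, and that 1 <= k < len(eigvals).
    """
    d = eigvals.size
    sigma2 = eigvals[k:].mean()  # ML noise variance: mean of discarded eigenvalues
    return -0.5 * n_samples * (
        d * np.log(2.0 * np.pi)
        + np.log(eigvals[:k]).sum()
        + (d - k) * np.log(sigma2)
        + d
    )


def n_free_params(d, k):
    """Free parameters: loadings up to rotation, noise variance, and mean."""
    return d * k - k * (k - 1) // 2 + 1 + d


def select_dimension(X, criterion="bic"):
    """Score every candidate k and return (best_k, scores); lower is better."""
    n, d = X.shape
    Xc = X - X.mean(axis=0)
    # Stage 1: one eigendecomposition yields all candidate subspaces at once.
    eigvals = np.linalg.eigvalsh(Xc.T @ Xc / n)[::-1]
    eigvals = np.clip(eigvals, 1e-12, None)  # guard against log(0)
    # Stage 2: apply the chosen criterion to each candidate dimension.
    scores = {}
    for k in range(1, d):
        m = n_free_params(d, k)
        penalty = {
            "aic": 2.0 * m,
            "bic": m * np.log(n),
            "caic": m * (np.log(n) + 1.0),
        }[criterion]
        scores[k] = -2.0 * ppca_max_loglik(eigvals, n, k) + penalty
    return min(scores, key=scores.get), scores


if __name__ == "__main__":
    # Simulated data: a 3-dimensional subspace embedded in 8 dimensions.
    rng = np.random.default_rng(0)
    basis, _ = np.linalg.qr(rng.standard_normal((8, 3)))
    latent = rng.standard_normal((1000, 3)) * np.array([3.0, 2.0, 1.5])
    X = latent @ basis.T + 0.3 * rng.standard_normal((1000, 8))
    for name in ("aic", "caic", "bic"):
        best_k, _ = select_dimension(X, name)
        print(name, best_k)
```

In this sketch the candidate models for all k come from a single eigendecomposition, so the first stage is cheap. With adaptive or iterative learning algorithms, each candidate subspace would have to be learned separately before the criterion could be applied, which is the cost the abstract attributes to the two-stage approach.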