Incremental and decremental subspace learning

Tongfeng Sun, Zihui Ren, Shifei Ding
2011-01-01
Journal of Computational Information Systems
Abstract: Incremental principal component analysis (IPCA) has proven to be an effective method of subspace learning. It can handle data streams, but it cannot handle cases in which samples are removed. Incremental and Decremental Principal Component Analysis (IDPCA) handles both the addition and the removal of samples, as the situation requires. Instead of retraining on all the sample data whenever the samples change, IDPCA chooses the eigen-axes (eigenvectors) from a candidate space, which is built from the original space and the samples. Based on the variances of the sample projections, IDPCA adopts as few eigenvectors as possible to represent the whole sample space. IDPCA is shown to be accurate: it obtains principal eigenvectors that maintain a designated eigenvalue accumulation ratio (the share of the energy of the whole sample space that the retained eigenvectors span), even after many incremental or decremental updates. Experimental results show that IDPCA is stable and performs well even after training samples have been added or removed many times in chunks of any size. Compared with PCA, IDPCA is more computationally efficient, gives nearly the same classification results, and has good eigenvector accuracy. Compared with the earlier Chunk IPCA, IDPCA handles sample removal and maintains fewer eigen-axes to reach the designated accumulation ratio. When the removed samples total less than 67 percent of all learned samples, the eigen-axes of IDPCA can be approximately updated without keeping the previous training data set in memory, and the subspace remains accurate. © 2011 Binary Information Press, December 2011.
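The two ideas in the abstract, updating a subspace when chunks of samples are added or removed and keeping only as many eigenvectors as the designated accumulation ratio requires, can be illustrated with a minimal sketch. This is not the authors' IDPCA, which selects eigen-axes from a candidate space. It is a simpler exact baseline that tracks the sample count, mean, and full scatter matrix, and it assumes that the samples being removed are supplied at removal time. The class name `RunningPCA` and its methods are hypothetical, introduced only for illustration.

```python
import numpy as np


class RunningPCA:
    """Exact add/remove baseline (NOT the paper's IDPCA): tracks n, mean, scatter."""

    def __init__(self, dim):
        self.n = 0
        self.mean = np.zeros(dim)
        self.scatter = np.zeros((dim, dim))  # sum of (x - mean)(x - mean)^T

    @staticmethod
    def _stats(X):
        # Count, mean and scatter matrix of one chunk (rows are samples).
        X = np.atleast_2d(np.asarray(X, dtype=float))
        mean = X.mean(axis=0)
        centered = X - mean
        return X.shape[0], mean, centered.T @ centered

    def add(self, X):
        # Merge a chunk into the running statistics (incremental step).
        m, mean_c, scatter_c = self._stats(X)
        n_new = self.n + m
        d = mean_c - self.mean
        self.scatter += scatter_c + (self.n * m / n_new) * np.outer(d, d)
        self.mean = (self.n * self.mean + m * mean_c) / n_new
        self.n = n_new

    def remove(self, X):
        # Undo the merge for a previously added chunk (decremental step).
        m, mean_c, scatter_c = self._stats(X)
        n_new = self.n - m
        if n_new <= 0:
            raise ValueError("cannot remove all (or more than) the learned samples")
        mean_new = (self.n * self.mean - m * mean_c) / n_new
        d = mean_c - mean_new
        self.scatter -= scatter_c + (n_new * m / self.n) * np.outer(d, d)
        self.mean = mean_new
        self.n = n_new

    def components(self, ratio=0.95):
        # Fewest leading eigenvectors whose eigenvalues reach the given ratio.
        vals, vecs = np.linalg.eigh(self.scatter / self.n)
        order = np.argsort(vals)[::-1]
        vals = np.clip(vals[order], 0.0, None)  # guard against tiny negatives
        vecs = vecs[:, order]
        cum = np.cumsum(vals) / vals.sum()
        k = min(int(np.searchsorted(cum, ratio)) + 1, len(vals))
        return vals[:k], vecs[:, :k]
```

A typical use would call `add` for each arriving chunk, `remove` for each discarded chunk, and `components(0.95)` whenever the current eigen-axes are needed. Unlike the approach described in the abstract, this baseline stores a full d × d scatter matrix and performs a full eigendecomposition on every query, so it shows the bookkeeping without the efficiency.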