Improved k-means clustering method for codebook generation

Chunhui Chunhui, Wang Ying, Masahide Kaneko
DOI: https://doi.org/10.3969/j.issn.0254-3087.2012.10.032
2012-01-01
Abstract: The k-means clustering method is generally used to generate the codebook in the bag-of-words (BoW) model. However, the performance of k-means depends heavily on the selection of the initial centers, which results in a less robust codebook. Moreover, the distance between each center and each data point must be calculated in every iteration, which leads to high computational complexity. To address these problems, an improved k-means clustering method based on optimized selection of the initial centers is proposed, which overcomes the influence of randomly selected initial centers on clustering performance. The triangle inequality is used to simplify the distance calculation. Together, these changes make the generated codebook more robust and the computation less complex. Finally, a weight-contribution-based codebook representation method is introduced, and the BoW model built on the improved codebook is applied to image categorization, which improves the categorization result. Experiments on the Caltech 101 and Caltech 256 databases demonstrate the effectiveness of the proposed method, and the effect of codebook size on categorization accuracy is analyzed. The results show that the proposed method improves categorization accuracy by 5% to 8%.
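The abstract does not spell out how the triangle inequality is applied, so the following is only a minimal sketch of one common form of the idea, as used in Elkan-style accelerated k-means, and not necessarily the authors' rule. If the current best center c_a and another center c_b satisfy d(c_a, c_b) >= 2 d(x, c_a), then d(x, c_b) >= d(x, c_a), so the distance from x to c_b need not be computed. The function name `assign_with_pruning` and the returned `skipped` counter are hypothetical, introduced here for illustration.

```python
import math


def assign_with_pruning(points, centers):
    """Assign each point to its nearest center.

    Illustrative sketch (an assumption, not the paper's exact method):
    distance computations are skipped whenever the triangle inequality
    already guarantees that a center cannot be closer than the current best.
    """
    k = len(centers)
    # Center-to-center distances, computed once per assignment pass.
    cc = [[math.dist(centers[i], centers[j]) for j in range(k)]
          for i in range(k)]

    labels = []
    skipped = 0  # number of point-to-center distances avoided
    for x in points:
        best = 0
        best_d = math.dist(x, centers[0])
        for j in range(1, k):
            # d(c_best, c_j) >= 2 * d(x, c_best) implies
            # d(x, c_j) >= d(x, c_best), so c_j cannot win.
            if cc[best][j] >= 2.0 * best_d:
                skipped += 1
                continue
            d = math.dist(x, centers[j])
            if d < best_d:
                best, best_d = j, d
        labels.append(best)
    return labels, skipped
```

The pruning changes only how much work is done, not the result: the labels are the same as those from an exhaustive nearest-center search, while the number of skipped distances grows as clusters become better separated.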