A Weighted Method to Improve the Centroid-based Classifier

Chuan Liu, Wen-yong Wang, Guang-hui Tu, Nan-nan Liu, Yu Xiang
DOI: https://doi.org/10.12783/dtetr/iceea2016/6704
2017-01-01
DEStech Transactions on Engineering and Technology Research
Abstract: The Centroid-Based Classifier (CBC) is one of the most widely used text classification methods owing to its theoretical simplicity and computational efficiency. However, its accuracy is unsatisfactory on skewed data distributions. In this paper, we propose a new classification model, the Gravitation Model (GM), to address this model misfit of CBC. In the proposed model, each category is assigned a mass factor that indicates its distribution in the vector space; this factor is learned from the training data. Experiments on twelve real datasets compare GM with CBC and its improved variants and show that GM consistently outperforms CBC. Furthermore, GM matches the performance of the best centroid-based classifier while being more stable.
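The abstract gives neither the GM scoring rule nor the procedure for learning the mass factors, so the following is only a minimal sketch of the general idea under stated assumptions, not the authors' method. It assumes that a document is scored against each category as mass times cosine similarity to the category centroid, and that the masses are adjusted by a simple mistake-driven multiplicative update. The names `fit_centroids`, `fit_mass`, `predict`, and the parameters `epochs` and `step` are hypothetical.

```python
import math

def normalize(v):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v] if n > 0 else list(v)

def cosine(a, b):
    """Dot product; equals cosine similarity when both inputs are unit vectors."""
    return sum(x * y for x, y in zip(a, b))

def fit_centroids(X, y):
    """Plain CBC step: one unit-length centroid per category."""
    sums = {}
    for v, label in zip(X, y):
        v = normalize(v)
        acc = sums.setdefault(label, [0.0] * len(v))
        for i, x in enumerate(v):
            acc[i] += x
    return {label: normalize(acc) for label, acc in sums.items()}

def predict(v, centroids, mass):
    """Assumed scoring rule: mass factor times similarity to the centroid."""
    v = normalize(v)
    return max(centroids, key=lambda c: mass[c] * cosine(v, centroids[c]))

def fit_mass(X, y, centroids, epochs=20, step=0.05):
    """Assumed learning rule: on each training error, raise the mass of the
    true category and lower the mass of the wrongly predicted one."""
    mass = {c: 1.0 for c in centroids}
    for _ in range(epochs):
        errors = 0
        for v, label in zip(X, y):
            guess = predict(v, centroids, mass)
            if guess != label:
                mass[label] *= 1 + step
                mass[guess] *= 1 - step
                errors += 1
        if errors == 0:
            break  # every training document is classified correctly
    return mass
```

With all masses fixed at 1.0, `predict` reduces to an ordinary centroid-based classifier, which makes the effect of the learned factors easy to compare on a category that is spread widely in the vector space.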