A comparative study of feature weighting methods for document co-clustering.

Yunming Ye, Xutao Li, Biao Wu, Yan Li
DOI: https://doi.org/10.1504/IJITCC.2011.039286
2011-01-01
Abstract: Document clustering is an important task in data mining, and co-clustering has become one of the state-of-the-art methods for it. In this paper, we propose a feature-weighting co-clustering algorithm for documents and present a comparative study of how different weighting methods affect its performance. The compared feature weighting approaches include inverse document frequency-based methods, information theory-based methods and term variance-based methods. The comparison results on benchmark data sets show that the mutual information weighting method leads to better performance for the proposed algorithm than the other weighting schemes.
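
To make two of the compared weighting families concrete, the following is a minimal sketch of inverse document frequency and term variance weighting on a toy document-term count matrix. The abstract does not give the paper's exact formulas, so the definitions used here (log(N / df) for IDF, population variance of term counts for term variance) are common textbook forms and are assumptions, as are the function names and the toy data. The mutual information weighting and the co-clustering step itself are not shown.

```python
import math

def idf_weights(doc_term):
    """Per-term inverse document frequency, assumed here as log(N / df)."""
    n_docs = len(doc_term)
    n_terms = len(doc_term[0])
    weights = []
    for j in range(n_terms):
        # df = number of documents in which term j occurs at least once
        df = sum(1 for row in doc_term if row[j] > 0)
        weights.append(math.log(n_docs / df) if df else 0.0)
    return weights

def term_variance_weights(doc_term):
    """Per-term population variance of counts across documents (assumed form)."""
    n_docs = len(doc_term)
    n_terms = len(doc_term[0])
    weights = []
    for j in range(n_terms):
        col = [row[j] for row in doc_term]
        mean = sum(col) / n_docs
        weights.append(sum((x - mean) ** 2 for x in col) / n_docs)
    return weights

def apply_weights(doc_term, weights):
    """Scale each term column by its weight before clustering."""
    return [[x * w for x, w in zip(row, weights)] for row in doc_term]

# Hypothetical toy data: 3 documents (rows) x 3 terms (columns).
counts = [
    [2, 0, 1],
    [1, 0, 0],
    [3, 4, 0],
]
idf = idf_weights(counts)
tv = term_variance_weights(counts)
weighted = apply_weights(counts, idf)
```

In this sketch, a term that appears in every document receives an IDF weight of zero, while term variance favors terms whose counts differ strongly across documents. The weighted matrix would then be passed to a co-clustering routine in place of the raw counts.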