Keyword Combination Extraction in Text Categorization Based on Ant Colony Optimization
Zi-jun Yu, Wei-gang Wu, Jing Xiao, Jun Zhang, Rui-Zhang Huang, Ou Liu
DOI: https://doi.org/10.1109/SoCPaR.2009.90
2009-01-01
Abstract: With the growing number of documents in digital form, automated text categorization (TC) has become increasingly promising over the last ten years. A TC system can automatically assign a document to the most suitable category, but the reason for such an assignment is usually unknown to users. To make a TC system interpretable, it is necessary to select a group of keywords, termed a keyword combination, to describe each text category. In this paper, we propose a novel algorithm, keyword combination extraction based on ant colony optimization (KCEACO), to search for the optimal keyword combination of a target category. By extending traditional feature selection techniques, we design an evaluation function for keyword combinations that takes into account the relationships among different keywords. Experimental results show that KCEACO can efficiently find the optimal keyword combination from a large number of candidate combinations.
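The abstract does not give the evaluation function or the ant colony update rules, so the following is only a minimal sketch of the general idea, not the KCEACO algorithm itself. It assumes a hypothetical scoring function (the F1 of the rule "a document contains every keyword in the combination"), a fixed combination size, and a basic evaporate-and-reinforce pheromone update; the toy corpus and all names (`evaluate`, `aco_keyword_search`, `DOCS`, `LABELS`, `CANDIDATES`) are illustrative.

```python
import random

# Toy corpus (hypothetical): each document is a set of terms with one label.
DOCS = [
    {"goal", "team", "coach"},
    {"goal", "team"},
    {"goal", "team", "market"},
    {"team", "market", "stock"},
    {"goal", "stock", "market"},
    {"stock", "market", "coach"},
]
LABELS = ["sports", "sports", "sports", "finance", "finance", "finance"]
CANDIDATES = ["goal", "team", "market", "stock", "coach"]


def evaluate(combo, docs, labels, target):
    """Assumed score: F1 of the rule 'document contains all keywords in combo'.

    Scoring the combination jointly (rather than summing per-keyword scores)
    is one simple way to reflect relationships among keywords.
    """
    predicted = [all(k in d for k in combo) for d in docs]
    tp = sum(p and l == target for p, l in zip(predicted, labels))
    fp = sum(p and l != target for p, l in zip(predicted, labels))
    fn = sum((not p) and l == target for p, l in zip(predicted, labels))
    if tp == 0:
        return 0.0
    return 2 * tp / (2 * tp + fp + fn)


def aco_keyword_search(candidates, docs, labels, target,
                       size=2, ants=20, iterations=40, rho=0.1, seed=0):
    """Search for a high-scoring keyword combination with a basic ant colony loop."""
    rng = random.Random(seed)
    pheromone = {k: 1.0 for k in candidates}  # one trail value per keyword
    best_combo, best_score = None, -1.0
    for _ in range(iterations):
        for _ in range(ants):
            # Each ant builds a combination by pheromone-weighted sampling
            # without replacement.
            pool = list(candidates)
            combo = []
            for _ in range(size):
                weights = [pheromone[k] for k in pool]
                pick = rng.choices(pool, weights=weights, k=1)[0]
                combo.append(pick)
                pool.remove(pick)
            score = evaluate(combo, docs, labels, target)
            if score > best_score:
                best_combo, best_score = frozenset(combo), score
        # Evaporate all trails, then reinforce the best combination found so far.
        for k in pheromone:
            pheromone[k] *= 1.0 - rho
        for k in best_combo:
            pheromone[k] += best_score
    return set(best_combo), best_score


if __name__ == "__main__":
    combo, score = aco_keyword_search(CANDIDATES, DOCS, LABELS, "sports")
    print(sorted(combo), score)
```

On this toy corpus, the pair goal/team is the only two-keyword combination that scores 1.0 under the assumed function, so the search is expected to settle on it. A real system would draw candidates from a much larger vocabulary, which is where a heuristic search becomes preferable to enumerating every combination.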