Dynamic Vector-Space Model for Internet Textual Information Categorization

张晓辉,李莹,常桂然,赵宏
DOI: https://doi.org/10.1109/icsmc.2002.1173453
2002-01-01
Abstract: This paper proposes a Dynamic Vector-Space Model (DVSM) for Internet text information categorization. The traditional, static VSM cannot represent the real-time characteristics of Internet information. To improve precision and recall on a dynamic test collection drawn from the Internet, DVSM maps multiple discriminating words into one dimension of the document vector and is capable of incremental re-learning. Because the statistical distribution patterns of discriminating words serve as the dimensions, a discriminating word can be moved from one dimension to another when the topics of Internet articles shift. Our experiment compared DVSM with the traditional VSM, and the results show that DVSM significantly outperforms the traditional VSM over time.
Keywords: Dynamic Vector-Space Model, Internet Textual Information, Text Categorization, distribution pattern
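The core idea in the abstract, several discriminating words sharing one vector dimension and a word being reassigned when topics shift, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how distribution patterns are computed or how reassignment is triggered, so the function `document_vector`, the mapping `word_to_dim`, the example words, and the manual reassignment step are all hypothetical.

```python
from collections import Counter


def document_vector(tokens, word_to_dim, num_dims):
    """Build a document vector in which every discriminating word
    contributes its count to the dimension it is currently mapped to.

    Assumption: a dimension's value is simply the summed term count;
    the paper's actual weighting scheme is not given in the abstract.
    """
    vec = [0] * num_dims
    for word, count in Counter(tokens).items():
        dim = word_to_dim.get(word)  # None for non-discriminating words
        if dim is not None:
            vec[dim] += count
    return vec


# Hypothetical mapping: two words per dimension.
word_to_dim = {"goal": 0, "striker": 0, "election": 1, "ballot": 1}
doc = "the striker scored a goal before the election".split()

before = document_vector(doc, word_to_dim, num_dims=2)

# Simulate a topic shift: move one word to another dimension
# (done by hand here; the paper describes incremental re-learning).
word_to_dim["striker"] = 1
after = document_vector(doc, word_to_dim, num_dims=2)

print(before, after)
```

Because the vector length stays fixed while the word-to-dimension mapping changes, documents vectorized before and after a reassignment remain comparable in the same space, which is one plausible reading of why such a model can be updated incrementally.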