Tagging Web Product Titles Based on Hidden Markov Model

Peng Wang, Baowen Xu, Yue You, Lu Chen
DOI: https://doi.org/10.1145/1982185.1982197
2011-01-01
Abstract: E-commerce web sites usually have to maintain a large amount of product information. A feasible way to organize this information is to add semantic tags to it. However, Web product information often consists of irregular statements published by users, so it is difficult to find rules for tagging it automatically. This paper focuses on the problem of tagging Web product titles and proposes a tagging method based on the hidden Markov model (HMM). The method first trains the HMM with the maximum likelihood (ML) algorithm, then employs the Viterbi algorithm to tag product titles. In addition, several strategies, including smoothing, background knowledge, extraction rules, and simplification of the HMM output observations, are used to improve the quality of the results. Experimental results on a real-world dataset show that our method can achieve more than 51% precision and 60% recall.
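The two core steps named in the abstract, maximum likelihood training followed by Viterbi decoding, can be illustrated with a minimal sketch. This is not the authors' implementation: the tag names (`BRAND`, `MODEL`, `ATTR`), the toy titles, the `<unk>` token for unseen words, and the add-alpha smoothing are all assumptions made for illustration, since the abstract does not specify the tag set or the smoothing scheme. The background knowledge and extraction rules mentioned in the abstract are not modeled here.

```python
import math
from collections import Counter, defaultdict


def train_hmm(tagged_titles, alpha=1.0):
    """Estimate HMM log-probabilities by counting labeled (word, tag) pairs.

    Counting is the maximum likelihood estimate for a supervised HMM;
    add-alpha smoothing (an assumption here) keeps unseen events non-zero.
    """
    tags = sorted({t for title in tagged_titles for _, t in title})
    vocab = sorted({w for title in tagged_titles for w, _ in title} | {"<unk>"})
    start, trans, emit = Counter(), defaultdict(Counter), defaultdict(Counter)
    for title in tagged_titles:
        prev = None
        for word, tag in title:
            if prev is None:
                start[tag] += 1          # tag that opens a title
            else:
                trans[prev][tag] += 1    # tag-to-tag transition
            emit[tag][word] += 1         # word emitted by a tag
            prev = tag

    def log_prob(counter, key, size):
        total = sum(counter.values())
        return math.log((counter[key] + alpha) / (total + alpha * size))

    return {
        "tags": tags,
        "vocab": set(vocab),
        "start": {t: log_prob(start, t, len(tags)) for t in tags},
        "trans": {p: {t: log_prob(trans[p], t, len(tags)) for t in tags} for p in tags},
        "emit": {t: {w: log_prob(emit[t], w, len(vocab)) for w in vocab} for t in tags},
    }


def viterbi(model, words):
    """Return the most probable tag sequence for a tokenized title."""
    if not words:
        return []
    tags = model["tags"]
    # Map words never seen in training to the hypothetical <unk> token.
    obs = [w if w in model["vocab"] else "<unk>" for w in words]
    score = [{t: model["start"][t] + model["emit"][t][obs[0]] for t in tags}]
    back = []
    for w in obs[1:]:
        prev_scores, cur, ptr = score[-1], {}, {}
        for t in tags:
            best = max(tags, key=lambda p: prev_scores[p] + model["trans"][p][t])
            cur[t] = prev_scores[best] + model["trans"][best][t] + model["emit"][t][w]
            ptr[t] = best
        score.append(cur)
        back.append(ptr)
    # Follow the back-pointers from the best final tag.
    path = [max(tags, key=lambda t: score[-1][t])]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    return path[::-1]


# Toy labeled titles (invented for illustration only).
training = [
    [("nokia", "BRAND"), ("n95", "MODEL"), ("black", "ATTR")],
    [("sony", "BRAND"), ("w800", "MODEL"), ("silver", "ATTR")],
    [("nokia", "BRAND"), ("e71", "MODEL"), ("white", "ATTR")],
]
model = train_hmm(training)
print(viterbi(model, ["sony", "n95", "white"]))
```

Working in log space avoids numeric underflow on long titles, and the smoothing term is what lets the decoder assign a tag to a word such as an unseen brand name rather than giving every path zero probability.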