CU-COMSEM: Exploring Rich Features for Unsupervised Web Personal Name Disambiguation

Ying Chen, James Martin
DOI: https://doi.org/10.3115/1621474.1621498
2007-01-01
Abstract: The increasing number of web sources is exacerbating the named-entity ambiguity problem. This paper explores the use of various token-based and phrase-based features in unsupervised clustering of web pages containing personal names. From these experiments, we find that the use of rich features can significantly improve the disambiguation performance for web personal names.