Name Disambiguation Using Atomic Clusters.

Feng Wang, Juanzi Li, Jie Tang, Jing Zhang, Kehong Wang
DOI: https://doi.org/10.1109/waim.2008.96
2008-01-01
Abstract: Name ambiguity is a critical problem in many applications, particularly in online bibliography systems such as DBLP and CiteSeer. Several clustering-based methods have been proposed, but the problem remains a major challenge for both the research and industry communities. In this paper, we present a complementary study that approaches the problem from another point of view. We propose an approach that finds atomic clusters to improve the performance of existing clustering-based methods. We conducted experiments on a dataset from a real-world system, Arnetminer.org. Experimental results show that the proposed atomic cluster finding approach yields significant improvements (about +8% and +27%, depending on the clustering method).