Approach for Name Ambiguity Problem Using a Multiple-Layer Clustering

Wenrong Jiang, Anbao Wang, Cuihong Wu, Jian Chen, Jihong Yan
DOI: https://doi.org/10.1109/CSE.2009.110
2009-01-01
Abstract: Name ambiguity refers to the problem of attributing a publication to the correct author. This is a common issue in digital libraries. It is a difficult problem because the same author's name may be written in different ways and different authors may share the same name. In this paper, we examine a multiple-layer clustering approach that relies on the limited amount of information associated with each publication. It combines the Package-Merge algorithm, pattern-matching extraction methods, and a fuzzy-logic, rule-based concept. The experimental study uses the DBLP collection as a case study, and the three attributes used are email addresses, the co-authorship relationship, and paper title similarity. Our experiments show that this approach distinguishes authors and classifies papers on the test dataset more accurately than previous studies.
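The layered idea can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify how the layers are built, so the ordering of the layers, the exact-match rules for emails and co-authors, the Jaccard title measure, and the `title_threshold` value are all assumptions, and the Package-Merge and fuzzy-logic components are left out entirely. The function names and the record fields (`title`, `email`, `coauthors`) are hypothetical.

```python
from itertools import combinations


def tokens(title):
    """Lowercased word set of a title, ignoring words of two letters or fewer."""
    return {w for w in title.lower().split() if len(w) > 2}


def jaccard(a, b):
    """Jaccard similarity of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_papers(papers, title_threshold=0.5):
    """Group papers that carry one ambiguous author name into likely authors.

    Each paper is a dict with hypothetical keys 'title', 'email', 'coauthors'.
    Returns a sorted list of clusters, each a list of paper indices.
    """
    parent = list(range(len(papers)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    def union(i, j):
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[max(ri, rj)] = min(ri, rj)

    pairs = list(combinations(range(len(papers)), 2))

    # Layer 1 (assumed rule): identical, non-empty email address.
    for i, j in pairs:
        email_i, email_j = papers[i].get("email"), papers[j].get("email")
        if email_i and email_i == email_j:
            union(i, j)

    # Layer 2 (assumed rule): at least one shared co-author.
    for i, j in pairs:
        if set(papers[i]["coauthors"]) & set(papers[j]["coauthors"]):
            union(i, j)

    # Layer 3 (assumed rule): title word overlap at or above the threshold.
    for i, j in pairs:
        sim = jaccard(tokens(papers[i]["title"]), tokens(papers[j]["title"]))
        if sim >= title_threshold:
            union(i, j)

    clusters = {}
    for i in range(len(papers)):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())


# Made-up records for illustration only; none of this is DBLP data.
sample = [
    {"title": "Clustering methods for author name disambiguation",
     "email": "a.author@example.org", "coauthors": ["B. Coauthor"]},
    {"title": "Fuzzy rules in digital libraries",
     "email": "a.author@example.org", "coauthors": ["C. Coauthor"]},
    {"title": "Web service composition",
     "email": None, "coauthors": ["C. Coauthor", "D. Coauthor"]},
    {"title": "Protein folding simulation on GPUs",
     "email": None, "coauthors": ["E. Coauthor"]},
    {"title": "Protein folding simulation with GPUs",
     "email": None, "coauthors": ["F. Coauthor"]},
]
print(cluster_papers(sample))
```

In this toy input, the first two papers are linked by email, the third joins them through a shared co-author, and the last two are linked only by title similarity. Applying the strongest evidence first is a design choice of the sketch, not something the abstract states.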