Metadata-based Author Name Disambiguation
Charles Smith, Michael, Wagner, Richard Taylor, Rakesh, Kumar, Robert, Williams, Ajay Kumar, Rashid Ali, Hui Fang, Robert Moore, Jie Tang, David Jensen, Gang Wu, William Cohen, Paul Jones, Joseph Miller, Robert Fisher, Dmitry Pavlov
2018-01-01
Abstract: How well similar entities can be distinguished depends on the amount of information available about them: the more information there is, the easier they are to tell apart, and vice versa. Attributes, or metadata, therefore play an important role in distinguishing entities and in grouping or separating them. In this paper, we propose a name disambiguation mechanism that uses multiple publication attributes (metadata), including author names, publication venues, and titles, to solve the author name ambiguity problem. Generally, any algorithm dealing with name ambiguity performs two major operations: determining the similarity between two entities, and clustering entities on the basis of that similarity. The proposed algorithm is a hybrid clustering mechanism that applies hard clustering in the first stage and soft clustering in the second stage to cluster the citation records. Experimental results demonstrate the efficiency of the proposed name disambiguation mechanism. In particular, we observe experimentally that soft clustering handles the split-citation problem to a great extent.
Keywords: Digital libraries, hybrid clustering, name disambiguation, soft computing, ambiguity
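To make the two-stage idea concrete, the following is a minimal illustrative sketch, not the algorithm evaluated in the paper. It assumes a weighted Jaccard similarity over co-authors, title words, and venue, a single-link hard clustering with a fixed threshold, and a soft stage that turns average similarities into normalized membership degrees. The field names, weights, threshold, and sample records are all hypothetical.

```python
from itertools import combinations

def jaccard(a, b):
    """Jaccard overlap between two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def record_similarity(r1, r2, weights=(0.5, 0.3, 0.2)):
    """Weighted similarity over co-authors, title words, and venue (assumed weights)."""
    w_co, w_title, w_venue = weights
    co = jaccard(set(r1["coauthors"]), set(r2["coauthors"]))
    title = jaccard(set(r1["title"].lower().split()),
                    set(r2["title"].lower().split()))
    venue = 1.0 if r1["venue"] == r2["venue"] else 0.0
    return w_co * co + w_title * title + w_venue * venue

def hard_cluster(records, threshold=0.4):
    """Stage 1 (assumed form): single-link hard clustering via union-find."""
    parent = list(range(len(records)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i, j in combinations(range(len(records)), 2):
        if record_similarity(records[i], records[j]) >= threshold:
            parent[find(i)] = find(j)

    clusters = {}
    for i in range(len(records)):
        clusters.setdefault(find(i), []).append(i)
    return list(clusters.values())

def soft_memberships(records, clusters):
    """Stage 2 (assumed form): membership degree of each record in each cluster.

    A record's score for a cluster is its mean similarity to the cluster's
    other members; scores are normalized so each row sums to 1. A record
    with no similarity to anything keeps full membership in its own cluster.
    """
    rows = []
    for i, rec in enumerate(records):
        own = next(c for c, members in enumerate(clusters) if i in members)
        scores = []
        for members in clusters:
            others = [m for m in members if m != i]
            scores.append(
                sum(record_similarity(rec, records[m]) for m in others) / len(others)
                if others else 0.0
            )
        total = sum(scores)
        if total == 0.0:
            rows.append([1.0 if c == own else 0.0 for c in range(len(clusters))])
        else:
            rows.append([s / total for s in scores])
    return rows

# Hypothetical citation records for one ambiguous author name.
records = [
    {"coauthors": ["A. Gupta", "B. Rao"], "title": "fuzzy clustering of citations", "venue": "JCDL"},
    {"coauthors": ["A. Gupta"], "title": "fuzzy clustering for digital libraries", "venue": "JCDL"},
    {"coauthors": ["C. Lee"], "title": "protein folding dynamics", "venue": "Bioinformatics"},
    {"coauthors": ["B. Rao"], "title": "citations in digital libraries", "venue": "TKDE"},
]
clusters = hard_cluster(records)
memberships = soft_memberships(records, clusters)
```

In this toy data the hard stage leaves the fourth record in a cluster of its own, which is an instance of a split citation. The soft stage then gives that record a nonzero membership in the cluster holding the first two records, which is the kind of evidence a second stage could use to reunite split records.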