Abstract: In many databases, such as scientific bibliography databases, the name attribute is the most commonly chosen identifier for entities. However, names are often ambiguous and not always unique, which causes problems in many fields. Name disambiguation is a non-trivial data management task that aims to distinguish different entities sharing the same name. It is particularly difficult for large databases such as digital libraries, where only limited information is available to identify authors' names. In digital libraries, ambiguous author names arise when multiple authors share the same name or when the same person appears under different name variations. Most previous work on this problem employs hierarchical clustering approaches based on information inside the citation records, e.g. co-authors and publication titles. In this paper, we propose a robust hybrid name disambiguation framework that is not only applicable to digital libraries but can also be easily extended to other applications based on different data sources. We propose a web page genre identification component to identify the genre of a web page, e.g. whether the page is a personal homepage. In addition, we propose a re-clustering model based on multidimensional scaling that can further improve the performance of name disambiguation. We evaluated our approach on known corpora, and the favorable experimental results indicate that the proposed framework is feasible.

Two birds with one stone: a graph-based framework for disambiguating and tagging people names in web search.

GRAPE: A Graph-Based Framework for Disambiguating People Appearances in Web Search

GRAPE: a system for disambiguating and tagging people names in web search.

A Unified Framework for Name Disambiguation

On Graph-Based Name Disambiguation.

ADANA: Active Name Disambiguation

Robust hybrid name disambiguation framework for large databases

A Constraint-Based Probabilistic Framework for Name Disambiguation

Bootstrapped Grouping of Results to Ambiguous Person Name Queries

Name Disambiguation Using Web Connection

Learning to Name Faces

A Unified Probabilistic Framework for Name Disambiguation in Digital Library

Multi-document Chinese Name Disambiguation Based on Latent Semantic Analysis

Author Name Disambiguation Based on Heterogeneous Graph

Person Resolution in Person Search Results: WebHawk.

Name disambiguation using many-to-one features

Name Disambiguation By Collective Classification

Chinese multi-document personal name disambiguation

Bridging the Semantic Gap Between Image Contents and Tags

Name Disambiguation Using Atomic Clusters.

Social Network Based Cross-Document Personal Name Disambiguation