Toward Detection of Aliases Without String Similarity

Ning An, Lili Jiang, Jianyong Wang, Ping Luo, Min Wang, Bing Nan Li
DOI: https://doi.org/10.1016/j.ins.2013.11.010
IF: 8.1
2014-01-01
Information Sciences
Abstract: Entity aliases are common, and detecting them accurately plays a vital role in various applications. In particular, it is critical to detect aliases that are intentionally used to hide real identities, such as those of terrorists and fraudsters. Most existing work does not pay close attention to aliases that have little or no string similarity to the given entities. In this paper, we propose a classifier based on active learning for detecting this type of alias. To minimize the cost of pair-wise comparison, a subset-based method is designed to restrict candidate selection to entity subsets. An active learning classifier is then employed in each entity subset to estimate the probability that a candidate is an alias of a given entity within the subset. After the results from all of the classifiers are integrated, a list of aliases is returned for each given entity. For evaluation, we implemented four state-of-the-art methods and compared them with our proposed approach on three datasets. The results clearly demonstrate that this new active learning classifier is superior to those existing methods.
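The pipeline outlined in the abstract (restrict comparison to entity subsets, run an active learner in each subset, then integrate the per-subset results) can be illustrated with a minimal Python sketch. Nearly everything specific here is an assumption made for illustration and is not taken from the paper: the logistic-regression model, uncertainty sampling as the query strategy, averaging as the integration step, and the hypothetical names `active_learn`, `detect_aliases`, `features`, and `oracle`. The feature vectors are assumed to be non-string signals of some kind, since the target aliases share little or no string similarity with the entity.

```python
import math
from collections import defaultdict

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict_proba(w, x):
    # w holds one weight per feature plus a trailing bias term.
    return sigmoid(sum(wj * v for wj, v in zip(w, x)) + w[-1])

def train_logreg(X, y, epochs=500, lr=1.0):
    # Tiny logistic regression fitted by batch gradient descent
    # (an assumed stand-in for whatever model the paper uses).
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            err = predict_proba(w, xi) - yi
            for j, v in enumerate(xi):
                grad[j] += err * v
            grad[-1] += err
        w = [wj - lr * gj / len(X) for wj, gj in zip(w, grad)]
    return w

def active_learn(pool, oracle, seed_idx, budget):
    # Pool-based active learning with uncertainty sampling:
    # repeatedly ask the oracle to label the most uncertain candidate.
    labeled = {i: oracle(i) for i in seed_idx}
    for _ in range(budget):
        unlabeled = [i for i in range(len(pool)) if i not in labeled]
        if not unlabeled:
            break
        idx = sorted(labeled)
        w = train_logreg([pool[i] for i in idx], [labeled[i] for i in idx])
        # Query the candidate whose predicted probability is closest to 0.5.
        q = min(unlabeled, key=lambda i: abs(predict_proba(w, pool[i]) - 0.5))
        labeled[q] = oracle(q)
    idx = sorted(labeled)
    return train_logreg([pool[i] for i in idx], [labeled[i] for i in idx])

def detect_aliases(entity, subsets, features, oracle, threshold=0.5, budget=1):
    # subsets: candidate lists (each with at least two entries) produced by
    # some subset-selection step that is not modeled here.
    scores = defaultdict(list)
    for subset in subsets:
        pool = [features(entity, c) for c in subset]
        w = active_learn(pool, lambda i: oracle(entity, subset[i]),
                         seed_idx=[0, 1], budget=budget)
        for c, x in zip(subset, pool):
            scores[c].append(predict_proba(w, x))
    # Integration step (assumed): average each candidate's per-subset scores.
    merged = {c: sum(p) / len(p) for c, p in scores.items()}
    return sorted((c for c in merged if merged[c] >= threshold),
                  key=lambda c: -merged[c])
```

In this sketch the `oracle` callback stands in for a human annotator who labels the few candidate pairs the learner asks about, which is where the labeling cost of active learning is incurred. Restricting each learner to one subset keeps the number of pair-wise comparisons proportional to the subset sizes rather than to the full entity set.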