Entity Resolution On Cloud

Hongzhi Wang
DOI: https://doi.org/10.4018/978-1-4666-5198-2.ch010
2014-01-01
Abstract: Cloud computing requires reading and analyzing large quantities of records, and the many records that refer to the same entity pose challenges for data processing and analysis. Entity resolution has become one of the hot issues in database research. Clustering based on record similarity is one of the most commonly used methods, but existing methods of computing record similarity are often time-consuming and are not suitable for cloud computing. This chapter shows that it is necessary to use wave of strings to compute record similarity in cloud computing and provides an entity resolution method based on wave of strings. Theoretical analysis and experimental results show that the proposed method is correct and effective.