A multi-source heterogeneous spatial big data fusion method based on multiple similarity and voting decision

Zeqiu Chen, Jianghui Zhou, Ruizhi Sun
DOI: https://doi.org/10.1007/s00500-022-07734-0
IF: 3.732
2022-12-24
Soft Computing
Abstract: Data fusion is an efficient way to achieve improved accuracy and more specific inferences by fusing and aggregating data from different sensors. However, because spatial data is increasingly complex, massive, and multi-source heterogeneous, existing methods do not fully satisfy the requirements for data integrity and fusion accuracy in some specific situations. By considering the geographical properties of spatial data, this paper proposes a multi-source heterogeneous spatial big data fusion method based on multiple similarity and voting decision (SDFSV). The method develops a three-step record linking algorithm to improve the quality of entity recognition for the incremental fusion of massive data. A one-time voting algorithm is then introduced so that data conflicts can be significantly reduced and the accuracy of data fusion improved. A relation deduction method based on rules and entity recognition is also presented to enhance data integrity. In addition, to promote the traceability and interpretability of fusion results, it is necessary to construct a data traceability mechanism. Experimental results on data of Beijing medical institutions collected from 10 data sources show that SDFSV achieves improved performance.
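The abstract names two ideas, record linking by multiple similarity and conflict resolution by voting, without giving their details. The sketch below is only an illustration of the general pattern and is not the authors' SDFSV algorithm. The function names, the choice of a string-ratio name similarity combined with a great-circle distance check, the thresholds, and the sample records are all assumptions made for this example.

```python
import math
from collections import Counter
from difflib import SequenceMatcher


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two latitude/longitude points."""
    r = 6371000.0  # mean Earth radius in metres
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def same_entity(a, b, name_threshold=0.8, max_distance_m=200.0):
    """Link two records only if BOTH similarities agree (assumed rule).

    name_threshold and max_distance_m are hypothetical values, not taken
    from the paper.
    """
    name_sim = SequenceMatcher(None, a["name"], b["name"]).ratio()
    dist = haversine_m(a["lat"], a["lon"], b["lat"], b["lon"])
    return name_sim >= name_threshold and dist <= max_distance_m


def vote(records, field):
    """Resolve a conflicting attribute by simple majority vote.

    Returns None when no record carries the field. Ties go to the value
    seen first.
    """
    values = [r[field] for r in records if r.get(field) is not None]
    if not values:
        return None
    return Counter(values).most_common(1)[0][0]


# Made-up records standing in for entries from different sources.
rec_a = {"name": "Beijing Chaoyang Hospital", "lat": 39.9250, "lon": 116.4530, "phone": "010-1111"}
rec_b = {"name": "Beijing Chaoyang Hospital (East)", "lat": 39.9251, "lon": 116.4531, "phone": "010-1111"}
rec_c = {"name": "Beijing Chaoyang Hospital", "lat": 39.9250, "lon": 116.4530, "phone": "010-9999"}
rec_d = {"name": "Peking Union Medical College Hospital", "lat": 39.9130, "lon": 116.4160, "phone": "010-2222"}

linked = [r for r in (rec_a, rec_b, rec_c, rec_d) if same_entity(rec_a, r)]
fused_phone = vote(linked, "phone")
```

Requiring both the name and the location check to pass is one simple way to use the geographical properties of the data: two institutions with near-identical names but distant coordinates stay separate. A weighted score over several similarities would be an equally plausible design.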
computer science, artificial intelligence, interdisciplinary applications
What problem does this paper attempt to address?