An ontology based semantic heterogeneity measurement framework for optimization in distributed data mining

Bin Liu, Shu-Gui Cao, Dong-Fang Cao, Qing-Chun Li, Hai-Tao Liu, Shao-Nan Shi
DOI: https://doi.org/10.1109/ICMLC.2012.6358897
2012-01-01
Abstract: In distributed data mining (DDM) systems, semantic heterogeneity between data sources has received little attention, even though it can degrade the quality of the final result. This paper presents a semantic distance measurement framework that extracts the essential semantic heterogeneity between data sources. The framework uses an ontology-matching-based, multi-strategy voting method to synthesize the semantic distances between two data source ontologies at the element level and the structure level. Its output can serve as the basis for grouping data sources to optimize the DDM result. Finally, the framework is integrated into a DDM architecture we have proposed.
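The abstract does not name the individual matching strategies, the voting rule, or the grouping procedure, so the following is only a minimal sketch of the general idea under stated assumptions. It assumes each data source ontology is a non-empty toy mapping from a concept label to its parent label, uses string similarity on labels as a stand-in element-level strategy, uses Jaccard distance over child-parent edges as a stand-in structure-level strategy, combines them with a weighted average as a simple form of voting, and groups sources whose combined distance falls below a threshold. All function names, weights, and the threshold are hypothetical and are not taken from the paper.

```python
from difflib import SequenceMatcher
from itertools import combinations


def label_similarity(a, b):
    """String similarity of two concept labels, in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def element_distance(o1, o2):
    """Assumed element-level strategy: 1 minus the symmetric average
    of each concept's best label match in the other ontology."""
    def one_way(src, dst):
        return sum(max(label_similarity(c, d) for d in dst) for c in src) / len(src)
    return 1.0 - 0.5 * (one_way(o1, o2) + one_way(o2, o1))


def structure_distance(o1, o2):
    """Assumed structure-level strategy: Jaccard distance between
    the sets of (child, parent) edges of the two ontologies."""
    e1 = {(c.lower(), p.lower()) for c, p in o1.items() if p}
    e2 = {(c.lower(), p.lower()) for c, p in o2.items() if p}
    union = e1 | e2
    if not union:
        return 0.0
    return 1.0 - len(e1 & e2) / len(union)


def voted_distance(o1, o2, strategies, weights):
    """Weighted average of the strategy distances (a simple voting rule)."""
    return sum(w * s(o1, o2) for s, w in zip(strategies, weights)) / sum(weights)


def group_sources(sources, threshold, strategies, weights):
    """Put two sources in the same group when their voted distance is at
    most `threshold`; groups are merged transitively with union-find."""
    parent = {name: name for name in sources}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for a, b in combinations(sources, 2):
        if voted_distance(sources[a], sources[b], strategies, weights) <= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for name in sources:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(g) for g in groups.values())
```

Under these assumptions, calling `group_sources` with `[element_distance, structure_distance]` and equal weights places sources with identical ontologies in one group and keeps a source with an unrelated concept hierarchy apart. A real framework would use richer ontology matchers and a voting scheme chosen for the data at hand.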