An Integrated Heterogeneous Web Service Retrieval Via Combination of Instance- and Metadata-Based Schema Matching Method.

Jie He, Wei Wang
DOI: https://doi.org/10.1080/10095020.2015.1065075
IF: 4.278
2015-01-01
Geo-spatial Information Science
Abstract: Schema matching is a critical step in the integration of heterogeneous web services, which comprise various types of web services and multiple versions of services of the same type. Mapping loss or mismatch usually occurs because of schema differences in structure and content and variety in how concepts are defined and organized. Current instance-based schema matching methods are not mature enough for heterogeneous web services because they cannot handle instance data in the web service domain or capture all the semantics, especially metadata semantics. When employed individually, metadata-based and instance-based matching methods are not efficient at determining the concept relationships that are crucial for finding high-quality matches between schema attributes. In this paper, we propose an improved schema matching method based on a combination of instance and metadata (CIM) matcher. The main idea of our approach is to utilize schema structure, element labels, and the corresponding instance data. The matching process is divided into two phases. In the first phase, metadata-based matchers compute the element label similarity of multi-version Open Geospatial Consortium (OGC) web service schemas; the resulting matches are raw mappings, which are reused in the subsequent instance matching phase. In the second phase, the designed instance matching algorithms are applied to the instance data of the raw mappings to generate fine mappings. Finally, the raw mappings and the fine mappings are combined to obtain the final mappings. Our experiments are executed on different versions of Web Coverage Service (WCS) and Web Feature Service (WFS) instance data deployed in GeoServer. The results indicate that the CIM method obtains more accurate matching results and is flexible enough to handle web service instance data.
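
The two-phase flow described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the matchers, so the sketch assumes a generic string similarity (Python's difflib) for the metadata phase, a Jaccard overlap of values for the instance phase, and a weighted average for the combination step. The function names, thresholds, weight, and sample schemas are all hypothetical.

```python
from difflib import SequenceMatcher


def label_similarity(a: str, b: str) -> float:
    """Phase 1 (metadata): case-insensitive string similarity of two labels.
    Assumption: a stand-in for the paper's metadata-based matchers."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def instance_similarity(xs, ys) -> float:
    """Phase 2 (instance): Jaccard overlap of observed instance values.
    Assumption: a stand-in for the paper's instance matching algorithms."""
    sx, sy = set(xs), set(ys)
    if not sx and not sy:
        return 0.0
    return len(sx & sy) / len(sx | sy)


def cim_match(source, target, raw_threshold=0.5, weight=0.5, final_threshold=0.6):
    """Match two schemas given as {element label: [instance values]}.

    All thresholds and the weight are hypothetical tuning parameters.
    Returns {(source label, target label): combined score}.
    """
    # Phase 1: raw mappings from label similarity alone.
    raw = {}
    for s in source:
        for t in target:
            score = label_similarity(s, t)
            if score >= raw_threshold:
                raw[(s, t)] = score

    # Phase 2: fine mappings, computed only for the raw candidates.
    fine = {
        (s, t): instance_similarity(source[s], target[t])
        for (s, t) in raw
    }

    # Combination: weighted average of raw and fine scores.
    final = {}
    for pair, raw_score in raw.items():
        combined = weight * raw_score + (1 - weight) * fine[pair]
        if combined >= final_threshold:
            final[pair] = combined
    return final


if __name__ == "__main__":
    # Hypothetical element labels and values from two service versions.
    old_version = {"name": ["Road A", "Road B"], "boundedBy": ["0 0 10 10"]}
    new_version = {"Name": ["Road A", "Road B"], "BoundingBox": ["5 5 20 20"]}
    for pair, score in cim_match(old_version, new_version).items():
        print(pair, round(score, 2))
```

In this sketch, a pair whose labels look alike but whose instance values never overlap is filtered out at the combination step, which is the kind of mismatch the instance phase is meant to catch.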