Dartgrid: a Semantic Web Toolkit for Integrating Heterogeneous Relational Databases
Zhaohui Wu, Huajun Chen, Heng Wang, Yimin Wang, Yuxin Mao, Jinmin Tang, Cunyin Zhou
2006-01-01
Abstract: Since most of the data in large organizations is stored in relational databases, the semantic web can only be truly useful and successful if methods and tools are available to support the integration of heterogeneous relational databases using semantic web technologies. Dartgrid is an application development framework, together with a set of practical semantic tools, that facilitates such integration. It helps developers (i) to interconnect distributed legacy databases using richer semantics, (ii) to provide ontology-based query, search, and navigation services as if over one huge distributed database, and (iii) to add deductive capabilities on top to increase the usability and reusability of data. A set of practical semantic web tools has been developed. For example, DartMapping is a visual mapping tool that helps DBAs define semantic mappings from heterogeneous relational schemas to RDF/OWL ontologies. DartQuery is an ontology-based query interface that lets users specify semantic queries and rewrites SPARQL queries into a set of SQL queries. DartSearch is an ontology-based search engine that lets users run full-text searches over all databases and navigate the search results semantically. It is also enriched with a concept ranking mechanism that helps users find more accurate and reliable results. We have developed and deployed such a semantic web application for the China Academy of Traditional Chinese Medicine (CATCM). It semantically interconnects over 70 legacy TCM databases through a formal TCM ontology with over 70 classes and 800 properties. In this application, the TCM ontology acts as a separate semantic layer that bridges the gaps among legacy databases with heterogeneous structures.
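As a rough illustration of the idea behind mapping-driven query rewriting, the sketch below turns a request for some properties of an ontology class into one SQL query per source database. It is not Dartgrid's actual algorithm or API: the `TableMapping` structure, the `rewrite` function, the `tcm:` terms, and all database, table, and column names are hypothetical, and a real rewriter would also handle joins, filters, and full SPARQL syntax.

```python
from dataclasses import dataclass


@dataclass
class TableMapping:
    """Hypothetical mapping from one ontology class to one relational table."""
    database: str   # name of the legacy database (assumed)
    table: str      # table holding instances of the class (assumed)
    columns: dict   # ontology property -> column name (assumed)


# Hypothetical mappings: one ontology class backed by tables in two databases.
MAPPINGS = {
    "tcm:Herb": [
        TableMapping("db_a", "herbs",
                     {"tcm:name": "herb_name", "tcm:treats": "disease"}),
        TableMapping("db_b", "materia_medica",
                     {"tcm:name": "title"}),
    ],
}


def local_name(term):
    """Strip the namespace prefix, e.g. 'tcm:name' -> 'name'."""
    return term.split(":")[-1]


def rewrite(ontology_class, properties, mappings=MAPPINGS):
    """Return (database, sql) pairs selecting the given properties of a class."""
    queries = []
    for m in mappings.get(ontology_class, []):
        # Skip sources that cannot supply every requested property.
        if not all(p in m.columns for p in properties):
            continue
        select = ", ".join(
            f"{m.columns[p]} AS {local_name(p)}" for p in properties
        )
        queries.append((m.database, f"SELECT {select} FROM {m.table}"))
    return queries


if __name__ == "__main__":
    for database, sql in rewrite("tcm:Herb", ["tcm:name"]):
        print(database, "->", sql)
```

Because every generated query aliases its columns to the ontology's property names, results from differently structured tables can be merged under one shared vocabulary, which is the role the semantic layer plays in this sketch.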
Users and machines only need to interact with the semantic layer, and the semantic interconnections allow them to start in one database, and then move around an extendable set of databases. The semantic layer also enables the system to answer semantic queries across