Semantic Similarity Based Ontology Cache

Bangyong Liang, Jie Tang, Juanzi Li, Kehong Wang
DOI: https://doi.org/10.1007/11610113_23
2006-01-01
Abstract: This paper addresses the issue of ontology caching on the Semantic Web. The Semantic Web is an extension of the current web in which information is given well-defined meaning, better enabling computers and people to work in cooperation. Ontologies serve as the metadata that define information on the Semantic Web. Ontology-based semantic information retrieval (semantic retrieval) is becoming increasingly important, and much research and industrial work has been devoted to it. Ontology-based retrieval improves the performance of search engines and web mining. In semantic retrieval, however, the large number of accesses to ontologies usually makes ontology servers very inefficient. To address this problem, concepts and instances need to be cached while the ontology server is running. Existing caching methods from the database community can be applied to ontology caching, but they are not sufficient. In database caching, the most frequently accessed data are usually cached, and the data in the cache that have recently been accessed less frequently are removed. In an ontology base, by contrast, data are organized as objects and relations between objects. A user may request one object and then request another object by following one of its relations, or may request a similar object that has no relation to the first. Ontology caching must therefore consider more factors and is more difficult. In this paper, ontology caching is formalized as a classification problem, which makes it independent of any specific Semantic Web application. An approach based on machine learning methods is proposed. When an object (e.g., a concept or an instance) is requested, its similar objects are viewed as candidates. A classification model is then used to predict whether each candidate should be cached. Features for the classification models are defined. Experimental results indicate that the proposed methods can significantly outperform the baseline methods for ontology caching. The proposed method has been applied to a research project called SWARMS.
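
The candidate-then-classify idea described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not the paper's actual method: the `OntologyObject` and `OntologyCache` names, the Jaccard overlap used as the similarity measure, the three features, and the hand-picked weights that stand in for a trained classification model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OntologyObject:
    """A concept or instance with a property set and directly related objects."""
    name: str
    properties: frozenset
    related: frozenset = frozenset()


def jaccard(a, b):
    """Jaccard overlap of two property sets (an assumed similarity measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def features(requested, candidate, access_counts):
    """Hypothetical features: similarity, direct relation, past access count."""
    return (
        jaccard(requested.properties, candidate.properties),
        1.0 if candidate.name in requested.related else 0.0,
        float(access_counts.get(candidate.name, 0)),
    )


def should_cache(feature_vector, weights=(2.0, 1.0, 0.2), bias=-1.5):
    """Stand-in linear classifier; real weights would be learned from access logs."""
    score = bias + sum(w * x for w, x in zip(weights, feature_vector))
    return score > 0.0


class OntologyCache:
    def __init__(self, ontology):
        self.ontology = ontology      # maps name -> OntologyObject
        self.cached = set()           # names currently held in the cache
        self.access_counts = {}       # name -> number of requests so far

    def request(self, name):
        obj = self.ontology[name]
        self.access_counts[name] = self.access_counts.get(name, 0) + 1
        self.cached.add(name)
        for cand in self.ontology.values():
            if cand.name == name:
                continue
            # Candidates: objects that are similar or directly related.
            similar = jaccard(obj.properties, cand.properties) > 0.0
            if not (similar or cand.name in obj.related):
                continue
            # The classifier decides whether each candidate is cached.
            if should_cache(features(obj, cand, self.access_counts)):
                self.cached.add(cand.name)
        return obj


# Toy ontology, invented for this example.
ontology = {
    o.name: o
    for o in [
        OntologyObject("Professor", frozenset({"name", "email", "teaches"}),
                       frozenset({"Course"})),
        OntologyObject("Lecturer", frozenset({"name", "email", "teaches"})),
        OntologyObject("Course", frozenset({"title", "credits"})),
        OntologyObject("Building", frozenset({"address"})),
    ]
}
cache = OntologyCache(ontology)
cache.request("Professor")
```

With these illustrative weights, requesting "Professor" also caches "Lecturer", which is similar but unrelated, while "Course" is considered as a candidate and rejected by the classifier. A trained model would replace `should_cache`, and cache eviction is left out of the sketch.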