Domain Specific Semantic Validation of Schema.org Annotations

Umutcan Şimşek, Elias Kärle, Omar Holzknecht, Dieter Fensel
DOI: https://doi.org/10.48550/arXiv.1706.06384
2017-09-15
Abstract: Since its unveiling in 2011, schema.org has become the de facto standard for publishing semantically described structured data on the web, typically in the form of web page annotations. The increasing adoption of schema.org facilitates the growth of the web of data, as well as the development of automated agents that operate on this data. Schema.org is a large heterogeneous vocabulary that covers many domains. This is obviously not a bug, but a feature, since schema.org aims to describe almost everything on the web, and the web is huge. However, the heterogeneity of schema.org may cause a side effect, which is the challenge of picking the right classes and properties for an annotation in a certain domain, as well as keeping the annotation semantically consistent. In this work, we introduce our rule-based approach and an implementation of it for validating schema.org annotations from two aspects: (a) the completeness of the annotations in terms of a specified domain, (b) the semantic consistency of the values based on pre-defined rules. We demonstrate our approach in the tourism domain.
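The two validation aspects named in the abstract can be illustrated with a minimal sketch. It is not the authors' rule language or implementation: the `REQUIRED` domain specification, the rule function, and the sample annotation are all hypothetical, chosen only to show what a completeness check and a semantic consistency check might look like for a tourism-style annotation.

```python
from datetime import date

# Hypothetical domain specification (an assumption, not taken from the paper):
# the properties an annotation of each schema.org type must carry.
REQUIRED = {
    "Hotel": ["name", "address", "telephone"],
    "Event": ["name", "startDate", "location"],
}


def check_completeness(annotation):
    """Aspect (a): report required properties that are absent."""
    required = REQUIRED.get(annotation.get("@type"), [])
    return [f"missing property: {p}" for p in required if p not in annotation]


def rule_end_not_before_start(annotation):
    """Aspect (b): an illustrative consistency rule over two property values."""
    start, end = annotation.get("startDate"), annotation.get("endDate")
    if start and end and date.fromisoformat(end) < date.fromisoformat(start):
        return "endDate is before startDate"
    return None


# Hypothetical mapping from type to the consistency rules that apply to it.
CONSISTENCY_RULES = {"Event": [rule_end_not_before_start]}


def validate(annotation):
    """Run the completeness check, then every consistency rule for the type."""
    errors = check_completeness(annotation)
    for rule in CONSISTENCY_RULES.get(annotation.get("@type"), []):
        message = rule(annotation)
        if message:
            errors.append(message)
    return errors


# A made-up annotation that fails both aspects.
event = {
    "@context": "http://schema.org",
    "@type": "Event",
    "name": "Alpine Hiking Day",
    "startDate": "2017-09-15",
    "endDate": "2017-09-14",
}
errors = validate(event)
print(errors)
```

Keeping the domain specification and the rules as plain data separate from `validate` means a new domain can be covered by adding entries, without touching the checking logic.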
Information Retrieval