Ontology-Based Data Quality Management for Data Streams

Sandra Geisler,Christoph Quix,Sven Weber,Matthias Jarke
DOI: https://doi.org/10.1145/2968332
2016-10-13
Journal of Data and Information Quality
Abstract:Data Stream Management Systems (DSMS) provide real-time data processing in an effective way, but there is always a tradeoff between data quality (DQ) and performance. We propose an ontology-based data quality framework for relational DSMS that includes DQ measurement and monitoring in a transparent, modular, and flexible way. We follow a threefold approach that takes the characteristics of relational data stream management for DQ metrics into account. While (1) Query Metrics respect changes in data quality due to query operations, (2) Content Metrics allow the semantic evaluation of data in the streams. Finally, (3) Application Metrics allow easy user-defined computation of data quality values to account for application specifics. Additionally, a quality monitor allows us to observe data quality values and take counteractions to balance data quality and performance. The framework has been designed along a DQ management methodology suited for data streams. It has been evaluated in the domains of transportation systems and health monitoring.
What problem does this paper attempt to address?