Data Inconsistency Evaluation for Cyberphysical System

Hao Wang, Jianzhong Li, Hong Gao
DOI: https://doi.org/10.1177/155014779496878
IF: 1.938
2016-01-01
International Journal of Distributed Sensor Networks
Abstract: Cyberphysical systems (CPSs) are widely used to collect data in a variety of applications, but the collected data is often dirty in practice. We focus on evaluating data inconsistency, which is a major concern when assessing the quality of data and of its source. This paper is the first study of the data inconsistency evaluation problem for CPSs based on conditional functional dependencies (CFDs). Given a database instance D of n tuples and a set Σ of r CFDs, data inconsistency is defined as the ratio of the size of a minimum culprit in D to n, where a culprit is a set of tuples leading to integrity errors. First, we analyze the complexity and inapproximability of the minimum culprit problem. Then, we provide a practical algorithm, based on the independent residual subgraph, that gives a 2-approximation of the data dirtiness in O(rn log n) time. To handle large dynamic data, we provide a compact B-tree-based structure for storing the independent residual subgraph so that inconsistency can be updated efficiently. Finally, we test our algorithm on both synthetic and real-life datasets; the experimental results show the scalability of our algorithm and the quality of the evaluation result.
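To make the notions of culprit and inconsistency ratio concrete, here is a minimal Python sketch. It is not the paper's independent-residual-subgraph algorithm: it swaps in the classic maximal-matching 2-approximation of vertex cover on a pairwise conflict graph, and it builds that graph in quadratic time rather than O(rn log n). The CFD encoding `(lhs, rhs, pattern)`, the function names, and the sample rows are all hypothetical. The sketch assumes that only pairwise violations occur and that inconsistency is measured relative to the number of tuples n.

```python
from itertools import combinations

def conflicts(rows, cfds):
    """Return index pairs (i, j) of tuples that jointly violate some CFD.

    Each CFD is a hypothetical triple (lhs, rhs, pattern): within the tuples
    matching the constants in `pattern`, the attributes `lhs` determine `rhs`.
    """
    edges = set()
    for lhs, rhs, pattern in cfds:
        # Only tuples matching the pattern constants are in the CFD's scope.
        scope = [i for i, t in enumerate(rows)
                 if all(t[a] == v for a, v in pattern.items())]
        for i, j in combinations(scope, 2):
            same_lhs = all(rows[i][a] == rows[j][a] for a in lhs)
            if same_lhs and rows[i][rhs] != rows[j][rhs]:
                edges.add((i, j))
    return edges

def approx_culprit(edges):
    """Maximal-matching 2-approximation of a minimum vertex cover."""
    cover = set()
    for i, j in sorted(edges):
        # Take both endpoints of every edge that is not yet covered.
        if i not in cover and j not in cover:
            cover.update((i, j))
    return cover

def inconsistency(rows, cfds):
    """Approximate inconsistency: |approximate culprit| / n."""
    return len(approx_culprit(conflicts(rows, cfds))) / len(rows)

# Hypothetical sensor-style records: for UK tuples, (country, zip) -> city.
rows = [
    {"country": "UK", "zip": "EH4", "city": "Edinburgh"},
    {"country": "UK", "zip": "EH4", "city": "London"},
    {"country": "UK", "zip": "W1", "city": "London"},
    {"country": "US", "zip": "EH4", "city": "Boston"},
]
cfds = [(("country", "zip"), "city", {"country": "UK"})]
print(inconsistency(rows, cfds))
```

In this toy instance a minimum culprit has size 1 (drop either of the first two tuples), while the matching heuristic selects both, which stays within the factor-2 bound.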