PSpec: a formal specification language for fine-grained control on distributed data analytics.

Chen Luo, Fei He, Dong Yan, Dan Zhang, Xin Zhou, Bow-Yaw Wang
DOI: https://doi.org/10.1109/ICSE-C.2017.120
2017-01-01
Abstract: Organizations often share business data with third parties to perform data analytics. However, this business data may contain a large amount of customers' private information. A major concern for these organizations is thus to ensure that such private information is used properly. In this paper, we present PSpec, a formal language for specifying data usage restrictions in distributed data analytics. Compared with previous work, PSpec specializes in data analytics and provides explicit support for data desensitization and association to balance data privacy and utility. We also present redundancy and conflict analysis algorithms to help data owners write PSpec privacy policies. To evaluate PSpec, we carry out a case study on the TPC-DS benchmark. The results demonstrate the applicability and practicality of the PSpec language.