Towards Privacy Engineering for Real-Time Analytics in the Human-Centered Internet of Things

Thomas Plagemann, Vera Goebel, Matthias Hollick, Boris Koldehofe
DOI: https://doi.org/10.48550/arXiv.2210.16352
2022-10-29
Abstract: Big data applications offer smart solutions to many urgent societal challenges, such as health care, traffic coordination, and energy management. The basic premise for these applications is "the more data the better". The focus often lies on sensing infrastructures in the public realm that produce an ever-increasing amount of data. Yet any smartphone or smartwatch owner could be a continuous source of valuable data and contribute to many useful big data applications. However, such data can reveal a lot of sensitive information, such as the current location or heart rate of the device owner. Protection of personal data is important in our society and is manifested, for example, in the EU General Data Protection Regulation (GDPR). However, privacy protection and useful big data applications are hard to bring together, particularly in the human-centered IoT. Implementing proper privacy protection requires skills that are typically not the focus of data analysts and big data developers. Thus, many individuals tend to share none of their data if they doubt that it will be properly protected. Excellent privacy solutions exist between the two extremes of the "all or nothing" approach. For example, instead of continuously publishing the current location of individuals, one might aggregate this data and publish only how many individuals are in a certain area of the city. Thus, personal data is not revealed, while useful information for certain applications like traffic coordination is retained. The goal of the Parrot project is to provide tools for real-time data analysis applications that leverage this "middle ground". Data analysts should only be required to specify their data needs, and end-users can select the privacy requirements for their data as well as the applications and end-users they want to share their data with.
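The aggregation example in the abstract can be illustrated with a minimal sketch. It is not the Parrot project's implementation: the names `LocationReading`, `grid_cell`, and `aggregate_counts`, the fixed latitude/longitude grid, and the `min_count` suppression threshold are all assumptions made here for illustration. The sketch maps each reading to a grid cell and publishes only the number of distinct individuals per cell, never the identifiers or coordinates themselves.

```python
import math
from typing import Dict, Iterable, NamedTuple, Set, Tuple


class LocationReading(NamedTuple):
    """Hypothetical record for one location sample from a personal device."""
    user_id: str
    lat: float
    lon: float


Cell = Tuple[int, int]


def grid_cell(lat: float, lon: float, cell_deg: float = 0.01) -> Cell:
    """Map a coordinate to the index of a square grid cell (assumed grid)."""
    return (math.floor(lat / cell_deg), math.floor(lon / cell_deg))


def aggregate_counts(
    readings: Iterable[LocationReading],
    cell_deg: float = 0.01,
    min_count: int = 1,
) -> Dict[Cell, int]:
    """Return the number of distinct individuals per grid cell.

    Cells with fewer than `min_count` individuals are dropped. This
    threshold is an assumption of this sketch, not taken from the paper.
    """
    users_per_cell: Dict[Cell, Set[str]] = {}
    for r in readings:
        cell = grid_cell(r.lat, r.lon, cell_deg)
        users_per_cell.setdefault(cell, set()).add(r.user_id)
    # Only counts leave this function; user IDs and raw coordinates do not.
    return {
        cell: len(users)
        for cell, users in users_per_cell.items()
        if len(users) >= min_count
    }
```

A traffic-coordination application could consume the resulting cell counts directly. Raising `min_count` trades utility for privacy by hiding sparsely populated cells, where a count of one could still single out an individual.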
Distributed, Parallel, and Cluster Computing