Privacy Issues and Data Protection in Big Data: A Case Study Analysis under GDPR

Nils Gruschka, Vasileios Mavroeidis, Kamer Vishi, Meiko Jensen
DOI: https://doi.org/10.48550/arXiv.1811.08531
2018-11-21
Abstract: Big data has become a great asset for many organizations, promising improved operations and new business opportunities. However, big data has increased access to sensitive information that when processed can directly jeopardize the privacy of individuals and violate data protection laws. As a consequence, data controllers and data processors may be imposed tough penalties for non-compliance that can result even to bankruptcy. In this paper, we discuss the current state of the legal regulations and analyze different data protection and privacy-preserving techniques in the context of big data analysis. In addition, we present and analyze two real-life research projects as case studies dealing with sensitive data and actions for complying with the data regulation laws. We show which types of information might become a privacy risk, the employed privacy-preserving techniques in accordance with the legal requirements, and the influence of these techniques on the data processing phase and the research results.
Cryptography and Security
What problem does this paper attempt to address?
This paper addresses how to protect personal privacy and comply with data protection regulations, in particular the General Data Protection Regulation (GDPR), in big data analysis. As big data technology develops, organizations can collect and process large amounts of data that may contain sensitive information. If this data is not handled properly, processing it can directly threaten personal privacy and violate data protection laws. Data controllers and data processors may then face severe penalties for non-compliance, which can even lead to bankruptcy. To address these issues, the paper discusses the current state of the legal regulations and analyzes different data protection and privacy-preserving techniques in the context of big data analysis. Through two real-life research projects, it also shows how projects dealing with sensitive data can meet the requirements of data protection law, and how the privacy-preserving techniques affect the data processing phase and the research results.

Specifically, the paper focuses on the following aspects:

- **Legal Provisions**: It mainly introduces the EU General Data Protection Regulation (GDPR), which has applied since May 2018 and covers organizations in the EU and the European Economic Area, as well as organizations in other regions that process personal data of individuals in the EU.
- **Privacy-Protection Technologies**: It discusses a variety of privacy-preserving techniques, such as anonymization, pseudonymization, k-anonymity, l-diversity, t-closeness, and differential privacy.
- **Case Studies**: Through two real-life research projects, "Oslo Analytics" and "SWAN", it analyzes in detail how to comply with the requirements of GDPR when dealing with sensitive data, and the specific impact of these techniques on data processing and research results.

Overall, the paper aims to explore how to effectively protect personal privacy in big data analysis while ensuring compliance with relevant laws and regulations.
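To make two of the privacy-preserving techniques named above more concrete, the following is a minimal Python sketch of how k-anonymity and (distinct) l-diversity could be measured on a small table. It is an illustration only and is not taken from the paper: the function names, the column names (`age`, `zip`, `diagnosis`), and the toy records are all hypothetical, and the records are assumed to have been generalized already.

```python
from collections import defaultdict


def equivalence_classes(records, quasi_identifiers):
    """Group records that share the same quasi-identifier values."""
    groups = defaultdict(list)
    for record in records:
        key = tuple(record[q] for q in quasi_identifiers)
        groups[key].append(record)
    return groups


def k_anonymity(records, quasi_identifiers):
    """Largest k for which the table is k-anonymous.

    This is the size of the smallest equivalence class.
    """
    groups = equivalence_classes(records, quasi_identifiers)
    return min(len(group) for group in groups.values())


def l_diversity(records, quasi_identifiers, sensitive):
    """Distinct l-diversity: the smallest number of distinct
    sensitive values found in any equivalence class."""
    groups = equivalence_classes(records, quasi_identifiers)
    return min(len({r[sensitive] for r in group}) for group in groups.values())


# Hypothetical toy table; ages and ZIP codes are already generalized.
records = [
    {"age": "20-29", "zip": "03**", "diagnosis": "flu"},
    {"age": "20-29", "zip": "03**", "diagnosis": "asthma"},
    {"age": "30-39", "zip": "05**", "diagnosis": "flu"},
    {"age": "30-39", "zip": "05**", "diagnosis": "flu"},
    {"age": "30-39", "zip": "05**", "diagnosis": "diabetes"},
]
QUASI_IDENTIFIERS = ["age", "zip"]

print("k =", k_anonymity(records, QUASI_IDENTIFIERS))
print("l =", l_diversity(records, QUASI_IDENTIFIERS, "diagnosis"))
```

In this sketch, a higher k means each individual hides among more records with identical quasi-identifiers, while l-diversity additionally checks that the sensitive attribute varies within each such group. Which columns count as quasi-identifiers is a modeling decision that depends on the dataset and on what an attacker is assumed to know.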