How Android Apps Break the Data Minimization Principle: an Empirical Study

Shaokun Zhang, Hanwen Lei, Yuanpeng Wang, Ding Li, Yao Guo, Xiangqun Chen
DOI: https://doi.org/10.1109/ase56229.2023.00141
2024-01-01
Abstract: The Data Minimization Principle is crucial for protecting individual privacy. However, existing Android runtime permissions do not guarantee this principle. Moreover, the lack of an automatic enforcement mechanism leaves it uncertain whether apps strictly comply with this principle. To bridge this gap, we conduct the first systematic empirical study on violations of the Data Minimization Principle and design a new enforcement tool called GUIMind to detect them. GUIMind first uses a reinforcement learning model to explore app activities and monitor access to sensitive APIs that require sensitive permissions, and then leverages an existing tool to detect such violations. We evaluate the performance of GUIMind using 120 real-world Android apps. The results indicate that GUIMind can achieve a detection accuracy of 96.1%, effectively accelerating the empirical study. Our empirical study focuses on the prevalence of violations, the responses of administrators to violations, and the potential factors and characteristics that lead to violations, such as typical violations, app categories, and personal data types. Our study reveals that 83.5% of apps contain at least one privacy violation, with health apps being the most severe. In addition, telephony information is the most commonly leaked personal data type, accounting for 71.1%. Finally, we randomly select 60 non-compliant apps for reporting to the administrator, whose responses confirm the effectiveness of our approach.