PurExt: Automated Extraction of the Purpose-Aware Rule from the Natural Language Privacy Policy in IoT

Lu Yang, Xingshu Chen, Yonggang Luo, Xiao Lan, Li Chen
DOI: https://doi.org/10.1155/2021/5552501
IF: 1.968
2021-01-01
Security and Communication Networks
Abstract: The extensive data collection performed by Internet of Things (IoT) devices can put users at risk of data leakage. Consequently, IoT vendors are legally obliged to provide privacy policies that declare the scope and purpose of their data collection. However, long and complex privacy policies are hard for users to read, and the lack of a machine-readable format makes it difficult to check policy compliance automatically. To solve these problems, we first put forward a purpose-aware rule that formalizes purpose-driven data collection and use statements. We then propose a novel approach to identify such rules in natural language privacy policies. To address the diversity of purpose expressions, we introduce the concepts of explicit and implicit purpose, which allow syntactic and semantic analyses to be used to extract purposes from different kinds of sentences. Finally, a domain adaptation method is applied to the semantic role labeling (SRL) model to improve the efficiency of purpose extraction. Experiments conducted on a manually annotated dataset demonstrate that this approach can extract purpose-aware rules from privacy policies with a high recall of 91%. With the adapted model, the F1-score of implicit purpose extraction improves significantly, by 11%.
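To make the idea of a purpose-aware rule concrete, the following is a minimal sketch, not the authors' implementation. The abstract does not define the rule's fields or the extraction patterns, so the `PurposeAwareRule` fields (`action`, `data`, `purpose`), the purpose markers, and the function name `extract_explicit_rule` are all assumptions made for illustration. The sketch covers only a toy version of the explicit case, where the purpose is signaled by a surface marker such as "in order to"; implicit purposes, which the abstract handles through semantic analysis with an SRL model, are out of scope here.

```python
import re
from dataclasses import dataclass
from typing import Optional


# Hypothetical rule shape: the abstract does not specify the fields.
@dataclass(frozen=True)
class PurposeAwareRule:
    action: str   # e.g. "collect" or "use"
    data: str     # the data the statement refers to
    purpose: str  # the stated purpose of the collection or use


# Illustrative verbs and purpose markers, assumed for this sketch only.
_EXPLICIT_PATTERN = re.compile(
    r"\bwe\s+(?P<action>collect|use|share|store)\s+(?P<data>.+?)\s+"
    r"(?:in order to|for the purpose of|so that we can|to)\s+"
    r"(?P<purpose>.+?)\.?$",
    re.IGNORECASE,
)


def extract_explicit_rule(sentence: str) -> Optional[PurposeAwareRule]:
    """Return a rule if the sentence states its purpose with a surface marker."""
    match = _EXPLICIT_PATTERN.search(sentence.strip())
    if match is None:
        return None  # no explicit marker; would need semantic analysis
    return PurposeAwareRule(
        action=match.group("action").lower(),
        data=match.group("data").strip(),
        purpose=match.group("purpose").strip(),
    )


if __name__ == "__main__":
    sentence = "We collect your location data in order to provide navigation services."
    print(extract_explicit_rule(sentence))
```

A regular expression like this splits at the first marker it finds, so a data phrase that itself contains the word "to" would be cut short. That brittleness is one reason a syntactic or semantic parser is a better fit for real policies than pattern matching.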