DescribeCtx

Shao Yang,Yuehan Wang,Yuan Yao,Haoyu Wang,Yanfang (Fanny) Ye,Xusheng Xiao
DOI: https://doi.org/10.1145/3510003.3510058
2022-01-01
Abstract:While mobile applications (i.e., apps) are becoming capable of handling various needs from users, their increasing access to sensitive data raises privacy concerns. To inform such sensitive behaviors to users, existing techniques propose to automatically identify explanatory sentences from app descriptions; however, many sensitive behaviors are not explained in the corresponding app descriptions. There also exist general techniques that translate code to sentences. However, these techniques lack the vocabulary to explain the uses of sensitive data and fail to consider the context (i.e., the app functionalities) of the sensitive behaviors. To address these limitations, we propose DescribeCtx, a context-aware description synthesis approach that trains a neural machine translation model using a large set of popular apps, and generates app-specific descriptions for sensitive behaviors. Specifically, DescribeCtx encodes three heterogeneous sources as input, i.e., vocabularies provided by privacy policies, behavior summary provided by the call graphs in code, and contextual information provided by GUI texts. Our evaluations on 1,262 Android apps show that, compared with existing baselines, DescribeCtx produces more accurate descriptions (24.96 in BLEU) and achieves higher user ratings with respect to the reference sentences manually identified in the app descriptions.
What problem does this paper attempt to address?