Trust No Bot: Discovering Personal Disclosures in Human-LLM Conversations in the Wild

Niloofar Mireshghallah, Maria Antoniak, Yash More, Yejin Choi, Golnoosh Farnadi
2024-07-20
Abstract: Measuring personal disclosures made in human-chatbot interactions can provide a better understanding of users' AI literacy and facilitate privacy research for large language models (LLMs). We run an extensive, fine-grained analysis on the personal disclosures made by real users to commercial GPT models, investigating the leakage of personally identifiable and sensitive information. To understand the contexts in which users disclose to chatbots, we develop a taxonomy of tasks and sensitive topics, based on qualitative and quantitative analysis of naturally occurring conversations. We discuss these potential privacy harms and observe that: (1) personally identifiable information (PII) appears in unexpected contexts such as in translation or code editing (48% and 16% of the time, respectively) and (2) PII detection alone is insufficient to capture the sensitive topics that are common in human-chatbot interactions, such as detailed sexual preferences or specific drug use habits. We believe that these high disclosure rates are of significant importance for researchers and data curators, and we call for the design of appropriate nudging mechanisms to help users moderate their interactions.
Computation and Language
What problem does this paper attempt to address?
This paper measures how often, and in what contexts, users disclose personal information to chatbots, with the aim of better understanding users' AI literacy and supporting privacy research on large language models (LLMs). It analyzes disclosures of personally identifiable information (PII) and other sensitive information made by real users to commercial GPT models. The authors pose three main questions:
1. **What types of sensitive information are shared in human-chatbot conversations?** Through a fine-grained analysis of naturally occurring conversations, the paper identifies the kinds of PII and other sensitive information that users disclose, such as detailed sexual preferences or specific drug use habits.
2. **How often are such disclosures made, and how reliably can they be detected?** The authors automatically label tasks and sensitive topics in 5,000 conversations drawn from the WildChat dataset and assess the accuracy of the labels by manually verifying a subset of them.
3. **In which contexts (tasks) are different types of sensitive information shared, and how often?** By analyzing disclosures across task contexts, the authors find some unexpected scenarios, such as PII appearing in translation and code-editing tasks.
The paper's main contribution is an in-depth study of the private and sensitive information that users disclose to chatbots, together with a new taxonomy that captures both this information and the context in which it is shared.
In addition, the authors emphasize the limitations of current PII detection systems: much sensitive information, such as explicit sexual content and job applications, is not identified by traditional PII detection methods. These findings matter to chatbot designers and LLM researchers, and they motivate the development of more effective mechanisms for protecting user privacy.
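The two observations above (PII rates that vary by task, and sensitive topics that PII detection misses) can be illustrated with a minimal sketch. This is not the authors' pipeline: the regular expressions, the function names `detect_pii` and `pii_rate_by_task`, the task labels, and the sample conversations are all hypothetical and deliberately naive.

```python
import re
from collections import defaultdict

# Hypothetical, deliberately naive patterns (an assumption for illustration);
# real PII detectors cover many more entity types.
PII_PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "phone": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
}


def detect_pii(text):
    """Return the sorted names of the patterns that match the text."""
    return sorted(name for name, pat in PII_PATTERNS.items() if pat.search(text))


def pii_rate_by_task(conversations):
    """Map each task label to the fraction of its conversations with a PII match."""
    totals = defaultdict(int)
    flagged = defaultdict(int)
    for task, text in conversations:
        totals[task] += 1
        if detect_pii(text):
            flagged[task] += 1
    return {task: flagged[task] / totals[task] for task in totals}


# Invented toy data as (task label, user message); not taken from WildChat.
SAMPLE = [
    ("translation", "Translate: please email me at jane.doe@example.com"),
    ("translation", "Translate 'good morning' into French"),
    ("code editing", "Fix this: send_sms('+1 415 555 0100', msg)"),
    ("advice", "I have been using drugs every weekend and want to stop"),
]

if __name__ == "__main__":
    for task, rate in pii_rate_by_task(SAMPLE).items():
        print(f"{task}: {rate:.0%} of conversations contain detected PII")
```

In this toy data the "advice" message discloses a sensitive topic but contains no pattern-matchable PII, so a detector of this kind reports nothing for it. That is the gap a separate sensitive-topic taxonomy is meant to cover.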