You Still See Me: How Data Protection Supports the Architecture of AI Surveillance

Rui-Jie Yew, Lucy Qin, Suresh Venkatasubramanian
2024-10-07
Abstract: Data forms the backbone of artificial intelligence (AI). Privacy and data protection laws thus have strong bearing on AI systems. Shielded by the rhetoric of compliance with data protection and privacy regulations, privacy-preserving techniques have enabled the extraction of more and new forms of data. We illustrate how the application of privacy-preserving techniques in the development of AI systems--from private set intersection as part of dataset curation to homomorphic encryption and federated learning as part of model computation--can further support surveillance infrastructure under the guise of regulatory permissibility. Finally, we propose technology and policy strategies to evaluate privacy-preserving techniques in light of the protections they actually confer. We conclude by highlighting the role that technologists could play in devising policies that combat surveillance AI technologies.
Computers and Society, Cryptography and Security
What problem does this paper attempt to address?
### Problems the Paper Attempts to Address

This paper explores how privacy-preserving technologies (such as secure multi-party computation, federated learning, and homomorphic encryption), when used in the development of artificial intelligence (AI) systems, may support the construction of surveillance infrastructure. Specifically, the authors focus on the following aspects:

1. **Loopholes in Data Protection Regulations**:
   - Privacy-preserving technologies and compliance rhetoric may be used to circumvent data protection and privacy regulations, allowing more data to be collected and processed.
   - With these technologies, data subjects may be "invisible" while an AI system is being developed, yet the finished system can still "see" them.

2. **Application of Privacy-Preserving Technologies**:
   - **Dataset Construction**: Secure multi-party computation (MPC) techniques, such as private set intersection, can be used for dataset alignment, combining data from different sources into more granular information.
   - **Model Training and Computation**: Federated learning (FL) and homomorphic encryption (HE) can train models without exposing raw data, but this may also blur who is responsible for the data processing.
   - **Model Application**: Even if data subjects are not included in the training data, a deployed AI system may still draw inferences about them.

3. **Enhancement of Surveillance Infrastructure**:
   - These technologies allow high-resource entities to further consolidate their power in surveillance relationships and to track and analyze data more broadly.
   - Surveillance is not limited to digital platforms; it may extend to the physical world, for example by tracking consumers' offline purchases through ad conversion analysis.

4. **Policy and Technical Strategies**:
   - The authors propose technical and policy strategies for evaluating the protection that privacy-preserving technologies actually provide, in order to prevent their misuse.
   - The authors emphasize the role of security researchers and technologists in shaping relevant regulations, especially those that combat surveillance AI technologies.

### Summary

This paper aims to show how privacy-preserving technologies in AI development may be used to circumvent data protection regulations and support the construction of surveillance infrastructure. By analyzing these technologies at the stages of dataset construction, model training, and model application, the authors demonstrate how they can enable broader data collection and surveillance under the guise of compliance. Finally, the authors propose technical and policy recommendations for evaluating the protection these technologies actually confer and for preventing their misuse.
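
### Illustrative Sketch: Dataset Alignment via Private Set Intersection

To make the dataset-alignment and ad-conversion points above more concrete, here is a minimal sketch of a Diffie-Hellman-style private set intersection between two hypothetical parties, an advertiser and a merchant. It is not taken from the paper. The function names (`hash_to_group`, `blind`, `conversion_count`), the toy modulus `P`, and the example e-mail addresses are all assumptions made for illustration, and both parties are simulated in one process. A real deployment would use a vetted library and a carefully chosen prime-order group rather than this toy arithmetic.

```python
import hashlib
import secrets

# Toy modulus (the prime 2**255 - 19), used only for illustration.
# Real protocols use a vetted prime-order group, typically an elliptic curve.
P = 2**255 - 19


def hash_to_group(identifier: str) -> int:
    """Map an identifier (e.g. an e-mail address) to an integer modulo P."""
    digest = hashlib.sha256(identifier.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % P


def blind(values, secret):
    """Raise every value to a party's secret exponent modulo P."""
    return [pow(v, secret, P) for v in values]


def conversion_count(ad_viewers, purchasers):
    """Count identifiers present in both lists without exchanging them in the clear.

    Both parties are simulated here; in practice each round is a network message.
    """
    a = secrets.randbelow(P - 3) + 2  # advertiser's secret exponent
    b = secrets.randbelow(P - 3) + 2  # merchant's secret exponent

    # Round 1: each party blinds its own hashed identifiers and sends them over.
    viewers_a = blind([hash_to_group(x) for x in ad_viewers], a)
    buyers_b = blind([hash_to_group(x) for x in purchasers], b)

    # Round 2: each party blinds the other's values with its own secret.
    # Exponentiation commutes, so a shared identifier yields H(x)^(a*b) on both sides.
    viewers_ab = set(blind(viewers_a, b))
    buyers_ba = set(blind(buyers_b, a))

    return len(viewers_ab & buyers_ba)


if __name__ == "__main__":
    viewers = ["alice@example.com", "bob@example.com", "carol@example.com"]
    buyers = ["bob@example.com", "carol@example.com", "dave@example.com"]
    print(conversion_count(viewers, buyers))
```

Neither simulated party sees the other's raw identifiers, yet the output still links ad exposure to offline purchases. This is the pattern the summary describes: the data subjects are "invisible" during the computation, but the resulting system can still "see" them.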