Attribute-preserving Face Dataset Anonymization via Latent Code Optimization

Simone Barattin, Christos Tzelepis, Ioannis Patras, Nicu Sebe
2023-03-21
Abstract: This work addresses the problem of anonymizing the identity of faces in a dataset of images, such that the privacy of those depicted is not violated, while at the same time the dataset remains useful for downstream tasks such as training machine learning models. To the best of our knowledge, we are the first to explicitly address this issue and deal with two major drawbacks of the existing state-of-the-art approaches, namely that they (i) require the costly training of additional, purpose-trained neural networks, and/or (ii) fail to retain the facial attributes of the original images in the anonymized counterparts, the preservation of which is of paramount importance for their use in downstream tasks. We accordingly present a task-agnostic anonymization procedure that directly optimizes the images' latent representation in the latent space of a pre-trained GAN. By optimizing the latent codes directly, we ensure both that the identity lies at a desired distance from the original (with an identity obfuscation loss) and that the facial attributes are preserved (using a novel feature-matching loss in FaRL's deep feature space). We demonstrate through a series of both qualitative and quantitative experiments that our method is capable of anonymizing the identity of the images whilst, crucially, better preserving the facial attributes. We make the code and the pre-trained models publicly available at: https://github.com/chi0tzp/FALCO.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
### The Problem the Paper Attempts to Solve

This paper addresses face identity anonymization in image datasets: protecting the privacy of the depicted individuals while keeping the dataset useful for downstream tasks (such as training machine learning models). Specifically, the authors point out two main drawbacks of existing methods:

1. **The need to train additional, purpose-specific neural networks**: This increases cost and complexity.
2. **The inability to retain the facial attributes of the original images**: This matters because these attributes are essential for downstream tasks (such as expression recognition, mental health analysis, etc.).

To address these issues, the authors propose an anonymization method that directly optimizes each image's latent representation in the latent space of a pre-trained Generative Adversarial Network (GAN), so no additional network has to be trained. The optimization combines two loss functions: a novel feature-matching loss (computed in the deep feature space of FaRL) that retains the facial attributes, and an identity obfuscation loss that keeps the anonymized identity at the desired distance from the original identity. As a result, the identity is obfuscated while the facial attributes are better preserved.
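
The interplay of the two losses can be illustrated with a small, self-contained sketch. This is not the authors' implementation: the pre-trained GAN, the identity encoder, and FaRL are replaced here by hypothetical random linear maps acting directly on a 4-dimensional latent code. The squared-error form of the attribute loss, the absolute-difference form of the identity loss, the target similarity `target_sim`, the weight `lam`, and the finite-difference optimizer are all assumptions chosen only to keep the example runnable with the standard library.

```python
import math
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def matvec(m, v):
    return [dot(row, v) for row in m]

def cosine_similarity(a, b):
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

def identity_obfuscation_loss(id_anon, id_orig, target_sim):
    # Penalise deviation of the identity similarity from the desired target
    # (assumed form; the paper only says "a desired distance").
    return abs(cosine_similarity(id_anon, id_orig) - target_sim)

def attribute_matching_loss(attr_anon, attr_orig):
    # Mean squared distance in the (stand-in) attribute feature space.
    return sum((x - y) ** 2 for x, y in zip(attr_anon, attr_orig)) / len(attr_orig)

def total_loss(w, id_enc, attr_enc, id_orig, attr_orig, target_sim, lam):
    # id_enc / attr_enc are toy linear stand-ins for "generator + encoder".
    return (identity_obfuscation_loss(matvec(id_enc, w), id_orig, target_sim)
            + lam * attribute_matching_loss(matvec(attr_enc, w), attr_orig))

def numerical_grad(f, w, eps=1e-6):
    # Central finite differences; a real pipeline would use autograd instead.
    grad = []
    for i in range(len(w)):
        hi = w[:i] + [w[i] + eps] + w[i + 1:]
        lo = w[:i] + [w[i] - eps] + w[i + 1:]
        grad.append((f(hi) - f(lo)) / (2 * eps))
    return grad

def optimize_latent(w0, f, steps=200, lr=0.1):
    # Gradient descent with step halving: a step is accepted only if it
    # lowers the loss, so the returned loss never exceeds f(w0).
    w, best = list(w0), f(w0)
    for _ in range(steps):
        g = numerical_grad(f, w)
        step = lr
        while step > 1e-8:
            cand = [wi - step * gi for wi, gi in zip(w, g)]
            val = f(cand)
            if val < best:
                w, best = cand, val
                break
            step /= 2
    return w, best

# Toy setup: every quantity below is synthetic and hypothetical.
random.seed(0)
dim = 4
id_enc = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(dim)]
attr_enc = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(dim)]
w_orig = [random.gauss(0, 1) for _ in range(dim)]            # "original" latent code
id_orig, attr_orig = matvec(id_enc, w_orig), matvec(attr_enc, w_orig)
w_init = [wi + random.gauss(0, 1) for wi in w_orig]          # starting latent code

def loss(w):
    return total_loss(w, id_enc, attr_enc, id_orig, attr_orig,
                      target_sim=0.0, lam=1.0)

initial = loss(w_init)
w_anon, final = optimize_latent(w_init, loss)
print(f"loss before: {initial:.4f}, after: {final:.4f}")
```

Only the latent code `w` is updated; the stand-in networks stay fixed throughout, which mirrors the paper's point that no additional network needs to be trained. In this sketch, raising `lam` favors attribute preservation, while changing `target_sim` sets how far the anonymized identity should sit from the original.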