Dormant: Defending against Pose-driven Human Image Animation

Jiachen Zhou, Mingsi Wang, Tianlin Li, Guozhu Meng, Kai Chen
2024-09-22
Abstract: Pose-driven human image animation has achieved tremendous progress, enabling the generation of vivid and realistic human videos from just one single photo. However, it conversely exacerbates the risk of image misuse, as attackers may use one available image to create videos involving politics, violence and other illegal content. To counter this threat, we propose Dormant, a novel protection approach tailored to defend against pose-driven human image animation techniques. Dormant applies protective perturbation to one human image, preserving the visual similarity to the original but resulting in poor-quality video generation. The protective perturbation is optimized to induce misextraction of appearance features from the image and create incoherence among the generated video frames. Our extensive evaluation across 8 animation methods and 4 datasets demonstrates the superiority of Dormant over 6 baseline protection methods, leading to misaligned identities, visual distortions, noticeable artifacts, and inconsistent frames in the generated videos. Moreover, Dormant shows effectiveness on 6 real-world commercial services, even with fully black-box access.
Cryptography and Security, Artificial Intelligence, Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
This paper aims to prevent the abuse of pose-driven human image animation and thereby protect individuals' portrait and privacy rights. It proposes Dormant, a method that stops malicious users from generating unauthorized, high-quality human videos from a single photo.

### Problem Background

Pose-driven human image animation has advanced to the point where a vivid, realistic video can be generated from just one photo. That progress carries a risk: attackers can take a publicly available photo and generate videos involving politics, violence, and other illegal content, violating the victim's portrait and privacy rights. A protection mechanism against this kind of abuse is therefore needed.

### Goals of Dormant

Dormant adds a protective perturbation to the original image. The perturbed image stays visually similar to the original, but videos generated from it are of markedly lower quality. The perturbation is optimized to induce misextraction of appearance features and to create inconsistency among the generated video frames. The resulting videos show mismatched identities, visual distortions, noticeable artifacts, and incoherent frames, which makes them unusable as unauthorized content.

### Main Contributions

1. **Preventing Unauthorized Image Use**: Dormant addresses unauthorized image use in pose-driven human image animation, protecting personal portrait and privacy rights.
2. **Novel Objective Function Design**: A new objective function optimizes the protective perturbation so that it induces misextraction of appearance features and creates inconsistency among the generated video frames.
3. **Extensive Experimental Verification**: Experiments on 8 pose-driven human image animation methods, 4 image-to-image methods, 4 image-to-video methods, and 6 commercial services demonstrate the effectiveness and transferability of Dormant.

### Method Overview

The core idea of Dormant is to add a small but effective perturbation to the original image so that the quality of the generated video degrades. The method has three parts:

- **Feature Misextraction**: Induce misextraction of appearance features by maximizing the distance between the features of the protected image and those of the original image, as extracted by the VAE encoder, the CLIP image encoder, and the ReferenceNet.
- **Inter-Frame Inconsistency**: Destroy the temporal consistency of the video by maximizing the distance between the generated video frames.
- **Optimization Strategy**: Use PGD (Projected Gradient Descent) to update the perturbation iteratively, maximizing the objective function under an L∞ norm constraint.

### Conclusion

Dormant offers an effective way to prevent the abuse of pose-driven human image animation and to protect personal portrait and privacy rights. The reported experiments show strong performance across multiple generative models and real-world commercial services.
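
The optimization described under Method Overview can be illustrated with a minimal sketch. It is not the authors' implementation: the real VAE encoder, CLIP image encoder, ReferenceNet, and video generator are replaced here by hypothetical random linear maps so that gradients can be written by hand, and the budget `eps`, the step size, and the iteration count are assumed values rather than settings taken from the paper. Only the overall shape is meant to carry over: a signed-gradient ascent step on a feature-distance term plus a frame-distance term, followed by projection back onto the L∞ ball.

```python
import numpy as np


def make_toy_objective(image, encoder, frame_maps):
    """Build a toy loss and its gradient.

    `encoder` and `frame_maps` are hypothetical linear stand-ins for the
    appearance encoders and the video generator; they are NOT real models.
    """
    clean_feat = encoder @ image

    def loss(x):
        # Feature term: distance between protected and original features.
        feat_term = np.sum((encoder @ x - clean_feat) ** 2)
        # Frame term: distance between consecutive generated "frames".
        frames = [g @ x for g in frame_maps]
        frame_term = sum(np.sum((a - b) ** 2) for a, b in zip(frames, frames[1:]))
        return feat_term + frame_term

    def grad(x):
        # Analytic gradient of the two quadratic terms above.
        g = 2.0 * encoder.T @ (encoder @ x - clean_feat)
        for a, b in zip(frame_maps, frame_maps[1:]):
            d = a - b
            g += 2.0 * d.T @ (d @ x)
        return g

    return loss, grad


def pgd_protect(image, loss_grad, eps=8 / 255, step=1 / 255, iters=50, seed=0):
    """Maximize a loss over delta subject to ||delta||_inf <= eps (assumed budget)."""
    rng = np.random.default_rng(seed)
    # Random start: at delta = 0 the feature-distance gradient is exactly zero.
    delta = rng.uniform(-eps, eps, size=image.shape)
    delta = np.clip(image + delta, 0.0, 1.0) - image
    for _ in range(iters):
        g = loss_grad(image + delta)
        delta = delta + step * np.sign(g)                 # signed ascent step
        delta = np.clip(delta, -eps, eps)                 # project onto L-inf ball
        delta = np.clip(image + delta, 0.0, 1.0) - image  # keep pixels in [0, 1]
    return image + delta


# Toy usage: a flattened 64-pixel "image" and random linear stand-ins.
rng = np.random.default_rng(42)
image = rng.uniform(0.2, 0.8, size=64)
encoder = rng.normal(size=(16, 64))
frame_maps = [rng.normal(size=(64, 64)) for _ in range(4)]

loss, grad = make_toy_objective(image, encoder, frame_maps)
protected = pgd_protect(image, grad)
print("max |delta|:", np.max(np.abs(protected - image)))
print("objective before / after:", loss(image), loss(protected))
```

In a real setting the hand-written gradient would be replaced by automatic differentiation through the actual encoders and generator, and the weighting between the two terms would need tuning; the sketch weights them equally only for simplicity.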