UP-FacE: User-predictable Fine-grained Face Shape Editing

Florian Strohm, Mihai Bâce, Andreas Bulling
2024-07-12
Abstract: We present User-predictable Face Editing (UP-FacE) -- a novel method for predictable face shape editing. In stark contrast to existing methods for face editing using trial and error, edits with UP-FacE are predictable by the human user. That is, users can control the desired degree of change precisely and deterministically and know upfront the amount of change required to achieve a certain editing result. Our method leverages facial landmarks to precisely measure facial feature values, facilitating the training of UP-FacE without manually annotated attribute labels. At the core of UP-FacE is a transformer-based network that takes as input a latent vector from a pre-trained generative model and a facial feature embedding, and predicts a suitable manipulation vector. To enable user-predictable editing, a scaling layer adjusts the manipulation vector to achieve the precise desired degree of change. To ensure that the desired feature is manipulated towards the target value without altering uncorrelated features, we further introduce a novel semantic face feature loss. Qualitative and quantitative results demonstrate that UP-FacE enables precise and fine-grained control over 23 face shape features.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
The paper aims to address the issue of precise control in facial editing. Existing facial editing methods (such as attribute-based methods, mask-based methods, and unsupervised methods) can produce impressive results, but they all have some fundamental issues that make it difficult for users to easily and definitively edit facial shapes. Specifically:

- **Attribute-based and mask-based methods** require manual annotation or sufficient skill to manipulate segmentation masks, which is very cumbersome.
- **Unsupervised methods** overcome the above limitations but are usually model-dependent and may fail to discover the desired attribute dimensions or have these dimensions entangled with other attributes.
- **3D-based methods** are effective in new view synthesis, lighting manipulation, and expression transfer, but require a lot of 3D modeling work for facial shape editing.

Most importantly, none of these methods allow users to know the outcome of a specific facial edit in advance. Users have to adjust facial features through trial and error until they are satisfied with the results, leading to a time-consuming and error-prone facial editing process.

To address these issues, the paper proposes "User-predictable Face Editing" (UP-FacE). UP-FacE can manipulate 23 facial features that describe key geometric characteristics of the human face, such as eye width, mouth openness, or the shape of the chin and eyebrows. Unlike existing methods, the editing results of UP-FacE are predictable, allowing users to precisely control the degree of change and know the required amount of change before making edits. Additionally, this method allows for isolated progressive edits (i.e., continuous edits of the same feature) and sequential edits (i.e., edits of multiple different features) without altering other unrelated facial features.
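
To make the idea of measuring a face shape feature from landmarks more concrete, here is a minimal sketch in Python. It is an illustration only and is not taken from the paper: the landmark names, the choice of face width as the normalizer, and the helper `normalized_feature` are all assumptions made for this example. It computes a normalized "eye width" from two landmark points and then the amount of change needed to reach a user-chosen target value, which is the kind of quantity a user would know upfront in a predictable editing workflow.

```python
import math


def distance(p, q):
    """Euclidean distance between two 2D points given as (x, y) tuples."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def normalized_feature(landmarks, a, b, ref_a, ref_b):
    """Distance between landmarks a and b, divided by a reference distance.

    Dividing by a reference length (here: face width) is an assumption of
    this sketch; it makes the value independent of image scale.
    """
    ref = distance(landmarks[ref_a], landmarks[ref_b])
    if ref == 0:
        raise ValueError("reference distance is zero")
    return distance(landmarks[a], landmarks[b]) / ref


# Hypothetical landmark coordinates in pixels (made-up values and names).
landmarks = {
    "left_eye_outer": (30.0, 50.0),
    "left_eye_inner": (45.0, 50.0),
    "face_left": (10.0, 60.0),
    "face_right": (110.0, 60.0),
}

# Current value of the feature, measured directly from the landmarks.
eye_width = normalized_feature(
    landmarks, "left_eye_outer", "left_eye_inner", "face_left", "face_right"
)

# The user picks a target value; the required change is known before editing.
target_eye_width = 0.18
required_change = target_eye_width - eye_width

print(f"current eye width: {eye_width:.3f}")
print(f"required change:   {required_change:+.3f}")
```

In this sketch the feature value is a plain geometric measurement, so no manually annotated attribute labels are needed to obtain it. How UP-FacE actually defines each of its 23 features, and how its network and scaling layer turn a requested change into a latent manipulation, is described in the paper itself and is not reproduced here.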