MegaPortrait: Revisiting Diffusion Control for High-fidelity Portrait Generation

Han Yang, Sotiris Anagnostidis, Enis Simsar, Thomas Hofmann
2024-11-07
Abstract: We propose MegaPortrait, a system for creating personalized portrait images. It has three modules: Identity Net, Shading Net, and Harmonization Net. Identity Net generates the learned identity using a customized model fine-tuned on source images. Shading Net re-renders portraits using extracted representations. Harmonization Net fuses the pasted face with the body from the reference image to produce a coherent result. Built on off-the-shelf Controlnets, our approach outperforms state-of-the-art AI portrait products in identity preservation and image fidelity. MegaPortrait has a simple but effective design, and we compare it with other methods and products to show its superiority.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
The paper addresses the trade-off between identity and style in personalized portrait generation. AI-driven portrait-generation models need to generalize to a wide range of artistic styles while preserving the identity of the subject, a challenge that has long held back the field. A generative portrait model must adapt to varied prompts and preferences so that each generated image both captures the subject's distinguishing characteristics and reflects the intended style or mood.

To solve this problem, the paper proposes the MegaPortrait system, which balances identity preservation and artistic adaptation through three modules (Identity Net, Shading Net, and Harmonization Net). The specific contributions are as follows:

1. **Proposing a principled pipeline**: MegaPortrait targets the identity-style dilemma in AI-driven portrait generation. By adopting different Controlnets and carefully selecting their input conditions, the system achieves high performance without additional complex designs.
2. **Introducing a novel decoupling method**: To address the trade-off between identity and style, the paper decouples shadow or color information from geometric structure through a split-and-merge pipeline. According to the paper, this is the first application of such a method to realistic AI portrait generation, and it simplifies the mapping the model has to learn.
3. **High-fidelity and visually striking personalized portrait generation**: Composed of Identity Net, Shading Net, and Harmonization Net, MegaPortrait generates high-quality, visually appealing personalized portrait images.

Compared with other research methods and top-tier products (such as Remini), MegaPortrait performs strongly in identity preservation and image fidelity. The paper presents these results as a technical advance and as a source of new ideas and directions for future research on personalized portrait generation.
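As a minimal illustration of the "pasted face" composite that Harmonization Net is described as fusing with the reference image's body, the sketch below alpha-blends a face image into a reference image with NumPy. The function name `paste_face`, the soft mask, and the blending rule are assumptions made for illustration only; the summary does not say how the paste is performed, and this is not the paper's implementation.

```python
import numpy as np


def paste_face(reference: np.ndarray, face: np.ndarray, mask: np.ndarray) -> np.ndarray:
    """Alpha-blend `face` into `reference` (hypothetical helper, not from the paper).

    reference, face: H x W x 3 images, already aligned to the same size.
    mask: H x W soft mask in [0, 1]; 1 keeps the face pixel, 0 keeps the reference.
    Returns a float32 H x W x 3 composite.
    """
    if reference.shape != face.shape:
        raise ValueError("reference and face must have the same shape")
    if mask.shape != reference.shape[:2]:
        raise ValueError("mask must match the image height and width")

    # Add a channel axis so the mask broadcasts over the colour channels.
    alpha = np.clip(mask.astype(np.float32), 0.0, 1.0)[..., None]
    return alpha * face.astype(np.float32) + (1.0 - alpha) * reference.astype(np.float32)


# Toy usage: a uniform "face" pasted into the centre of a black reference image.
reference = np.zeros((4, 4, 3), dtype=np.uint8)
face = np.full((4, 4, 3), 200, dtype=np.uint8)
mask = np.zeros((4, 4), dtype=np.float32)
mask[1:3, 1:3] = 1.0
composite = paste_face(reference, face, mask)
```

A hard paste like this typically leaves visible seams and mismatched lighting at the mask boundary, which is the kind of inconsistency a learned harmonization step would be expected to remove.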