Make It So: Steering StyleGAN for Any Image Inversion and Editing

Anand Bhattad, Viraj Shah, Derek Hoiem, D.A. Forsyth
DOI: https://doi.org/10.48550/arXiv.2304.14403
2023-04-27
Computer Vision and Pattern Recognition
Abstract: StyleGAN's disentangled style representation enables powerful image editing by manipulating the latent variables, but accurately mapping real-world images to their latent variables (GAN inversion) remains a challenge. Existing GAN inversion methods struggle to maintain editing directions and produce realistic results. To address these limitations, we propose Make It So, a novel GAN inversion method that operates in the $\mathcal{Z}$ (noise) space rather than the typical $\mathcal{W}$ (latent style) space. Make It So preserves editing capabilities, even for out-of-domain images. This is a crucial property that was overlooked in prior methods. Our quantitative evaluations demonstrate that Make It So outperforms the state-of-the-art method PTI~\cite{roich2021pivotal} by a factor of five in inversion accuracy and achieves ten times better edit quality for complex indoor scenes.