StyleGAN-Based Advanced Semantic Segment Encoder for Generative AI

Byungseok Kang, Youngjae Jo
DOI: https://doi.org/10.1109/mitp.2023.3338026
2024-05-03
IT Professional
Abstract: StyleGAN is a widely used model for generating high-quality images across many AI domains. Despite its advantages, it relies on per-pixel noise inputs. These noise inputs are independent of location information, and because random noise is injected at the pixel level at intervals, they hinder the learning of natural location information. The problem is especially pronounced when generating human faces. StyleGAN3 was introduced to address this issue, but it did not completely solve it. When a human face is turned more than 30° from the front, the restoration rate decreases further. In this article, we propose an advanced semantic segment encoder that accurately generates the eyes, nose, and mouth even when the face is rotated more than 60°. We also developed a face-angle analyzer to accurately measure the angle of a person's face. The proposed approach improved restoration performance by approximately 30% compared with existing encoders when the face is not facing straight ahead.
computer science, information systems, telecommunications, software engineering