Boundary-aware GAN for multiple overlapping objects in layout-to-image generation

Fengnan Quan, Bo Lang
DOI: https://doi.org/10.1007/s00530-024-01287-y
IF: 3.9
2024-03-22
Multimedia Systems
Abstract: Existing layout-to-image generation methods based on generative adversarial networks (GANs) have made great progress. However, a common problem that has not been effectively solved is region missing: as the number of target objects in an image increases, multiple overlapping objects cannot be accurately generated at their boundaries. How well the overlapped object parts are generated directly determines the quality gap between the generated image and the real image. To solve this problem, we propose the Boundary-Aware GAN model (BAGAN). We adopt a strategy of generating the foreground and background separately. In foreground generation, BAGAN first uses attention regularization to accurately locate the boundaries of overlapping objects and then generates a clear foreground image by sharing the parameters of two transfer networks. In background generation, BAGAN uses conditional normalization to improve the quality of the generated background image. To better judge the quality of the generated overlap regions, we design the BoundaryFID evaluation metric. We validate and test our model on the COCO-Stuff and Visual Genome datasets. The experimental results show that, on both the general evaluation metrics and BoundaryFID, our model outperforms state-of-the-art methods.
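To make concrete what "overlapping objects" means in a layout, the following minimal sketch finds the regions where bounding boxes intersect. It is an illustration only, not part of BAGAN or BoundaryFID: the function names, the `(x0, y0, x1, y1)` box format, and the sample layout are all assumptions introduced here.

```python
from itertools import combinations


def box_intersection(a, b):
    """Return the overlap of two (x0, y0, x1, y1) boxes, or None if they are disjoint."""
    x0, y0 = max(a[0], b[0]), max(a[1], b[1])
    x1, y1 = min(a[2], b[2]), min(a[3], b[3])
    if x0 >= x1 or y0 >= y1:
        # Boxes that only touch at an edge or corner are treated as non-overlapping.
        return None
    return (x0, y0, x1, y1)


def overlap_regions(layout):
    """Map each overlapping pair of object labels to its intersection box."""
    regions = {}
    for (label_a, box_a), (label_b, box_b) in combinations(layout.items(), 2):
        inter = box_intersection(box_a, box_b)
        if inter is not None:
            regions[(label_a, label_b)] = inter
    return regions


# Hypothetical layout: label -> bounding box in pixel coordinates.
layout = {
    "person": (10, 20, 60, 100),
    "horse": (40, 50, 120, 110),
    "tree": (130, 0, 160, 90),
}
regions = overlap_regions(layout)
print(regions)
```

In this hypothetical layout only the person and horse boxes intersect; regions of this kind are where the abstract says existing methods tend to fail.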
computer science, information systems, theory & methods