Adversarial Latent Autoencoder with Self-Attention for Structural Image Synthesis

Jiajie Fan, Laure Vuaille, Hao Wang, Thomas Bäck
DOI: https://doi.org/10.1109/CAI59869.2024.00030
2024-10-02
Abstract: Generative Engineering Design approaches driven by Deep Generative Models (DGM) have been proposed to facilitate industrial engineering processes. In such processes, designs often come in the form of images, such as blueprints, engineering drawings, and CAD models depending on the level of detail. DGMs have been successfully employed for synthesis of natural images, e.g., displaying animals, human faces and landscapes. However, industrial design images are fundamentally different from natural scenes in that they contain rich structural patterns and long-range dependencies, which are challenging for convolution-based DGMs to generate. Moreover, DGM-driven generation process is typically triggered based on random noisy inputs, which outputs unpredictable samples and thus cannot perform an efficient industrial design exploration. We tackle these challenges by proposing a novel model Self-Attention Adversarial Latent Autoencoder (SA-ALAE), which allows generating feasible design images of complex engineering parts. With SA-ALAE, users can not only explore novel variants of an existing design, but also control the generation process by operating in latent space. The potential of SA-ALAE is shown by generating engineering blueprints in a real automotive design task.
Computer Vision and Pattern Recognition; Computational Engineering, Finance, and Science; Image and Video Processing
What problem does this paper attempt to address?
The paper addresses how to use deep generative models (DGMs) to generate high-quality images of engineering structure designs, in particular blueprints of complex parts in industrial design. It focuses on three aspects:

1. **Controlling the generation process**: Most existing deep generative models (such as GANs) offer little control over generation, which makes it hard to produce designs that meet specific requirements. In industrial design it is especially difficult to define conditional input variables in advance that would allow fine-grained control.
2. **Handling the unique challenges of structural images**: Industrial design data (such as blueprints or engineering drawings) differ fundamentally from natural scene images (such as landscapes or human faces). The former contain rich structural patterns and long-range dependencies, while the latter are characterized by rich textures and continuous color gradients. Convolution-based operations have difficulty capturing these long-range dependencies.
3. **Improving the stability and performance of the generative model**: To keep training stable and the generated output high in quality, techniques such as spectral normalization and ResNet are introduced to counter problems such as training instability.

To solve these problems, the paper proposes a new model, the Self-Attention Adversarial Latent Autoencoder (SA-ALAE). By combining the ALAE framework, the self-attention mechanism, and an improved training strategy, SA-ALAE generates high-quality engineering structure designs, provides a degree of controllability, and handles the long-range dependencies in structural images.
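The summary above does not spell out how the self-attention layer is built, so the following is only a minimal NumPy sketch of one common formulation (a SAGAN-style attention layer over a feature map), not the authors' implementation. The function name `self_attention_2d`, the projection matrices `w_q`, `w_k`, `w_v`, and the residual scale `gamma` are all assumptions made for illustration. It shows why attention can capture long-range dependencies: every spatial position is mixed with every other position in one step, regardless of distance.

```python
import numpy as np


def self_attention_2d(x, w_q, w_k, w_v, gamma=1.0):
    """Hypothetical SAGAN-style self-attention over a feature map.

    x:        (C, H, W) feature map
    w_q, w_k: (C_k, C) query/key projections (stand-ins for 1x1 convolutions)
    w_v:      (C, C) value projection
    gamma:    scale of the attention branch before the residual addition
    """
    c, h, w = x.shape
    flat = x.reshape(c, h * w)             # (C, N), N = H * W positions
    q = w_q @ flat                         # (C_k, N)
    k = w_k @ flat                         # (C_k, N)
    v = w_v @ flat                         # (C, N)

    scores = q.T @ k                       # (N, N): scores[i, j] = query i vs key j
    scores -= scores.max(axis=1, keepdims=True)   # numerical stability
    attn = np.exp(scores)
    attn /= attn.sum(axis=1, keepdims=True)       # softmax over keys; rows sum to 1

    out = v @ attn.T                       # out[:, i] = sum_j attn[i, j] * v[:, j]
    return (gamma * out + flat).reshape(c, h, w), attn


# Usage on random data (shapes chosen arbitrarily for the example)
rng = np.random.default_rng(0)
x = rng.normal(size=(8, 4, 4))
w_q = rng.normal(size=(2, 8))
w_k = rng.normal(size=(2, 8))
w_v = rng.normal(size=(8, 8))
out, attn = self_attention_2d(x, w_q, w_k, w_v, gamma=0.5)
```

With `gamma=0.0` the layer reduces to the identity, which is why such layers are often initialized that way: the network starts from purely convolutional behaviour and learns how much non-local context to add.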
### Formula summary

- **Generation mapping**: \[ G(\vec{\omega}, \vec{\eta}) : W\times\mathbb{R}^d\rightarrow X \] where \(X\) is the space of data points, \(\vec{\omega}\) is the latent variable, and \(\vec{\eta}\) is optional Gaussian noise.
- **Discriminator mapping**: \[ D(\vec{\omega}) : W\rightarrow\mathbb{R} \]
- **Loss function**: \[ V(G\circ M, D\circ E)=\mathbb{E}_{\vec{x}\sim D}f(D\circ E(\vec{x}))+\mathbb{E}_{\vec{z}\sim N(\vec{0}, I)}f(-D\circ E\circ G\circ M(\vec{z})) \] where \(f(t)=\text{softplus}(t)=\log(1 + e^t)\). In the subscript of the first expectation, \(\vec{x}\sim D\) denotes sampling \(\vec{x}\) from the training data; everywhere else \(D\) is the discriminator.
- **Latent variable calculation**: \[ \vec{\omega}=(1 - \mu)E(\vec{x})+\mu M(\vec{z}) \] where \(\mu\in[0, 1]\) is an adjustable parameter, \(\vec{x}\) is the original design, and \(\vec{z}\sim N(\vec{0}, I)\) is the sampled noise.

With these improvements, SA-ALAE generates high-quality engineering design images and lets users control the generation process by operating in latent space.
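The loss function and the latent variable calculation above can be written out in a few lines of NumPy. This is a sketch under stated assumptions rather than the paper's code: the function names are hypothetical, the discriminator outputs \(D\circ E(\vec{x})\) and \(D\circ E\circ G\circ M(\vec{z})\) are assumed to be precomputed arrays, and \(E(\vec{x})\) and \(M(\vec{z})\) are assumed to be latent vectors of the same shape.

```python
import numpy as np


def softplus(t):
    """f(t) = log(1 + exp(t)), computed stably as logaddexp(0, t)."""
    return np.logaddexp(0.0, t)


def adversarial_value(d_real, d_fake):
    """Monte Carlo estimate of V(G∘M, D∘E).

    d_real: D(E(x)) evaluated on a batch of real designs x
    d_fake: D(E(G(M(z)))) evaluated on a batch of z ~ N(0, I)
    """
    return softplus(d_real).mean() + softplus(-d_fake).mean()


def mix_latents(w_encoded, w_sampled, mu):
    """omega = (1 - mu) * E(x) + mu * M(z), with mu in [0, 1].

    w_encoded: E(x), the latent code of an existing design
    w_sampled: M(z), a latent code mapped from random noise
    """
    if not 0.0 <= mu <= 1.0:
        raise ValueError("mu must lie in [0, 1]")
    return (1.0 - mu) * w_encoded + mu * w_sampled


# Usage with placeholder latent codes (values are arbitrary)
w_enc = np.array([1.0, 2.0, 3.0])
w_smp = np.array([0.0, 0.0, 0.0])
omega = mix_latents(w_enc, w_smp, mu=0.25)
```

Setting `mu=0` reproduces the encoded design exactly and `mu=1` ignores it in favour of the random sample, so intermediate values of `mu` trade fidelity to the original design against novelty of the variant.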