Image Translation Model Based on Multi-level Feature Fusion

Zihao Wang, Ansheng Deng, Zheng Zhang
DOI: https://doi.org/10.1145/3644523.3644582
2024-01-01
Abstract: Image translation is a key problem in computer vision. Its goal is to transform an input image from a source domain to a target domain while preserving the original semantics and structure of the image. However, existing image translation models often fail to capture the differences between the source and target domains accurately, so the generated target-domain images have poor detail and low quality. To overcome these challenges, this paper proposes an image translation model based on multi-level feature fusion. First, a channel self-attention module is introduced into the generator to improve the model's ability to extract features from key pixels. Second, the discriminator adopts a multi-level discriminative structure and fuses the feature maps of different sizes produced by the generator into the discriminator for style feature extraction, which further enhances the robustness and generalization ability of the model. The experimental results show that the proposed model has strong feature perception ability, can effectively identify the differences between the source and target domains, and further improves the quality of image translation.
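The abstract does not give the formulation of the channel self-attention module, so the following is only a minimal, framework-free sketch of one common variant, not the authors' implementation. It assumes attention weights are taken from a softmax over channel-to-channel similarities and added back to the input through a residual scale. The names `channel_self_attention`, `softmax`, and `gamma` are hypothetical.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def channel_self_attention(features, gamma=1.0):
    """Reweight channels by their mutual similarity (assumed formulation).

    features: list of C channels, each a flat list of N = H * W values.
    gamma:    hypothetical residual scale; it would be learned in a real model.
    """
    # Channel-to-channel similarity: a C x C matrix of dot products.
    energy = [
        [sum(a * b for a, b in zip(ci, cj)) for cj in features]
        for ci in features
    ]
    # Each row becomes a probability distribution over channels.
    attention = [softmax(row) for row in energy]

    n = len(features[0])
    out = []
    for i, weights in enumerate(attention):
        # Attention-weighted mixture of all channels at each position.
        mixed = [
            sum(w * ch[k] for w, ch in zip(weights, features))
            for k in range(n)
        ]
        # Residual connection keeps the original channel content.
        out.append([gamma * v + x for v, x in zip(mixed, features[i])])
    return out


if __name__ == "__main__":
    # Two channels over two spatial positions.
    toy = [[1.0, 0.0], [0.0, 1.0]]
    print(channel_self_attention(toy))
```

With `gamma=0.0` the function returns its input unchanged, which is why such a scale is often initialised at zero: the attention branch is then phased in gradually during training.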