Text-Guided Image Manipulation Based on Sentence-Aware and Word-Aware Network

Zhiqiang Zhang, Chen Fu, Man M. Ho, Jinjia Zhou, Ning Jiang, Wenxin Yu
DOI: https://doi.org/10.1109/icme52920.2022.9859585
2022-01-01
Abstract: Text-guided image manipulation aims to use a given text description to modify the semantic content of the corresponding part of an input image. Although researchers have obtained satisfactory performance in this field, existing methods only 1) use the global sentence information at the initial modification stage and 2) exploit fixed word information for regional adjustment in the subsequent modification process, which limits image manipulation quality. Motivated by these issues, this paper proposes a novel approach that improves text-guided image manipulation with a sentence-aware and word-aware network. Concretely, we use global sentence information throughout the image manipulation process to improve semantic consistency with the input text. In addition, we employ a dynamic selection method to adjust the word information corresponding to the regional image content, further improving manipulation quality. As a result, our work surpasses existing state-of-the-art methods on the CUB and Oxford-102 flower datasets, demonstrating its effectiveness. In terms of Inception Score, our method achieves the best performance. In terms of NIMA, the score of our method is closest to that of the original dataset images, indicating that our manipulated results are the most authentic.
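The abstract does not specify how the sentence-aware and word-aware components are implemented, so the following is only a minimal sketch of the two ideas under stated assumptions, not the authors' network. It assumes that "dynamic selection" can be approximated by attention weights over words that are recomputed from the current region features at every stage, and that sentence information is injected at every stage through a scale-and-shift modulation. All function names, shapes, and the weight matrices `w_scale` and `w_shift` are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def word_aware_context(regions, words):
    """Assumed word selection: each region attends over all words.

    regions: (R, D) region features, words: (T, D) word features.
    Returns a (R, D) word context and the (R, T) attention weights.
    """
    scores = regions @ words.T / np.sqrt(regions.shape[1])
    attn = softmax(scores, axis=1)
    return attn @ words, attn

def sentence_aware_modulation(regions, sentence, w_scale, w_shift):
    """Assumed sentence conditioning: scale and shift from the sentence vector."""
    scale = np.tanh(sentence @ w_scale)   # (D,), bounded in (-1, 1)
    shift = sentence @ w_shift            # (D,)
    return regions * (1.0 + scale) + shift

def manipulation_stage(regions, words, sentence, w_scale, w_shift):
    # Word weights depend on the current region features, so they change
    # from stage to stage instead of staying fixed.
    context, attn = word_aware_context(regions, words)
    fused = regions + context
    # The same global sentence vector is reused at every stage.
    return sentence_aware_modulation(fused, sentence, w_scale, w_shift), attn

# Toy data: 16 image regions, 5 words, 8-dimensional features (all hypothetical).
rng = np.random.default_rng(0)
R, T, D = 16, 5, 8
regions = rng.normal(size=(R, D))
words = rng.normal(size=(T, D))
sentence = words.mean(axis=0)             # stand-in for a sentence embedding
w_scale = rng.normal(size=(D, D)) * 0.1
w_shift = rng.normal(size=(D, D)) * 0.1

for stage in range(3):
    regions, attn = manipulation_stage(regions, words, sentence, w_scale, w_shift)
```

In this sketch, `attn` is recomputed inside every call to `manipulation_stage`, which is what distinguishes it from a design that fixes the word weights once, and `sentence` enters every stage rather than only the first.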