Lmser-pix2seq: Learning stable sketch representations for sketch healing

Tengjie Li, Sicong Zang, Shikui Tu, Lei Xu
DOI: https://doi.org/10.2139/ssrn.4364287
IF: 4.886
2024-01-17
Computer Vision and Image Understanding
Abstract: Sketch healing aims to recreate a complete sketch from a corrupted one. Sketches are abstract and sparse, lacking the colors, textures, and other details of natural images, which makes it difficult for neural networks to learn high-quality sketch representations. This presents a significant challenge for sketch healing: the features extracted from a corrupted sketch may be inconsistent with those extracted from the corresponding full sketch. In this paper, we present Lmser-pix2seq, which learns sketch representations that are stable against missing information by employing a Least mean square error reconstruction (Lmser) block, which follows the encoder–decoder paradigm. Taking a corrupted sketch as input, the Lmser encoder computes embeddings of the structural patterns of the input, while the decoder reconstructs the complete sketch from those embeddings. We build bi-directional skip connections between the encoder and the decoder in our Lmser block. The feedback connections create recurrent paths through which the encoder receives more information about the reconstructed sketch produced by the decoder, which helps the encoder extract stable sketch features. The features captured by the Lmser block are finally fed into a recurrent neural network decoder to recreate the sketch. We also find that, compared with vanilla convolutional neural networks, our network based on gated multilayer perceptron (gMLP) blocks captures long-range dependencies between different regions of the sketch image, automatically learns the relationships between patches, and extracts individual-specific features from the sketch more effectively. Experimental results show that Lmser-pix2seq outperforms state-of-the-art methods in sketch healing, especially when the sketches are heavily masked or corrupted.
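The encoder, decoder, and feedback path described in the abstract can be illustrated with a toy example. The sketch below is not the authors' implementation. It assumes a single tied-weight layer, where the decoder reuses the transposed encoder weights, and it models the feedback connection by adding the previous reconstruction to the encoder input. The function names, the `steps` parameter, and the toy vectors are all hypothetical.

```python
import math
import random

def matvec(W, v):
    """Multiply matrix W (a list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def transpose(W):
    """Return the transpose of W as a list of rows."""
    return [list(col) for col in zip(*W)]

def lmser_style_pass(x, W, steps=3):
    """Run a few encode/decode rounds with feedback (illustrative only).

    x     : input vector, a stand-in for a flattened corrupted sketch
    W     : encoder weights; the decoder reuses W transposed (assumed tied weights)
    steps : number of recurrent rounds (hypothetical parameter)
    """
    Wt = transpose(W)
    code = [0.0] * len(W)
    recon = [0.0] * len(x)  # no reconstruction exists before the first round
    for _ in range(steps):
        # Feedback path: the encoder sees the input plus the previous reconstruction.
        enc_in = [xi + ri for xi, ri in zip(x, recon)]
        code = [math.tanh(a) for a in matvec(W, enc_in)]
        # Top-down path: decode with the transposed encoder weights.
        recon = [math.tanh(a) for a in matvec(Wt, code)]
    return code, recon

def mse(a, b):
    """Mean squared error between two equal-length vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b)) / len(a)

random.seed(0)
W = [[random.uniform(-0.5, 0.5) for _ in range(6)] for _ in range(3)]
full = [1.0, 0.0, 1.0, 1.0, 0.0, 1.0]       # toy "complete sketch"
corrupted = [1.0, 0.0, 0.0, 0.0, 0.0, 1.0]  # same vector with two entries masked
code, recon = lmser_style_pass(corrupted, W)
loss = mse(recon, full)  # reconstruction error measured against the full sketch
```

In a trained model, `W` would be optimized to reduce `loss`. Here the weights are random, so the example only shows how data flows through the encoder, the decoder, and the feedback path.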
computer science, artificial intelligence; engineering, electrical & electronic