SketchINR: A First Look into Sketches as Implicit Neural Representations

Hmrishav Bandyopadhyay, Ayan Kumar Bhunia, Pinaki Nath Chowdhury, Aneeshan Sain, Tao Xiang, Timothy Hospedales, Yi-Zhe Song
2024-03-14
Abstract: We propose SketchINR, to advance the representation of vector sketches with implicit neural models. A variable length vector sketch is compressed into a latent space of fixed dimension that implicitly encodes the underlying shape as a function of time and strokes. The learned function predicts the $xy$ point coordinates in a sketch at each time and stroke. Despite its simplicity, SketchINR outperforms existing representations at multiple tasks: (i) Encoding an entire sketch dataset into a fixed size latent vector, SketchINR gives $60\times$ and $10\times$ data compression over raster and vector sketches, respectively. (ii) SketchINR's auto-decoder provides a much higher-fidelity representation than other learned vector sketch representations, and is uniquely able to scale to complex vector sketches such as FS-COCO. (iii) SketchINR supports parallelisation that can decode/render $\sim 100\times$ faster than other learned vector representations such as SketchRNN. (iv) SketchINR, for the first time, emulates the human ability to reproduce a sketch with varying abstraction in terms of number and complexity of strokes. As a first look at implicit sketches, SketchINR's compact high-fidelity representation will support future work in modelling long and complex sketches.
Computer Vision and Pattern Recognition, Artificial Intelligence
What problem does this paper attempt to address?
The paper mainly aims to address the following issues:

### Research Background and Objectives

- **Current Challenges**: Existing digital sketch representation methods have limitations, including:
  - **Raster sketches** are compatible with convolutional neural networks (ConvNets) but lose the drawing order information;
  - **Vector sketches** retain time and stroke order information but require more storage as image complexity increases;
  - **Learned representations** (such as SketchRNN) have high computational costs, slow decoding speeds, and insufficient reconstruction accuracy when dealing with long sequences or complex images.
- **Research Objectives**: Propose a new implicit neural representation method, SketchINR, for efficient and high-fidelity representation of digital sketches, overcoming the limitations of existing methods.

### Solution Overview

- **Core Contribution**: For the first time, apply implicit neural representation to vector sketch modeling, representing sketches through timestamps and stroke tokens, achieving compact and high-fidelity sketch representation.
- **Specific Contributions**:
  1. **Compact and Efficient Representation**: SketchINR can encode an entire sketch dataset into a fixed-size latent space, offering better compression ratios than existing methods.
  2. **High-Fidelity Reconstruction**: Compared to other learned vector sketch representations, SketchINR can more accurately reconstruct complex sketches.
  3. **Fast Decoding**: SketchINR supports parallel decoding, which is approximately 100 times faster than autoregressive models.
  4. **Diverse Applications**: SketchINR supports various downstream tasks, including generation, interpolation, completion, and abstraction.

### Technical Details

- **Implicit Function**: Represent sketch points and stroke states through an implicit function $f_\theta$, which depends on timestamps $t_j$ and stroke tokens $s_k$.
- **Loss Function**: Optimize using mean squared error (MSE) loss combined with visual loss to ensure the reconstruction is not only numerically close to the original but also visually similar.
- **Multiple Sketch Representation**: Introduce a sketch descriptor $\nu_i$ to represent each sketch instance and achieve a universal representation of multiple sketches through a shared decoder.
- **Generative Model**: Use a variational autoencoder (VAE) to generate feature descriptors $\nu_i$, thereby creating new implicit sketches.

### Experimental Validation

- **Datasets**: Experiments are conducted using datasets of varying complexity, including FS-COCO, Sketchy, and Quick Draw!.
- **Performance Evaluation**: Quantitative evaluation metrics include Chamfer distance and retrieval accuracy; qualitative evaluation demonstrates SketchINR's reconstruction quality on both simple and complex sketches.
- **Application Scenarios**: Besides high-fidelity reconstruction, SketchINR's applications in sketch compression, interpolation (creative blending), generation, etc., are also demonstrated.

In summary, this paper proposes a new implicit neural representation method, SketchINR, aimed at addressing the limitations of existing sketch representation methods, especially in handling complex or long-sequence sketches. SketchINR not only improves representation efficiency and reconstruction accuracy but also expands the application range of sketch representation in downstream tasks.
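
To make the implicit function $f_\theta$ and the per-sketch descriptor $\nu_i$ more concrete, here is a minimal, untrained sketch in plain Python. It is not the authors' implementation: the layer sizes, the sinusoidal input encoding, the three-value output (x, y, and a pen-state logit), and every function name are assumptions made for illustration. It only shows the query pattern: each (timestamp, stroke token) pair is decoded independently of the others, which is what allows parallel decoding and resampling a sketch at a different density.

```python
import math
import random

def make_mlp(sizes, seed=0):
    """Random weights for a small fully connected net (untrained, illustration only)."""
    rng = random.Random(seed)
    return [
        ([[rng.gauss(0.0, 1.0 / math.sqrt(n_in)) for _ in range(n_in)]
          for _ in range(n_out)],
         [0.0] * n_out)
        for n_in, n_out in zip(sizes[:-1], sizes[1:])
    ]

def mlp_forward(layers, x):
    """Apply each (weights, bias) layer, with ReLU on all but the last layer."""
    for i, (w, b) in enumerate(layers):
        x = [sum(wi * xi for wi, xi in zip(row, x)) + bi for row, bi in zip(w, b)]
        if i < len(layers) - 1:
            x = [max(0.0, v) for v in x]
    return x

def encode(value, n_freqs=4):
    """Sinusoidal features of a scalar in [0, 1] (an assumed input encoding)."""
    return [f(2.0 ** k * math.pi * value)
            for k in range(n_freqs) for f in (math.sin, math.cos)]

def decode_point(layers, nu, t, s):
    """Hypothetical f_theta(t, s; nu) -> [x, y, pen_state_logit] for one query."""
    return mlp_forward(layers, encode(t) + encode(s) + list(nu))

def decode_sketch(layers, nu, n_strokes, points_per_stroke):
    """Decode a grid of (t, s) queries; no query depends on another's output."""
    queries = [
        (j / max(points_per_stroke - 1, 1), k / max(n_strokes - 1, 1))
        for k in range(n_strokes)
        for j in range(points_per_stroke)
    ]
    return [decode_point(layers, nu, t, s) for t, s in queries]

LATENT_DIM = 8                                    # assumed descriptor size
layers = make_mlp([8 + 8 + LATENT_DIM, 32, 3])    # shared decoder
nu = [0.1] * LATENT_DIM                           # descriptor for one sketch
coarse = decode_sketch(layers, nu, n_strokes=3, points_per_stroke=5)
dense = decode_sketch(layers, nu, n_strokes=3, points_per_stroke=50)
```

Because the decoder is shared and only `nu` changes per sketch, the same `layers` can serve a whole dataset, and the list comprehension in `decode_sketch` could be replaced by a single batched call in an array library. An autoregressive decoder, by contrast, would need each point before it could produce the next.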
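
The Chamfer distance named under Performance Evaluation can be illustrated with a few lines of Python. This is a generic symmetric variant (mean nearest-neighbour Euclidean distance in both directions); the summary does not say whether the paper uses squared distances, sums, or means, so treat the exact form and the function name as assumptions.

```python
import math

def chamfer_distance(a, b):
    """Symmetric Chamfer distance between two non-empty 2D point sets."""
    def one_way(src, dst):
        # Average distance from each source point to its nearest target point.
        return sum(min(math.dist(p, q) for q in dst) for p in src) / len(src)
    return one_way(a, b) + one_way(b, a)

original = [(0.0, 0.0), (1.0, 0.0)]
reconstruction = [(0.0, 1.0), (1.0, 1.0)]
score = chamfer_distance(original, reconstruction)  # 2.0
```

A lower value means the reconstructed points lie closer to the original ones; identical point sets give 0. The measure ignores point order, so it compares shape but not drawing sequence.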