Hmrishav Bandyopadhyay, Ayan Kumar Bhunia, Pinaki Nath Chowdhury, Aneeshan Sain, Tao Xiang, Timothy Hospedales, Yi-Zhe Song
Abstract: We propose SketchINR, to advance the representation of vector sketches with implicit neural models. A variable length vector sketch is compressed into a latent space of fixed dimension that implicitly encodes the underlying shape as a function of time and strokes. The learned function predicts the $xy$ point coordinates in a sketch at each time and stroke. Despite its simplicity, SketchINR outperforms existing representations at multiple tasks: (i) Encoding an entire sketch dataset into a fixed size latent vector, SketchINR gives $60\times$ and $10\times$ data compression over raster and vector sketches, respectively. (ii) SketchINR's auto-decoder provides a much higher-fidelity representation than other learned vector sketch representations, and is uniquely able to scale to complex vector sketches such as FS-COCO. (iii) SketchINR supports parallelisation that can decode/render $\sim 100\times$ faster than other learned vector representations such as SketchRNN. (iv) SketchINR, for the first time, emulates the human ability to reproduce a sketch with varying abstraction in terms of number and complexity of strokes. As a first look at implicit sketches, SketchINR's compact high-fidelity representation will support future work in modelling long and complex sketches.
What problem does this paper attempt to address?
The paper mainly aims to address the following issues:
### Research Background and Objectives
- **Current Challenges**: Existing digital sketch representation methods have limitations, including:
  - **Raster sketches** are compatible with convolutional neural networks (ConvNets) but lose the drawing-order information;
  - **Vector sketches** retain time and stroke-order information but require more storage as sketch complexity increases;
  - **Learned representations** (such as SketchRNN) are computationally expensive, slow to decode, and reconstruct long sequences or complex sketches inaccurately.
- **Research Objectives**: Propose a new implicit neural representation method, SketchINR, that represents digital sketches efficiently and with high fidelity, overcoming the limitations of existing methods.
### Solution Overview
- **Core Contribution**: Apply implicit neural representations to vector sketch modeling for the first time. A sketch is represented as a function of timestamps and stroke tokens, which yields a compact, high-fidelity representation.
- **Specific Contributions**:
  1. **Compact and Efficient Representation**: SketchINR encodes an entire sketch dataset into a fixed-size latent representation, giving $60\times$ and $10\times$ data compression over raster and vector sketches, respectively.
  2. **High-Fidelity Reconstruction**: SketchINR's auto-decoder reconstructs sketches with higher fidelity than other learned vector sketch representations and scales to complex sketches such as those in FS-COCO.
  3. **Fast Decoding**: SketchINR supports parallel decoding, which is approximately 100 times faster than autoregressive learned representations such as SketchRNN.
  4. **Diverse Applications**: SketchINR supports various downstream tasks, including generation, interpolation, completion, and abstraction.
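The speed-up in point 3 follows from the dependency structure rather than from any particular network: an implicit function maps each timestamp to a point independently, so all points can be evaluated in one batched call, whereas an autoregressive decoder must wait for point $j-1$ before producing point $j$. The toy sketch below illustrates that contrast only. `implicit_point` and `autoregressive_step` are hypothetical stand-ins (a closed-form curve and a simple recurrence), not the paper's models.

```python
import numpy as np

def implicit_point(t):
    """Hypothetical stand-in for an implicit sketch function: t in [0, 1] -> (x, y).

    Accepts an array of timestamps; each output depends only on its own t.
    """
    t = np.asarray(t, dtype=float)
    return np.stack([np.cos(2 * np.pi * t), np.sin(2 * np.pi * t)], axis=-1)

def autoregressive_step(prev_xy, step_angle):
    """Hypothetical stand-in for an autoregressive decoder step: the next
    point is computed from the previous one (here, a fixed rotation)."""
    c, s = np.cos(step_angle), np.sin(step_angle)
    rotation = np.array([[c, -s], [s, c]])
    return rotation @ prev_xy

num_points = 200
timestamps = np.linspace(0.0, 1.0, num_points)

# Implicit: one batched call, no dependency between points.
parallel_points = implicit_point(timestamps)

# Autoregressive: a sequential loop, each step needs the previous output.
step_angle = 2 * np.pi / (num_points - 1)
sequential_points = [np.array([1.0, 0.0])]
for _ in range(num_points - 1):
    sequential_points.append(autoregressive_step(sequential_points[-1], step_angle))
sequential_points = np.stack(sequential_points)
```

Both variants trace the same curve here; the difference is that the first can be spread across parallel hardware while the second cannot skip ahead.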
### Technical Details
- **Implicit Function**: Represent sketch points and stroke states through an implicit function $f_\theta$, which depends on timestamps $t_j$ and stroke tokens $s_k$.
- **Loss Function**: Optimize using mean squared error (MSE) loss combined with visual loss to ensure the reconstruction is not only numerically close to the original but also visually similar.
- **Multiple Sketch Representation**: Introduce a sketch descriptor $\nu_i$ to represent each sketch instance and achieve a universal representation of multiple sketches through a shared decoder.
- **Generative Model**: Use a variational autoencoder (VAE) to generate feature descriptors $\nu_i$, thereby creating new implicit sketches.
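The pieces above can be put together in a minimal NumPy sketch. It assumes a small, randomly initialised MLP as the shared decoder $f_\theta$, a sinusoidal encoding of the timestamp, a scalar stroke token, and a three-value output (x, y, pen state). These choices, along with every name and dimension, are illustrative assumptions rather than the paper's architecture. The visual loss is omitted because this summary does not specify it.

```python
import numpy as np

rng = np.random.default_rng(0)

LATENT_DIM = 16   # size of the per-sketch descriptor nu_i (assumed)
NUM_FREQS = 4     # sinusoidal frequencies for the timestamp (assumed)
HIDDEN_DIM = 32   # hidden width of the toy decoder (assumed)

def encode_time(t):
    """Map timestamps of shape (N,) to sinusoidal features of shape (N, 2 * NUM_FREQS)."""
    freqs = 2.0 ** np.arange(NUM_FREQS) * np.pi
    angles = t[:, None] * freqs[None, :]
    return np.concatenate([np.sin(angles), np.cos(angles)], axis=1)

def init_decoder():
    """Random weights for a two-layer MLP shared by all sketches."""
    in_dim = 2 * NUM_FREQS + 1 + LATENT_DIM  # time features + stroke token + descriptor
    return {
        "w1": rng.normal(0.0, 0.1, (in_dim, HIDDEN_DIM)),
        "b1": np.zeros(HIDDEN_DIM),
        "w2": rng.normal(0.0, 0.1, (HIDDEN_DIM, 3)),
        "b2": np.zeros(3),
    }

def decode(theta, nu, t, s):
    """Evaluate the decoder at every (t_j, s_k) query in one batched pass.

    nu: (LATENT_DIM,) descriptor; t, s: (N,) timestamps and stroke tokens.
    Returns xy of shape (N, 2) and pen-state probabilities of shape (N,).
    """
    nu_tiled = np.broadcast_to(nu, (t.shape[0], nu.shape[0]))
    features = np.concatenate([encode_time(t), s[:, None], nu_tiled], axis=1)
    hidden = np.tanh(features @ theta["w1"] + theta["b1"])
    out = hidden @ theta["w2"] + theta["b2"]
    xy = out[:, :2]
    pen = 1.0 / (1.0 + np.exp(-out[:, 2]))  # sigmoid
    return xy, pen

def mse_loss(pred_xy, target_xy):
    """Mean squared error between predicted and ground-truth coordinates."""
    return float(np.mean((pred_xy - target_xy) ** 2))

def sample_descriptor(mu, log_var):
    """VAE-style reparameterised sample: nu = mu + sigma * eps."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

theta = init_decoder()
num_points = 50
t = np.linspace(0.0, 1.0, num_points)
s = np.repeat([0.0, 1.0], num_points // 2)  # two strokes, 25 points each

# Draw a descriptor from a standard-normal prior and decode it.
nu = sample_descriptor(np.zeros(LATENT_DIM), np.zeros(LATENT_DIM))
xy, pen = decode(theta, nu, t, s)
loss = mse_loss(xy, np.zeros_like(xy))  # placeholder target, for illustration only
```

In an auto-decoder setup, `nu` and `theta` would be optimised jointly by gradient descent on this loss; the optimiser is left out to keep the sketch short.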
### Experimental Validation
- **Datasets**: Experiments are conducted on datasets of varying complexity, including FS-COCO, Sketchy, and Quick, Draw!.
- **Performance Evaluation**: Quantitative evaluation metrics include Chamfer distance and retrieval accuracy; qualitative evaluation demonstrates SketchINR's reconstruction quality on both simple and complex sketches.
- **Application Scenarios**: Besides high-fidelity reconstruction, SketchINR's applications in sketch compression, interpolation (creative blending), generation, etc., are also demonstrated.
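Chamfer distance, one of the quantitative metrics above, compares two point sets by matching each point to its nearest neighbour in the other set. A minimal NumPy sketch follows, assuming the symmetric variant that averages unsquared Euclidean nearest-neighbour distances in both directions. The paper may use a different variant (squared distances, or sums instead of means), and the function name is illustrative.

```python
import numpy as np

def chamfer_distance(points_a, points_b):
    """Symmetric Chamfer distance between point sets of shape (N, 2) and (M, 2).

    Assumed variant: mean nearest-neighbour Euclidean distance from A to B
    plus the same from B to A.
    """
    a = np.asarray(points_a, dtype=float)
    b = np.asarray(points_b, dtype=float)
    # pairwise[i, j] is the distance between a[i] and b[j].
    pairwise = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    a_to_b = pairwise.min(axis=1).mean()
    b_to_a = pairwise.min(axis=0).mean()
    return float(a_to_b + b_to_a)

# A toy "sketch" and a reconstruction shifted upward by 0.1.
original = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0]])
reconstruction = original + np.array([0.0, 0.1])
```

A perfect reconstruction scores zero, and the score grows as reconstructed points drift away from the original ones.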
In summary, this paper proposes a new implicit neural representation method, SketchINR, aimed at addressing the limitations of existing sketch representation methods, especially in handling complex or long-sequence sketches. SketchINR not only improves representation efficiency and reconstruction accuracy but also expands the application range of sketch representation in downstream tasks.