MambaPainter: Neural Stroke-Based Rendering in a Single Step

Tomoya Sawada, Marie Katsurai
2024-10-16
Abstract: Stroke-based rendering aims to reconstruct an input image into an oil painting style by predicting brush stroke sequences. Conventional methods perform this prediction stroke-by-stroke or require multiple inference steps due to the limitations of a predictable number of strokes. This procedure leads to inefficient translation speed, limiting their practicality. In this study, we propose MambaPainter, capable of predicting a sequence of over 100 brush strokes in a single inference step, resulting in rapid translation. We achieve this sequence prediction by incorporating the selective state-space model. Additionally, we introduce a simple extension to patch-based rendering, which we use to translate high-resolution images, improving the visual quality with a minimal increase in computational cost. Experimental results demonstrate that MambaPainter can efficiently translate inputs to oil painting-style images compared to state-of-the-art methods. The codes are available at https://github.com/STomoya/MambaPainter.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
This paper addresses the efficient conversion of input images into oil-painting-style images in a single inference step. Traditional methods usually predict the brush stroke sequence stroke by stroke or over multiple inference steps, which makes conversion slow and limits practical use. The paper proposes MambaPainter, which predicts the parameters of more than 100 brush strokes in one inference step, enabling fast image conversion. The authors also introduce a simple extension to patch-based rendering for high-resolution images, which improves visual quality with only a small increase in computational cost.

### Key points:

1. **Problem background**: Traditional stroke-based rendering (SBR) methods need multiple inference steps to predict the stroke sequence when generating oil-painting-style images, resulting in slow conversion and limited practicality.
2. **Solution**: MambaPainter uses the selective state-space model (SSM) to predict more than 100 stroke parameters in a single inference step, significantly improving conversion efficiency.
3. **Technical details**:
   - **Model architecture**: MambaPainter contains two main components: a stroke predictor and an image encoder. The stroke predictor alternately stacks selective SSM layers and cross-attention layers to predict a sequence of stroke parameters that reconstructs the input source image.
   - **Training process**: The model is first trained with an $L_2$ loss. To improve the diversity of the predicted strokes, a non-saturating generative adversarial network objective is added in the second half of training.
   - **High-resolution image processing**: A simple extension of patch-based rendering resolves the discontinuities that traditional methods produce at patch edges while keeping the computational cost low.
4. **Experimental results**: Compared with existing state-of-the-art methods, MambaPainter converts images more efficiently across different resolutions, achieving the best LPIPS score and the second-best $L_2$ distance.

### Conclusion:

MambaPainter provides an efficient way to convert input images into an oil-painting style in a single inference step. It achieves strong reconstruction performance while translating more efficiently than existing state-of-the-art methods. Future work will evaluate the method's performance in neural style transfer.
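
The two-phase training objective mentioned under the training process (an $L_2$ loss, with a non-saturating GAN term added in the second half of training) can be illustrated with a minimal, framework-free sketch. This is not the authors' implementation: the function names, the `adv_weight` parameter, the use of raw discriminator logits, and the exact switch point at `total_steps // 2` are all assumptions made for illustration.

```python
import math


def l2_loss(rendered, target):
    """Mean squared error between two equally sized flat pixel sequences."""
    if len(rendered) != len(target):
        raise ValueError("rendered and target must have the same length")
    return sum((r - t) ** 2 for r, t in zip(rendered, target)) / len(rendered)


def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def nonsaturating_generator_loss(fake_logits):
    """Non-saturating generator loss: mean of -log(sigmoid(logit)).

    Uses the identity -log(sigmoid(z)) = softplus(-z). `fake_logits` are
    hypothetical discriminator outputs for rendered (fake) images.
    """
    return sum(softplus(-z) for z in fake_logits) / len(fake_logits)


def predictor_loss(rendered, target, fake_logits, step, total_steps,
                   adv_weight=1.0):
    """L2 loss only in the first half; L2 plus adversarial term afterwards.

    `adv_weight` and the half-way switch rule are illustrative assumptions.
    """
    loss = l2_loss(rendered, target)
    if step >= total_steps // 2:
        loss += adv_weight * nonsaturating_generator_loss(fake_logits)
    return loss


if __name__ == "__main__":
    rendered, target, logits = [0.0, 1.0], [0.0, 0.0], [0.0]
    print(predictor_loss(rendered, target, logits, step=0, total_steps=10))
    print(predictor_loss(rendered, target, logits, step=5, total_steps=10))
```

In a real setup the rendered image would come from a differentiable stroke renderer and the logits from a trained discriminator; here both are plain lists so the switch between the two phases is easy to inspect.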