WaveMixSR: A Resource-efficient Neural Network for Image Super-resolution

Pranav Jeevan, Akella Srinidhi, Pasunuri Prathiba, Amit Sethi
2023-07-02
Abstract: Image super-resolution research has recently been dominated by transformer models, which need higher computational resources than CNNs due to the quadratic complexity of self-attention. We propose a new neural network -- WaveMixSR -- for image super-resolution based on the WaveMix architecture, which uses a 2D discrete wavelet transform for spatial token-mixing. Unlike transformer-based models, WaveMixSR does not unroll the image as a sequence of pixels/patches. It uses the inductive bias of convolutions along with the lossless token-mixing property of the wavelet transform to achieve higher performance while requiring fewer resources and less training data. We compare the performance of our network with other state-of-the-art methods for image super-resolution. Our experiments show that WaveMixSR achieves competitive performance on all datasets and reaches state-of-the-art performance on the BSD100 dataset on multiple super-resolution tasks. Our model achieves this performance using less training data and fewer computational resources while maintaining high parameter efficiency compared to current state-of-the-art models.
Computer Vision and Pattern Recognition, Artificial Intelligence
What problem does this paper attempt to address?
This paper addresses resource efficiency in image super-resolution (SR). Current SR research is dominated by transformer models, which perform well but demand substantial computational resources, largely because self-attention has quadratic complexity in the number of tokens. Convolutional Neural Networks (CNNs) use resources more efficiently but are less effective than transformers at capturing long-range dependencies. To address both limitations, the authors propose WaveMixSR, a neural network based on the WaveMix architecture that uses a 2D Discrete Wavelet Transform (2D-DWT) for spatial token mixing instead of unrolling the image into a sequence of pixels or patches.

The main features of WaveMixSR are:

1. **Resource Efficiency**: Compared to transformer models, WaveMixSR needs fewer computational resources and maintains high parameter efficiency.
2. **High Performance**: Despite its lower resource demands, WaveMixSR achieves competitive performance on all evaluated datasets and reaches state-of-the-art performance on BSD100 across multiple super-resolution tasks.
3. **Data Efficiency**: WaveMixSR achieves this performance with less training data, without the need for large-scale pre-training.

With these features, WaveMixSR aims to reduce the resource consumption of image super-resolution while maintaining high performance.
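
To make the "lossless token-mixing" property of the 2D-DWT more concrete, here is a minimal NumPy sketch of a single-level 2D transform. It assumes a Haar wavelet, which the text above does not specify, and the function names `haar_dwt2` and `haar_idwt2` are hypothetical helpers written for this illustration. It is not the authors' implementation and omits the convolutional layers of the network. Each output value mixes a 2x2 neighborhood of input pixels, the spatial resolution is halved, and the input can be recovered exactly from the four sub-bands.

```python
import numpy as np


def haar_dwt2(x):
    """Single-level 2D Haar DWT of an (H, W) array with even H and W.

    Returns an array of shape (4, H/2, W/2): one approximation band
    followed by three detail bands. Haar is an assumption of this sketch.
    """
    x = np.asarray(x, dtype=np.float64)
    a = x[0::2, 0::2]  # top-left pixel of each 2x2 block
    b = x[0::2, 1::2]  # top-right
    c = x[1::2, 0::2]  # bottom-left
    d = x[1::2, 1::2]  # bottom-right
    approx = (a + b + c + d) / 2.0      # local average (low-pass)
    col_diff = (a - b + c - d) / 2.0    # difference across columns
    row_diff = (a + b - c - d) / 2.0    # difference across rows
    diag_diff = (a - b - c + d) / 2.0   # diagonal difference
    return np.stack([approx, col_diff, row_diff, diag_diff])


def haar_idwt2(bands):
    """Inverse of haar_dwt2: rebuilds the (H, W) array from the 4 sub-bands."""
    approx, col_diff, row_diff, diag_diff = bands
    h, w = approx.shape
    x = np.empty((2 * h, 2 * w), dtype=np.float64)
    x[0::2, 0::2] = (approx + col_diff + row_diff + diag_diff) / 2.0
    x[0::2, 1::2] = (approx - col_diff + row_diff - diag_diff) / 2.0
    x[1::2, 0::2] = (approx + col_diff - row_diff - diag_diff) / 2.0
    x[1::2, 1::2] = (approx - col_diff - row_diff + diag_diff) / 2.0
    return x


# Toy 8x8 "image": the transform yields four 4x4 sub-bands.
image = np.arange(64, dtype=np.float64).reshape(8, 8)
bands = haar_dwt2(image)
restored = haar_idwt2(bands)

print(bands.shape)                    # (4, 4, 4)
print(np.allclose(restored, image))   # True: no information is lost
```

In this sketch the four sub-bands together hold exactly as many values as the input, which is why the transform can shrink the spatial grid without discarding information, in contrast to pooling or strided subsampling.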