Restore-RWKV: Efficient and Effective Medical Image Restoration with RWKV

Zhiwen Yang, Hui Zhang, Dan Zhao, Bingzheng Wei, Yan Xu
2024-07-31
Abstract: Transformers have revolutionized medical image restoration, but their quadratic complexity still limits their application to high-resolution medical images. The recent advent of RWKV in the NLP field has attracted much attention because it can process long sequences efficiently. To leverage its advanced design, we propose Restore-RWKV, the first RWKV-based model for medical image restoration. Since the original RWKV model is designed for 1D sequences, we make two necessary modifications for modeling spatial relations in 2D images. First, we present a recurrent WKV (Re-WKV) attention mechanism that captures global dependencies with linear computational complexity. Re-WKV incorporates bidirectional attention as the basis for a global receptive field and recurrent attention to effectively model 2D dependencies from various scan directions. Second, we develop an omnidirectional token shift (Omni-Shift) layer that enhances local dependencies by shifting tokens from all directions and across a wide context range. These adaptations make the proposed Restore-RWKV an efficient and effective model for medical image restoration. Extensive experiments demonstrate that Restore-RWKV achieves superior performance across various medical image restoration tasks, including MRI image super-resolution, CT image denoising, PET image synthesis, and all-in-one medical image restoration. Code is available at: https://github.com/Yaziwel/Restore-RWKV.
Image and Video Processing, Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
### Problems Addressed by the Paper

This paper addresses key issues in Medical Image Restoration (MedIR), particularly the limitations of existing models in efficiently handling high-resolution medical images. Specifically:

1. **Limitations of Existing Models**:
   - **CNN Models**: The effective receptive field is constrained by the size of the convolutional kernels, making it difficult to capture the broader context needed to compensate for detail loss in low-quality images.
   - **Transformer Models**: Self-attention can model long-range dependencies, but its computational complexity grows quadratically with spatial resolution, which makes these models impractical for high-resolution medical images.
   - **Mamba Models**: Despite their lower computational complexity, they still face challenges in achieving optimal receptive fields in 2D images.
2. **Proposed Methods**:
   - **Re-WKV Attention Mechanism**: Introduces a recurrent WKV (Re-WKV) attention mechanism that captures global dependencies with linear computational complexity. Bidirectional attention provides a global receptive field, and recurrent attention models 2D dependencies from various scan directions.
   - **Omni-Shift Layer**: Develops an omnidirectional token shift (Omni-Shift) layer that enhances local dependencies by shifting tokens from all directions and across a wide context range.
3. **Main Contributions**:
   - Proposes Restore-RWKV, an efficient and effective medical image restoration model based on RWKV.
   - Demonstrates through experiments the superior performance of Restore-RWKV on various medical image restoration tasks, including MRI image super-resolution, CT image denoising, PET image synthesis, and all-in-one medical image restoration.

Through these innovations, the paper aims to provide a medical image restoration method that models both global and local dependencies while remaining efficient.
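To make the complexity and scan-direction points more concrete, here is a minimal Python sketch. It is not the authors' implementation. It assumes one token per pixel, and the helper names (`attention_costs`, `scan_orders`, `sequence_distance`) are hypothetical. The row-major and column-major orders stand in for "various scan directions" purely as an illustration; the paper's actual scan pattern is not reproduced here.

```python
def attention_costs(height, width):
    """Return (tokens, pairwise_interactions) for an image of the given size.

    Assumption: one token per pixel. Real models usually operate on patches
    or feature maps, which changes N but not how the cost scales with N.
    """
    n = height * width
    return n, n * n


def scan_orders(height, width):
    """Return the row-major and column-major visiting orders of a grid.

    Each entry is a (row, col) coordinate. These two orders are only an
    illustration of flattening a 2D grid along different directions.
    """
    row_major = [(r, c) for r in range(height) for c in range(width)]
    col_major = [(r, c) for c in range(width) for r in range(height)]
    return row_major, col_major


def sequence_distance(order, a, b):
    """Distance between two grid positions after flattening with `order`."""
    return abs(order.index(a) - order.index(b))


if __name__ == "__main__":
    # Doubling the side length gives 4x the tokens and 16x the pairwise terms.
    for side in (256, 512):
        n, pairs = attention_costs(side, side)
        print(f"{side}x{side}: N={n}, N^2={pairs}")

    # Vertically adjacent pixels are far apart in a row-major sequence
    # but adjacent in a column-major one.
    rows, cols = scan_orders(4, 4)
    print(sequence_distance(rows, (0, 0), (1, 0)))
    print(sequence_distance(cols, (0, 0), (1, 0)))
```

Under these assumptions, a cost linear in N grows 4x when the image side doubles, while a quadratic cost grows 16x. The distance comparison suggests why a single 1D scan represents 2D neighborhoods unevenly, which is the motivation the summary gives for combining several scan directions.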