Generation of Super-Resolution for Medical Image Via a Self-Prior Guided Mamba Network with Edge-Aware Constraint

Zexin Ji,Beiji Zou,Xiaoyan Kui,Hua Li,Pierre Vera,Su Ruan
DOI: https://doi.org/10.1016/j.patrec.2024.11.020
IF: 4.757
2024-01-01
Pattern Recognition Letters
Abstract:Existing deep learning-based super-resolution generation approaches usually depend on the backbone of convolutional neural networks (CNNs) or Transformers. CNN-based approaches are unable to model long-range dependencies, whereas Transformer-based approaches encounter significant computational burdens due to quadratic complexity in calculations. Moreover, high-frequency texture details in images generated by existing approaches still remain indistinct, posing a major challenge in super-resolution tasks. To overcome these problems, we propose a self-prior guided Mamba network with edge-aware constraint (SEMambaSR) for medical image super-resolution. Recently, State Space Models (SSMs), notably Mamba, have gained prominence for the ability to efficiently model long-range dependencies with low complexity. In this paper, we propose to integrate Mamba into the Unet network allowing to extract multi-scale local and global features to generate high-quality super-resolution images. Additionally, we introduce perturbations by randomly adding a brightness window to the input image, enabling the network to mine the self-prior information of the image. We also design an improved 2D-Selective-Scan (ISS2D) module to learn and adaptively fuse multi-directional long-range dependencies in image features to enhance feature representation. An edge-aware constraint is exploited to learn the multi-scale edge information from encoder features for better synthesis of texture boundaries. Our qualitative and quantitative experimental findings indicate superior super-resolution performance over current methods on IXI and BraTS2021 medical datasets. Specifically, our approach achieved a PSNR of 33.44 dB and an SSIM of 0.9371 on IXI, and a PSNR of 41.99 dB and an SSIM of 0.9846 on BraTS2021, both for 2× upsampling. The downstream vision task on brain tumor segmentation, using a U-Net network, also reveals the effectiveness of our approach, with a mean Dice Score of 57.06% on the BraTS2021 dataset.
What problem does this paper attempt to address?