Detail Matters: Mamba-Inspired Joint Unfolding Network for Snapshot Spectral Compressive Imaging

Mengjie Qin, Yuchao Feng, Zongliang Wu, Yulun Zhang, Xin Yuan
2025-01-02
Abstract: In the coded aperture snapshot spectral imaging system, Deep Unfolding Networks (DUNs) have made impressive progress in recovering 3D hyperspectral images (HSIs) from a single 2D measurement. However, the inherent nonlinear and ill-posed characteristics of HSI reconstruction still pose challenges to existing methods in terms of accuracy and stability. To address this issue, we propose a Mamba-inspired Joint Unfolding Network (MiJUN), which integrates physics-embedded DUNs with learning-based HSI imaging. Firstly, leveraging the concept of trapezoid discretization to expand the representation space of unfolding networks, we introduce an accelerated unfolding network scheme. This approach can be interpreted as a generalized accelerated half-quadratic splitting with a second-order differential equation, which reduces the reliance on initial optimization stages and addresses challenges related to long-range interactions. Crucially, within the Mamba framework, we restructure the Mamba-inspired global-to-local attention mechanism by incorporating a selective state space model and an attention mechanism. This effectively reinterprets Mamba as a variant of the Transformer architecture, improving its adaptability and efficiency. Furthermore, we refine the scanning strategy with Mamba by integrating the tensor mode-$k$ unfolding into the Mamba network. This approach emphasizes the low-rank properties of tensors along various modes, while conveniently facilitating 12 scanning directions. Numerical and visual comparisons on both simulation and real datasets demonstrate the superiority of our proposed MiJUN, which achieves overwhelming detail representation.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
This paper targets the reconstruction of three-dimensional hyperspectral images (HSIs) in the Coded Aperture Snapshot Spectral Imaging (CASSI) system, where existing methods still struggle with accuracy and stability. Specifically, HSI reconstruction is inherently nonlinear and ill-posed, which makes high-quality results hard to obtain. To address this, the authors propose a Mamba-inspired Joint Unfolding Network (MiJUN), which combines physics-embedded Deep Unfolding Networks (DUNs) with learning-based HSI imaging. The main contributions of MiJUN are:

1. **Accelerated Unfolding Network**: Trapezoidal discretization is used to expand the representation space of the unfolding network, yielding an accelerated unfolding scheme. The scheme can be interpreted as a generalized accelerated half-quadratic splitting in the form of a second-order differential equation. It reduces the dependence on the initial optimization stages and addresses the challenge of long-range interactions.
2. **Global-to-Local Attention Mechanism**: A selective state-space model and an attention mechanism are combined within the Mamba framework, effectively reinterpreting Mamba as a variant of the Transformer architecture and improving its adaptability and efficiency.
3. **Tensor Mode Unfolding Strategy**: For the first time, tensor mode-$k$ unfolding is integrated into the Mamba module. This reduces complex tensor operations to more tractable matrix operations, emphasizes the low-rank property along each mode, and enables scanning in 12 directions.
4. **Accelerated Half-Quadratic Splitting (A-HQS)**: An interpretable A-HQS algorithm is introduced. Within the iterative solution framework, redundant elements are discarded, which accelerates convergence.
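The mode-$k$ unfolding mentioned in contribution 3 can be illustrated with a minimal NumPy sketch. The function names `mode_k_unfold` and `scan_sequences` are hypothetical. The way 12 sequences are counted here (3 modes × 2 flattening orders × 2 directions) is an assumption made for illustration only and may differ from the enumeration used in the paper.

```python
import numpy as np

def mode_k_unfold(x, k):
    """Mode-k unfolding: axis k becomes the rows, all other axes are flattened into columns."""
    return np.moveaxis(x, k, 0).reshape(x.shape[k], -1)

def scan_sequences(x):
    """Hypothetical enumeration of 12 scan orders for a 3D tensor.

    Assumption of this sketch: 3 modes x 2 flattening orders x 2 directions.
    """
    seqs = []
    for k in range(x.ndim):
        for order in ("C", "F"):
            # Same unfolding, but the columns are visited in a different order.
            mat = np.moveaxis(x, k, 0).reshape(x.shape[k], -1, order=order)
            seq = mat.T             # one token per column of the unfolding
            seqs.append(seq)        # forward scan
            seqs.append(seq[::-1])  # backward scan
    return seqs

# Usage: a small H x W x N cube.
cube = np.arange(24, dtype=float).reshape(2, 3, 4)
print(mode_k_unfold(cube, 1).shape)  # (3, 8)
print(len(scan_sequences(cube)))     # 12
```

Each unfolding is an ordinary matrix, so low-rank structure along a mode can be inspected with standard tools such as `np.linalg.matrix_rank` or a singular value decomposition.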
Through these improvements, MiJUN shows superior performance in numerical and visual comparisons on simulated and real-world datasets, especially in detail representation, while reducing computational cost.

### Formula Summary

- **Degradation Model**:
  \[ Y=\sum_{n_{\lambda}=1}^{N_{\lambda}}\tilde{X}(:,:,n_{\lambda})\odot\tilde{M}(:,:,n_{\lambda})+B \]
  where \(Y\) is the 2D measurement, \(B\) is the additive noise, \(N_{\lambda}\) is the number of spectral bands, and \(\tilde{X}\) and \(\tilde{M}\) are the shifted HSI data and the shifted mask, respectively.
- **Optimization Problem**:
  \[ \arg\min_{x}\frac{1}{2}\|y-\Phi x\|^{2}+\tau R(x) \]
  where \(y\) and \(x\) are the vectorized measurement and HSI, \(\Phi\) is the sensing matrix, \(R(x)\) is the regularization term, and \(\tau\) is the noise-balancing factor.
- **Accelerated Half-Quadratic Splitting (A-HQS)**:
  \[ x^{k + 1}=\arg\min_{x}\frac{1}{2}\|y-\Phi x\|^{2}+\frac{\mu}{2}\|x-\hat{z}^{k}\|^{2} \]
  \[ z^{k + 1}=\arg\min_{z}\frac{\mu}{2}\|x^{k + 1}-z\|^{2}+\tau R(z) \]
  \[ \hat{z}^{k + 1}=z^{k + 1}+\beta_{k + 1}(z^{k + 1}-z^{k}) \]
  where \(\mu\) is the penalty parameter and \(\beta_{k+1}\) is the extrapolation (momentum) weight.

These formulas describe the key steps and optimization processes for recovering an HSI from a single measurement.
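The three A-HQS steps can be sketched in NumPy under several assumptions that are not taken from the paper:

- The cube and mask are already shifted, so \(\Phi x\) is a mask-weighted sum over bands.
- The regularizer is \(R(z)=\|z\|_{1}\), so the z-update is soft thresholding. This stands in for the learned network.
- The momentum weight \(\beta\) follows a FISTA-style schedule.
- The x-update uses the extrapolated variable from the previous iteration, so every step depends only on quantities already computed.

All function names are hypothetical. Because \(\Phi\Phi^{\top}\) is diagonal for this forward model, the x-update has the closed form \(x=\hat{z}+\Phi^{\top}\big[(y-\Phi\hat{z})/(\mu+\operatorname{diag}(\Phi\Phi^{\top}))\big]\).

```python
import numpy as np

def forward(x, mask):
    """Phi x: modulate each (already shifted) band by its mask, then sum over bands."""
    return (x * mask).sum(axis=-1)

def adjoint(y, mask):
    """Phi^T y: spread the 2D measurement back onto every band through the mask."""
    return y[..., None] * mask

def x_update(y, z_hat, mask, mu):
    """Closed-form minimizer of 0.5*||y - Phi x||^2 + 0.5*mu*||x - z_hat||^2."""
    phi_phi_t = (mask ** 2).sum(axis=-1)  # diagonal of Phi Phi^T, one value per pixel
    residual = y - forward(z_hat, mask)
    return z_hat + adjoint(residual / (mu + phi_phi_t), mask)

def soft_threshold(v, t):
    """Proximal operator of t*||.||_1 (assumed stand-in for the learned prior step)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def a_hqs(y, mask, mu=0.1, tau=0.01, n_iter=50):
    """Accelerated HQS sketch with a FISTA-style momentum schedule (an assumption)."""
    z = adjoint(y, mask)  # simple initialization
    z_hat = z.copy()
    t = 1.0
    x = z
    for _ in range(n_iter):
        x = x_update(y, z_hat, mask, mu)            # data-fidelity step
        z_new = soft_threshold(x, tau / mu)         # prior step
        t_new = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        beta = (t - 1.0) / t_new                    # momentum weight beta_{k+1}
        z_hat = z_new + beta * (z_new - z)          # extrapolation step
        z, t = z_new, t_new
    return x

# Usage on synthetic data (sizes chosen arbitrarily).
rng = np.random.default_rng(0)
mask = rng.integers(0, 2, size=(8, 10, 4)).astype(float)
x_true = rng.random((8, 10, 4))
y = forward(x_true, mask)
x_rec = a_hqs(y, mask)
print(x_rec.shape)
```

In an unfolding network, each loop iteration becomes one stage, and the hand-crafted proximal step is replaced by a learned module.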