Sampling-Priors-Augmented Deep Unfolding Network for Robust Video Compressive Sensing

Yuhao Huang, Gangrong Qu, Youran Ge
2023-07-14
Abstract: Video Compressed Sensing (VCS) aims to reconstruct multiple frames from one single captured measurement, thus achieving high-speed scene recording with a low-frame-rate sensor. Although there have been impressive advances in VCS recently, those state-of-the-art (SOTA) methods also significantly increase model complexity and suffer from poor generality and robustness, which means that those networks need to be retrained to accommodate the new system. Such limitations hinder the real-time imaging and practical deployment of models. In this work, we propose a Sampling-Priors-Augmented Deep Unfolding Network (SPA-DUN) for efficient and robust VCS reconstruction. Under the optimization-inspired deep unfolding framework, a lightweight and efficient U-net is exploited to downsize the model while improving overall performance. Moreover, the prior knowledge from the sampling model is utilized to dynamically modulate the network features to enable single SPA-DUN to handle arbitrary sampling settings, augmenting interpretability and generality. Extensive experiments on both simulation and real datasets demonstrate that SPA-DUN is not only applicable for various sampling settings with one single model but also achieves SOTA performance with incredible efficiency.
Computer Vision and Pattern Recognition, Image and Video Processing
What problem does this paper attempt to address?
This paper addresses several problems in video compressive sensing (VCS) that limit the effectiveness and generality of existing methods in practice:

1. **High complexity and low efficiency**: Existing deep-network-based methods have improved reconstruction quality, but their complex model structures make training and inference expensive, which stands in the way of real-time imaging and practical deployment.
2. **Poor generalization ability and robustness**: Most existing deep-network methods must be retrained for each specific sampling setting to adapt to a new system configuration. This increases storage requirements and takes considerable time. These models also perform poorly on unseen sampling settings.
3. **Lack of physical guidance**: Existing end-to-end (E2E) networks learn the recovery mapping without explicit physical guidance, which makes learning harder and leaves performance dependent on carefully designed architectures.

To address these problems, the paper proposes a **Sampling-Priors-Augmented Deep Unfolding Network (SPA-DUN)** for efficient and robust VCS reconstruction. The main contributions are:

- **Lightweight and efficient U-net**: By extracting the key components of advanced image-to-image networks, the authors design a more concise and effective U-net as the backbone, which significantly reduces model complexity while increasing network capacity.
- **Sampling-Priors-Augmented Learning (SPA-Learning)**: A sampling-prior-augmented strategy is applied at both the training level and the architecture level, making the network robust to unseen sampling settings without retraining.
- **New benchmark performance**: The method sets a new state of the art in reconstruction quality, model complexity, computational speed, and generalization ability, which supports its use in practical VCS systems.

With these contributions, a single SPA-DUN model handles multiple sampling settings while remaining efficient and compact enough for practical video compressive sensing.
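
To make the reconstruction setting concrete, here is a minimal NumPy sketch of an unfolding-style loop. It assumes the standard snapshot measurement model, in which each frame is multiplied by its own mask and the results are summed into one measurement; the summary above does not spell this model out. All function names are hypothetical, and `placeholder_prior` is a hand-written stand-in for the learned U-net stage. Its use of the masks only hints at the idea of feeding sampling information into the prior stage and is not the authors' modulation mechanism.

```python
import numpy as np


def forward(x, masks):
    """Snapshot measurement (assumed model): y = sum_t masks[t] * x[t]."""
    return (masks * x).sum(axis=0)


def adjoint(y, masks):
    """Transpose of `forward`: spreads a 2-D image back over the frames."""
    return masks * y[None, :, :]


def projection_step(x, y, masks, eps=1e-8):
    """Data-fidelity step: project x onto {x : forward(x) = y}.

    The closed form is cheap because Phi Phi^T is diagonal for this model,
    with diagonal entries sum_t masks[t]**2.
    """
    phi_phi_t = (masks ** 2).sum(axis=0)
    residual = y - forward(x, masks)
    return x + adjoint(residual / (phi_phi_t + eps), masks)


def placeholder_prior(x, masks):
    """Hypothetical stand-in for the learned prior stage (NOT the paper's U-net).

    Entries that were not sampled (mask == 0) are replaced by a 3-tap
    temporal average; sampled entries are left unchanged.
    """
    padded = np.concatenate([x[:1], x, x[-1:]], axis=0)  # replicate end frames
    smoothed = (padded[:-2] + padded[1:-1] + padded[2:]) / 3.0
    return masks * x + (1.0 - masks) * smoothed


def reconstruct(y, masks, num_stages=10, eps=1e-8):
    """Alternate the prior stage and the projection for a fixed number of stages."""
    x = adjoint(y / ((masks ** 2).sum(axis=0) + eps), masks)  # simple initialisation
    for _ in range(num_stages):
        x = placeholder_prior(x, masks)
        x = projection_step(x, y, masks, eps)  # end each stage measurement-consistent
    return x


# Toy example with random binary masks (sizes chosen arbitrarily).
rng = np.random.default_rng(0)
num_frames, height, width = 8, 16, 16
video = rng.random((num_frames, height, width))
masks = (rng.random((num_frames, height, width)) > 0.5).astype(np.float64)

y = forward(video, masks)        # one 2-D measurement for all 8 frames
x_hat = reconstruct(y, masks)    # 8 reconstructed frames

print("measurement shape:", y.shape)
print("reconstruction shape:", x_hat.shape)
print("max measurement residual:", np.abs(forward(x_hat, masks) - y).max())
```

Because `masks` is an argument rather than a baked-in constant, the same loop runs for any mask pattern or number of frames. A learned network only has this property if it is built and trained for it, which is the gap the sampling-prior strategy described above is meant to close.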