Abstract: Accurate alignment is crucial for video denoising, but estimating alignment in noisy environments is challenging. This paper introduces a cascading refinement video denoising method that refines alignment and restores images simultaneously. Better alignment enables restoration of more detailed information in each frame, and better image quality in turn leads to better alignment. The method achieves state-of-the-art (SOTA) performance by a large margin on the CRVD dataset. To deal with multi-level noise, an uncertainty map is created after each iteration, which avoids redundant computation on easily restored videos. With this mechanism, the overall computation is reduced by 25% on average.

What problem does this paper attempt to address?

This paper attempts to solve two key problems in video denoising:

1. **Accurate Alignment**:
   - In video frames shot under low-light conditions, severe noise will inevitably occur, which not only affects the video quality but also affects subsequent video analysis tasks, such as visual odometry or tracking algorithms. These tasks are very sensitive to noise.
   - Video denoising algorithms need to estimate the alignment between frames and use the redundant information in unaligned video frames for restoration. However, it is very challenging to estimate accurate alignment in a noisy environment. Incorrect alignment will lead to artifacts in the recovered frames, and the alignment estimation error is likely to propagate from one frame to another, increasing the adverse effects.
   - Small or fast-moving objects, changing lighting conditions and occlusion will all seriously affect the alignment performance.
2. **Multi-level Noise Handling**:
   - Another challenge in video denoising is how to handle multi-level noise without prior noise information. The noise level depends on many factors, such as ISO, luminance, temperature, etc., and is usually unknown in practical applications. Even in different regions of the same image, the noise level may be different.
   - Existing deep-learning denoising methods either require a single noise level as input or rely on noise information as input. Some methods handle multi-level noise by adding additional network structures, but these modules are redundant for images with mild noise.

### Solutions Proposed in the Paper

To solve the above problems, this paper proposes a **Cascading Refinement Video Denoising with Uncertainty Adaptivity** method, which specifically includes the following three main parts:

1. **Pre-denoising and Patch Matching**:
   - More accurate initial alignment estimates are obtained through pre-denoising, making subsequent iterative refinements easier.
   - Normalized Cross-Correlation (NCC) is used for patch matching to find the corresponding patches for each specific patch in the support frames.
2. **Cascading Refinement**:
   - An iterative refinement structure is adopted to estimate the alignment and restore the image simultaneously. After each iteration, the image quality and alignment accuracy will gradually improve.
   - Flow-guided Deformable Convolution is used to provide offset diversity, thereby improving the denoising performance.
   - The feature map of the reference frame is fused with the feature map of the aligned support frame as the input for the next iteration.
3. **Uncertainty Adaptivity**:
   - An uncertainty map is generated after each iteration to estimate the error between the recovered frame and the true value.
   - Whether to continue the iteration is determined according to the magnitude of the uncertainty. If the uncertainty is below a certain threshold, the current iteration result is directly output; otherwise, the refinement iteration is continued.
   - This method can reduce the computational cost, with an average reduction of 25% in the amount of computation.

Through these innovations, this method has achieved significantly better performance than existing methods on the CRVD dataset and can flexibly adapt to videos with different noise levels.
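The NCC-based patch matching mentioned in the summary can be illustrated with a minimal, pure-Python sketch. This is not the paper's implementation: the function names (`ncc`, `crop`, `match_patch`), the exhaustive search window, and the use of zero-mean NCC on nested lists are all assumptions made for illustration.

```python
import math


def ncc(patch_a, patch_b, eps=1e-8):
    """Zero-mean normalized cross-correlation of two equally sized 2D patches."""
    a = [v for row in patch_a for v in row]
    b = [v for row in patch_b for v in row]
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    da = [v - mean_a for v in a]
    db = [v - mean_b for v in b]
    num = sum(x * y for x, y in zip(da, db))
    den = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    # eps keeps the score finite (and equal to 0) for flat patches.
    return num / (den + eps)


def crop(frame, top, left, size):
    """Return the size x size patch whose top-left corner is (top, left)."""
    return [row[left:left + size] for row in frame[top:top + size]]


def match_patch(ref_frame, sup_frame, top, left, size, radius):
    """Search sup_frame within `radius` pixels of (top, left) for the patch
    that best matches the reference patch under NCC (hypothetical helper)."""
    ref_patch = crop(ref_frame, top, left, size)
    height, width = len(sup_frame), len(sup_frame[0])
    best_score, best_pos = -math.inf, (top, left)
    for y in range(max(0, top - radius), min(height - size, top + radius) + 1):
        for x in range(max(0, left - radius), min(width - size, left + radius) + 1):
            score = ncc(ref_patch, crop(sup_frame, y, x, size))
            if score > best_score:
                best_score, best_pos = score, (y, x)
    return best_pos, best_score
```

Because NCC subtracts the patch mean and divides by the patch energy, the score is insensitive to brightness offsets and gain changes between frames, which is one plausible reason to prefer it over a plain sum of squared differences when lighting varies.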
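The uncertainty-driven stopping rule described above can be sketched as an early-exit loop. The sketch below is only an assumption-laden illustration: `refine_step` is a hypothetical callable standing in for one learned refinement stage, the decision uses the mean of the uncertainty map (the summary only says "magnitude"), and `toy_refine_step` is a simple smoothing stand-in rather than anything from the paper.

```python
def cascade_denoise(noisy, refine_step, max_iters=4, threshold=0.05):
    """Run refinement stages until the mean predicted uncertainty drops
    below `threshold` or `max_iters` stages have been executed.

    refine_step: hypothetical callable (estimate, noisy) -> (new_estimate, uncertainty_map)
    """
    estimate = list(noisy)
    iters_used = 0
    for _ in range(max_iters):
        estimate, uncertainty = refine_step(estimate, noisy)
        iters_used += 1
        # Assumed decision rule: compare the mean uncertainty with a threshold.
        if sum(uncertainty) / len(uncertainty) < threshold:
            break  # Easy input: skip the remaining stages.
    return estimate, iters_used


def toy_refine_step(estimate, noisy):
    """Stand-in for a learned stage on a 1D signal: 3-tap smoothing, with the
    size of the update reported as the per-sample uncertainty."""
    n = len(estimate)
    smoothed = []
    for i in range(n):
        left = estimate[max(i - 1, 0)]
        right = estimate[min(i + 1, n - 1)]
        smoothed.append((left + estimate[i] + right) / 3.0)
    uncertainty = [abs(s - e) for s, e in zip(smoothed, estimate)]
    return smoothed, uncertainty
```

With this toy stage, a flat input such as `[0.5] * 8` exits after a single iteration, while a strongly oscillating input such as `[0, 1, 0, 1, 0, 1, 0, 1]` needs more than one. That mirrors how the reported compute savings would arise: easy inputs stop early, hard inputs use the full cascade.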

Cascading Refinement Video Denoising with Uncertainty Adaptivity

Adaptive fuzzy filter algorithm for real-time video denoising

Spatial-Adaptive Network for Single Image Denoising

Frequency-Relevant Residual Learning for Multi-Modal Image Denoising.

Video super-resolution with phase-aided deformable alignment network

Denoising Adversarial Networks for Rain Removal and Reflection Removal.

Learning an Occlusion-Aware Network for Video Deblurring

Adversarial Monte Carlo denoising with conditioned auxiliary feature modulation

Unsupervised Coordinate-Based Video Denoising

A Multi-scale Video Denoising Algorithm for Raw Image

A new video denoising method using texture metric and adaptive structure variance

Combining Pre- and Post-Demosaicking Noise Removal for RAW Video

First image then video: A two-stage network for spatiotemporal video denoising

RViDeformer: Efficient Raw Video Denoising Transformer with a Larger Benchmark Dataset

Supervised Raw Video Denoising With a Benchmark Dataset on Dynamic Scenes

Learning Model-Blind Temporal Denoisers without Ground Truths

Towards Real-World Video Denosing: A Practical Video Denosing Dataset and Network

A 2D image 3D reconstruction function adaptive denoising algorithm

Joint Non-Gaussian Denoising and Superresolving of Raw High Frame Rate Videos

Real-time Streaming Video Denoising with Bidirectional Buffers

Learning Spatial and Spatio-Temporal Pixel Aggregations for Image and Video Denoising