Adapting MIMO video restoration networks to low latency constraints

Valéry Dewil, Zhe Zheng, Arnaud Barral, Lara Raad, Nao Nicolas, Ioannis Cassagne, Jean-Michel Morel, Gabriele Facciolo, Bruno Galerne, Pablo Arias
2024-08-22
Abstract: MIMO (multiple input, multiple output) approaches are a recent trend in neural network architectures for video restoration problems, where each network evaluation produces multiple output frames. The video is split into non-overlapping stacks of frames that are processed independently, resulting in a very appealing trade-off between output quality and computational cost. In this work we focus on the low-latency setting by limiting the number of available future frames. We find that MIMO architectures suffer from problems that have received little attention so far, namely (1) the performance drops significantly due to the reduced temporal receptive field, particularly for frames at the borders of the stack, (2) there are strong temporal discontinuities at stack transitions which induce a step-wise motion artifact. We propose two simple solutions to alleviate these problems: recurrence across MIMO stacks to boost the output quality by implicitly increasing the temporal receptive field, and overlapping of the output stacks to smooth the temporal discontinuity at stack transitions. These modifications can be applied to any MIMO architecture. We test them on three state-of-the-art video denoising networks with different computational costs. The proposed contributions result in a new state-of-the-art for low-latency networks, both in terms of reconstruction error and temporal consistency. As an additional contribution, we introduce a new benchmark consisting of drone footage that highlights temporal consistency issues that are not apparent in the standard benchmarks.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
This paper addresses two main problems of MIMO (Multiple-Input Multiple-Output) networks in low-latency video restoration:

1. **Performance degradation within the stack**: Because the temporal receptive field is reduced, performance degrades significantly, especially for the frames at the borders of the stack. This produces a characteristic modified sine waveform in the per-frame PSNR (Peak Signal-to-Noise Ratio) plot.
2. **Temporal discontinuity at stack transitions**: MIMO networks are highly temporally consistent within a stack but show strong temporal discontinuities at stack transitions, which cause staircase-like motion artifacts. This effect may be masked in standard benchmarks but is particularly evident in stabilized videos.

To solve these problems, the authors propose two methods:

- **Recurrence Across Stacks (RAS)**: Increases the temporal receptive field through a recurrence mechanism, which helps especially for the first few frames of each stack.
- **Output Stack Overlapping (OSO)**: Smooths the temporal discontinuity at stack transitions by making the output stacks overlap, which also further improves the output quality.

These methods can be applied to any MIMO architecture and have been verified on three state-of-the-art video denoising networks: M2Mnet, BasicVSR++, and ReMoNet. The experimental results show that these modifications improve both reconstruction error and temporal consistency, and that they also perform better on the drone video dataset.

In addition, the authors introduce a new benchmark dataset containing 14 stabilized videos shot by drones to highlight the temporal consistency problem. This dataset helps evaluate problems that do not fully show up in existing benchmarks.

In summary, the main contributions of this paper are:

- Identifying two key problems of MIMO networks in low-latency settings and proposing solutions to them.
- Demonstrating the effectiveness of the proposed methods through experiments and setting a new state of the art for low-latency applications.
- Introducing a new evaluation dataset that better reflects the temporal consistency challenges of practical application scenarios.
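
To make the two ideas concrete, here is a minimal NumPy sketch of stack-wise processing with optional recurrence and output overlap. It is an illustration under stated assumptions, not the authors' implementation: `toy_mimo_net` is a hypothetical stand-in for a real MIMO network, the recurrent state is just a running temporal average, and overlapping outputs are blended by uniform averaging (the paper may use a different blending rule). All function and parameter names are invented for this example.

```python
import numpy as np

def toy_mimo_net(stack, state=None):
    """Hypothetical stand-in for a MIMO network (NOT a real denoiser).

    Maps a stack of frames (T, H, W) to an output stack of the same shape
    and returns a state that can be passed on to the next stack.
    """
    aggregate = stack.mean(axis=0)            # temporal aggregate of this stack
    if state is not None:
        # Recurrence across stacks: mix in information from earlier stacks,
        # which implicitly enlarges the temporal receptive field.
        aggregate = 0.5 * aggregate + 0.5 * state
    out = 0.5 * stack + 0.5 * aggregate       # pull each frame toward the aggregate
    return out, aggregate

def stack_starts(num_frames, stack_size, overlap):
    """Start indices of stacks that advance by stack_size - overlap frames."""
    if not 0 <= overlap < stack_size <= num_frames:
        raise ValueError("need 0 <= overlap < stack_size <= num_frames")
    stride = stack_size - overlap
    starts = list(range(0, num_frames - stack_size + 1, stride))
    if starts[-1] + stack_size < num_frames:  # add a final stack to cover the tail
        starts.append(num_frames - stack_size)
    return starts

def restore(video, stack_size=4, overlap=0, recurrent=False):
    """Process a video (N, H, W) stack by stack.

    overlap > 0 makes consecutive output stacks share frames; shared frames
    are averaged (an assumed blending rule). recurrent=True carries the
    network state from one stack to the next.
    """
    num_frames = video.shape[0]
    acc = np.zeros_like(video, dtype=np.float64)
    weight = np.zeros(num_frames)
    state = None
    for s in stack_starts(num_frames, stack_size, overlap):
        out, new_state = toy_mimo_net(video[s:s + stack_size], state)
        if recurrent:
            state = new_state
        acc[s:s + stack_size] += out
        weight[s:s + stack_size] += 1.0
    return acc / weight[:, None, None]

def max_frame_jump(video):
    """Largest change of the mean intensity between consecutive frames."""
    return np.abs(np.diff(video.mean(axis=(1, 2)))).max()

# A linear intensity ramp mimics steady motion: the ideal output changes by
# the same amount between every pair of consecutive frames.
ramp = np.arange(8, dtype=np.float64).reshape(8, 1, 1)
print("max jump, non-overlapping stacks:", max_frame_jump(restore(ramp, 4, 0)))
print("max jump, overlapping stacks:    ", max_frame_jump(restore(ramp, 4, 2)))
```

On this toy ramp, the non-overlapping variant compresses the motion within each stack and concentrates the change at the stack transition, which is the staircase pattern described above; overlapping the output stacks spreads that change over several frames. The numbers it prints characterize only the toy model and say nothing about the results reported in the paper.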