Detail-Enhanced Video-Based Super-Resolution Networks

Jidong Wang, Wei Chen, Zhuhua Hu
DOI: https://doi.org/10.1109/ecnct63103.2024.10704266
2024-01-01
Abstract: Adequate exploration of temporal and spatial information is key to the success of learning-based video super-resolution (VSR), and it is usually realized by recursive propagation. Because video content is complex, the features extracted by a deep network are to some extent “rough” or “fuzzy”. Such “rough” features not only persist in subsequent propagation but are also superimposed on neighboring propagation branches, degrading the quality of the reconstructed frames. To minimize the effect of this “roughness” on subsequent propagation, we perform feature extraction and enhancement with a detail-enhanced feature extraction module. Specifically, each frame in the video sequence is first fed into a convolutional layer for initial feature extraction. The collected a priori information is then fed into a residual network and a difference network for enhancement and generalization, respectively, to remove ambiguities and recover the details of the inputs. Finally, the features are merged to complete the deep feature extraction. Extensive experiments demonstrate the effectiveness of our network structure, with significant performance gains on the REDS dataset, including a PSNR improvement of 0.12 dB over the BasicVSR++ network.
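To put the reported 0.12 dB PSNR gain in context, the short sketch below uses the standard definition PSNR = 10 * log10(MAX^2 / MSE) to convert a PSNR difference into the corresponding change in mean squared error. It is an illustration only and not part of the paper: the function names and the baseline MSE value are hypothetical, and an 8-bit peak value of 255 is assumed.

```python
import math


def psnr(mse: float, max_value: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB for a given mean squared error."""
    if mse <= 0:
        raise ValueError("mse must be positive")
    return 10.0 * math.log10(max_value ** 2 / mse)


def mse_ratio_for_gain(gain_db: float) -> float:
    """Factor by which the MSE is multiplied when PSNR rises by gain_db."""
    return 10.0 ** (-gain_db / 10.0)


# MSE factor implied by the 0.12 dB gain quoted in the abstract.
ratio = mse_ratio_for_gain(0.12)

# Hypothetical baseline MSE, chosen only to demonstrate the round trip.
baseline_mse = 40.0
improved_mse = baseline_mse * ratio
gain = psnr(improved_mse) - psnr(baseline_mse)

print(f"MSE ratio: {ratio:.4f}")
print(f"PSNR gain: {gain:.2f} dB")
```

Under this definition, a 0.12 dB gain corresponds to roughly a 2.7 percent reduction in MSE, regardless of the baseline value or the peak value used.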