FFFN: Frame-By-Frame Feedback Fusion Network for Video Super-Resolution

Jian Zhu, Qingwu Zhang, Lunke Fei, Ruichu Cai, Yuan Xie, Bin Sheng, Xiaokang Yang
DOI: https://doi.org/10.1109/tmm.2022.3214776
IF: 7.3
2022-01-01
IEEE Transactions on Multimedia
Abstract: Video super-resolution (VSR) is a fundamental and challenging task in computer vision. Many existing VSR works focus on aligning neighboring frames effectively so as to better incorporate temporal information. Far less work is devoted to the important subsequent step of inter-frame information fusion, and existing frame-fusion methods have shortcomings, such as failing to make full use of spatio-temporal information. In this work, we propose a Frame-by-frame Feedback Fusion Network (FFFN) for VSR. By applying the feedback learning mechanism common in the human cognitive system to the frame fusion stage, FFFN refines the low-level representation of the fused frames with high-level information in a coarse-to-fine manner. Specifically, after the neighboring frames are aligned, we first reorder them from near to far according to their temporal distance from the reference frame. We then feed them one by one into a proposed recurrent structure, the Feedback Fusion Module (FFM), which iteratively generates a high-level representation of the fused frames using several Feature Refinement Groups (FRGs) and feedback connections. Finally, we design a Dual-path Residual Reconstruction Module (DRRM) to reconstruct the final high-resolution image. FFFN offers strong frame fusion and reconstruction ability, and extensive experiments on several benchmark datasets show that it performs favorably against state-of-the-art methods.
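The near-to-far ordering and one-by-one fusion described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `order_near_to_far`, `feedback_fuse`, and `toy_fuse_step` are hypothetical, the tie-breaking rule for equally distant frames (earlier frame first) is an assumption the abstract does not specify, and the averaging step is only a stand-in for the learned FFM with its FRGs and feedback connections.

```python
def order_near_to_far(num_frames, ref_index):
    """Return neighbor frame indices sorted by temporal distance from the reference.

    Assumption: frames at equal distance (e.g. t-1 and t+1) are ordered
    earlier-frame-first; the abstract does not state a tie-breaking rule.
    """
    neighbors = [i for i in range(num_frames) if i != ref_index]
    return sorted(neighbors, key=lambda i: (abs(i - ref_index), i))


def feedback_fuse(ref_feat, aligned_feats, order, fuse_step):
    """Feed aligned neighbors one at a time, carrying a fused state forward.

    The state produced at one step is fed back as input to the next step,
    which is the recurrent pattern the abstract describes at a high level.
    """
    state = ref_feat
    for i in order:
        state = fuse_step(state, aligned_feats[i])
    return state


def toy_fuse_step(state, neighbor_feat):
    # Placeholder for a learned fusion module: element-wise average.
    return [(s + n) / 2.0 for s, n in zip(state, neighbor_feat)]


if __name__ == "__main__":
    # Seven frames with the reference in the middle (index 3).
    order = order_near_to_far(7, 3)
    print(order)

    # Toy one-element "features", one per frame.
    feats = {i: [float(i)] for i in range(7)}
    print(feedback_fuse(feats[3], feats, order, toy_fuse_step))
```

In this sketch the nearest neighbors are fused first and the farthest last; in a real network, `toy_fuse_step` would be replaced by a trainable module operating on feature maps.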