Multi-Frequency Representation Enhancement with Privilege Information for Video Super-Resolution

Fei Li, Linfeng Zhang, Zikun Liu, Juan Lei, Zhenbo Li
DOI: https://doi.org/10.1109/iccv51070.2023.01177
2023-01-01
Abstract: The limited receptive field of CNNs restricts their ability to capture long-range spatial-temporal dependencies, leading to unsatisfactory performance in video super-resolution (VSR). To tackle this challenge, this paper presents a novel multi-frequency representation enhancement module (MFE) that performs spatial-temporal information aggregation in the frequency domain. Specifically, MFE mainly comprises a spatial-frequency representation enhancement branch, which captures long-range dependencies in the spatial dimension, and an energy frequency representation enhancement branch, which captures inter-channel feature relationships. Moreover, a novel training method named privilege training is proposed, which encodes privilege information from high-resolution videos to facilitate model training. Combining these two methods, we introduce a new VSR model named MFPI, which outperforms state-of-the-art methods by a large margin while maintaining good efficiency on various datasets, including REDS4, Vimeo, Vid4, and UDM10.
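
As a rough illustration of why aggregating information in the frequency domain can sidestep a limited receptive field, the NumPy sketch below multiplies a feature map's 2-D spectrum by per-frequency weights. By the convolution theorem this is equivalent to a circular convolution with a kernel as large as the feature map, so changing a single input pixel affects every output pixel. This is not the MFE module from the paper; the function name `frequency_domain_mix`, the random `kernel`, and the 8x8 size are assumptions made purely for illustration.

```python
import numpy as np

def frequency_domain_mix(feature, freq_weight):
    """Weight each 2-D frequency of `feature` and transform back.

    Hypothetical helper for illustration only, not the paper's MFE.
    """
    spectrum = np.fft.fft2(feature)
    # Pointwise product in frequency == circular convolution in space.
    return np.real(np.fft.ifft2(spectrum * freq_weight))

rng = np.random.default_rng(0)
h, w = 8, 8  # assumed toy feature-map size
feature = rng.standard_normal((h, w))

# Assumed stand-in for learned weights: the spectrum of a kernel
# that spans the whole feature map.
kernel = rng.standard_normal((h, w))
freq_weight = np.fft.fft2(kernel)

out = frequency_domain_mix(feature, freq_weight)

# Perturb a single input pixel and measure how the output changes.
perturbed = feature.copy()
perturbed[0, 0] += 1.0
delta = frequency_domain_mix(perturbed, freq_weight) - out

# A unit impulse at (0, 0) circularly convolved with `kernel` is `kernel`
# itself, so the change spreads over the entire map.
print(np.allclose(delta, kernel))
```

A spatial 3x3 convolution would leave all but a 3x3 neighbourhood of `delta` at zero; here the response covers the full map after one operation, which is the long-range property the abstract attributes to frequency-domain aggregation.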