Feature Aggregating Network with Inter-Frame Interaction for Efficient Video Super-Resolution
Yawei Li, Zhao Zhang, Suiyi Zhao, Jicong Fan, Haijun Zhang, Mingliang Xu
DOI: https://doi.org/10.1109/icdm58522.2023.00042
2023-01-01
Abstract: Video super-resolution (VSR) on mobile devices aims to restore high-resolution frames from their low-resolution counterparts while meeting requirements on accuracy, FLOPs, and latency. On one hand, partial feature processing, a classic and widely accepted strategy, is used in current studies to reach an appropriate trade-off between FLOPs and accuracy. However, the feature splitting in this strategy is usually performed blindly, which reduces computational efficiency and limits performance gains. On the other hand, current methods for mobile platforms primarily treat VSR as an extension of single-image super-resolution in order to reduce computation and inference latency. Because these methods lack inter-frame information interaction, they reach a suboptimal trade-off between latency and accuracy. To this end, we propose a novel architecture, termed Feature Aggregating Network with Inter-frame Interaction (FANI), a lightweight VSR network that still accounts for frame-wise correlation and can achieve real-time inference while maintaining superior performance. FANI takes adjacent low-resolution frames as input and mainly consists of several fully-connection-embedded modules, i.e., Multi-stage Partial Feature Distillation (MPFD) modules, for capturing multi-level feature representations. Moreover, considering the importance of inter-frame alignment, we further employ a tiny Attention-based Frame Alignment (AFA) module to promote inter-frame information flow and aggregation efficiently. Extensive experiments on a well-known dataset and on a real-world mobile device demonstrate the superiority of the proposed FANI, indicating that it is well suited to mobile devices and produces visually pleasing results.
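The abstract names two ideas that a small example may make more concrete: partial feature processing (retain part of the channels, refine the rest) and attention-weighted aggregation of adjacent frames. The following is a minimal sketch of those generic ideas only, not the authors' MPFD or AFA modules, whose details are not given in the abstract. All function names, the fixed `keep_ratio` split (the kind of "blind" split the abstract criticizes), the 1x1-style channel mixing, and the dot-product similarity are assumptions made for illustration. Feature maps are plain lists: a frame is a list of channels, and each channel is a flat list of pixel values.

```python
import math


def split_channels(features, keep_ratio=0.5):
    """Split channel maps into a retained part and a part passed on for refinement."""
    k = int(len(features) * keep_ratio)
    return features[:k], features[k:]


def pointwise_mix(channels, weights):
    """1x1-convolution-like channel mixing followed by ReLU.

    out[o][p] = max(0, sum_c weights[o][c] * channels[c][p])
    """
    n_pixels = len(channels[0])
    return [
        [max(0.0, sum(w * ch[p] for w, ch in zip(row, channels)))
         for p in range(n_pixels)]
        for row in weights
    ]


def partial_distillation(features, stage_weights, keep_ratio=0.5):
    """Hypothetical multi-stage partial processing.

    At each stage a slice of channels is retained unchanged and the remainder
    is refined; all retained slices plus the final remainder are concatenated.
    """
    retained, current = [], features
    for weights in stage_weights:
        kept, rest = split_channels(current, keep_ratio)
        retained.extend(kept)
        current = pointwise_mix(rest, weights)
    return retained + current


def attention_aggregate(frames, ref_index):
    """Fuse T frames with per-pixel softmax weights.

    The weight of each frame at a pixel comes from the scaled dot product
    between its channel vector and that of the reference frame.
    """
    ref = frames[ref_index]
    n_channels, n_pixels = len(ref), len(ref[0])
    fused = [[0.0] * n_pixels for _ in range(n_channels)]
    for p in range(n_pixels):
        scores = [
            sum(f[c][p] * ref[c][p] for c in range(n_channels)) / math.sqrt(n_channels)
            for f in frames
        ]
        top = max(scores)  # subtract the maximum for numerical stability
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        for f, e in zip(frames, exps):
            for c in range(n_channels):
                fused[c][p] += (e / total) * f[c][p]
    return fused
```

With `keep_ratio=0.5`, each stage of `partial_distillation` processes only half of the channels it receives, which is where the FLOPs saving of partial feature processing comes from. `attention_aggregate` lets every output pixel draw on all input frames, a simple stand-in for the inter-frame interaction that single-image approaches lack.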