Event-Adapted Video Super-Resolution

Zeyu Xiao,Dachun Kai,Yueyi Zhang,Zheng-Jun Zha,Xiaoyan Sun,Zhiwei Xiong
DOI: https://doi.org/10.1007/978-3-031-72946-1_13
2024-01-01
Abstract:Introducing event cameras into video super-resolution (VSR) shows great promise. In practice, however, integrating event data as a new modality necessitates a laborious model architecture design. This not only consumes substantial time and effort but also disregards valuable insights from successful existing VSR models. Furthermore, the resource-intensive process of retraining these newly designed structures exacerbates the challenge. In this paper, inspired by recent success of parameter-efficient tuning in reducing the number of trainable parameters of a pre-trained model for downstream tasks, we introduce the Event AdapTER (EATER) for VSR. EATER efficiently utilizes pre-trained VSR model knowledge at the feature level through two lightweight and trainable components: the event-adapted alignment (EAA) unit and the event-adapted fusion (EAF) unit. The EAA unit aligns multiple frames based on the event stream in a coarse-to-fine manner, while the EAF unit efficiently fuses frames with the event stream through a multi-scaled design. Thanks to both units, EATER outperforms the full fine-tuning paradigm. Comprehensive experiments demonstrate the effectiveness of EATER, achieving superior results with parameter efficiency.
What problem does this paper attempt to address?