Abstract: This is a revised version of the authors' manuscript, which proposes a lightweight video frame interpolation (VFI) network based on multiple lightweight convolutional units and a three‐scale encoding‐decoding structure. The proposed model learns features adaptively to ensure effective inference of motion information. Moreover, the authors design a lightweight convolutional unit, S_RRCU, to reduce the number of model parameters. VFI is a technique that synthesises intermediate frames between adjacent original video frames to increase the temporal resolution of the video. However, existing methods usually rely on heavy model architectures with a large number of parameters. The authors introduce an efficient VFI network based on multiple lightweight convolutional units and a local three‐scale encoding (LTSE) structure. Firstly, the authors introduce an LTSE structure with two‐level attention cascades. This design is tailored to capture details and contextual information efficiently across diverse scales in images. Secondly, the authors introduce recurrent convolutional layers (RCL) and residual operations, designing the recurrent residual convolutional unit to optimise the LTSE structure. Additionally, a lightweight convolutional unit named the separable recurrent residual convolutional unit (S_RRCU) is introduced to reduce the model parameters. Finally, the authors obtain three‐scale decoding features from the decoder and warp them to produce a set of three‐scale pre‐warped maps, which are fused in the synthesis network to generate high‐quality interpolated frames. The experimental results indicate that the proposed approach achieves superior performance with fewer model parameters.
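To illustrate why a separable convolutional unit can reduce model parameters, the following minimal sketch compares the parameter count of a standard convolution with that of a depthwise separable convolution. It assumes that "separable" in S_RRCU refers to depthwise separable convolution, which the abstract does not state explicitly; the function names, channel widths, and kernel size are hypothetical and are not taken from the paper.

```python
def conv_params(c_in, c_out, k, bias=True):
    """Parameter count of a standard k x k convolution layer."""
    return k * k * c_in * c_out + (c_out if bias else 0)


def separable_conv_params(c_in, c_out, k, bias=True):
    """Parameter count of a depthwise separable convolution.

    Assumed structure (not confirmed by the abstract): a k x k depthwise
    convolution with one filter per input channel, followed by a 1 x 1
    pointwise convolution that mixes channels.
    """
    depthwise = k * k * c_in + (c_in if bias else 0)
    pointwise = c_in * c_out + (c_out if bias else 0)
    return depthwise + pointwise


# Hypothetical layer sizes chosen only for illustration.
standard = conv_params(64, 64, 3)
separable = separable_conv_params(64, 64, 3)

print(f"standard 3x3 conv:  {standard} parameters")
print(f"separable 3x3 conv: {separable} parameters")
print(f"reduction factor:   {standard / separable:.2f}x")
```

For this illustrative 64-channel layer, the separable variant uses 4,800 parameters against 36,928 for the standard convolution. The saving in the actual S_RRCU depends on its real layer configuration, which the abstract does not give.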

Enhanced spatial-temporal freedom for video frame interpolation

Spatio-Temporal Deformable Convolution for Compressed Video Quality Enhancement

Multiframe Interpolation for Video Using Phase Features

Enhancing Deformable Convolution based Video Frame Interpolation with Coarse-to-fine 3D CNN

LADDER: An Efficient Framework for Video Frame Interpolation

Video Frame Interpolation with Densely Queried Bilateral Correlation

Dynamic Frame Interpolation in Wavelet Domain

Video frame interpolation via spatial multi‐scale modelling

Video Frame Interpolation without Temporal Priors

H-VFI: Hierarchical Frame Interpolation for Videos with Large Motions

Perception-Oriented Video Frame Interpolation via Asymmetric Blending

Boost Video Frame Interpolation via Motion Adaptation

Arbitrary Timestep Video Frame Interpolation with Time-Dependent Decoding

Video Frame Interpolation Based on Deformable Kernel Region

Motion-Aware Video Frame Interpolation

Event-based Video Frame Interpolation with Edge Guided Motion Refinement

Dynamic Video Frame Interpolation with integrated Difficulty Pre-Assessment

Video Frame Interpolation: A Comprehensive Survey

Fine-Grained Motion Estimation for Video Frame Interpolation

IDO-VFI: Identifying Dynamics via Optical Flow Guidance for Video Frame Interpolation with Events

Multiple Video Frame Interpolation via Enhanced Deformable Separable Convolution