MobileNVC: Real-time 1080p Neural Video Compression on a Mobile Device

Ties van Rozendaal, Tushar Singhal, Hoang Le, Guillaume Sautiere, Amir Said, Krishna Buska, Anjuman Raha, Dimitris Kalatzis, Hitarth Mehta, Frank Mayer, Liang Zhang, Markus Nagel, Auke Wiggers
2023-11-15
Abstract: Neural video codecs have recently become competitive with standard codecs such as HEVC in the low-delay setting. However, most neural codecs are large floating-point networks that use pixel-dense warping operations for temporal modeling, making them too computationally expensive for deployment on mobile devices. Recent work has demonstrated that running a neural decoder in real time on mobile is feasible, but shows this only for 720p RGB video. This work presents the first neural video codec that decodes 1080p YUV420 video in real time on a mobile device. Our codec relies on two major contributions. First, we design an efficient codec that uses a block-based motion compensation algorithm available on the warping core of the mobile accelerator, and we show how to quantize this model to integer precision. Second, we implement a fast decoder pipeline that concurrently runs neural network components on the neural signal processor, parallel entropy coding on the mobile GPU, and warping on the warping core. Our codec outperforms the previous on-device codec by a large margin with up to 48% BD-rate savings, while reducing the MAC count on the receiver side by $10 \times$. We perform a careful ablation to demonstrate the effect of the introduced motion compensation scheme, and ablate the effect of model quantization.
Image and Video Processing, Computer Vision and Pattern Recognition, Machine Learning
What problem does this paper attempt to address?
This paper addresses real-time neural video compression on mobile devices. Existing neural video codecs are competitive with standard codecs such as HEVC in the low-delay setting, but most are large floating-point networks that rely on pixel-dense warping operations for temporal modeling, which makes them too computationally expensive to deploy on mobile devices. Recent work has shown that running a neural decoder in real time on mobile is feasible, but only for 720p RGB video. The paper proposes MobileNVC, a neural video codec that decodes 1080p YUV420 video in real time on a mobile device.

### Main Contributions

1. **Efficient Codec Design**: The codec uses a block-based motion compensation algorithm available on the warping core of the mobile accelerator, and the authors show how to quantize the model to integer precision.
2. **Fast Decoder Implementation**: The decoder pipeline concurrently runs the neural network components on the neural signal processor, parallel entropy coding on the mobile GPU, and warping on the warping core.
3. **Performance Improvement**: Compared with the previous on-device codec, MobileNVC achieves up to 48% BD-rate savings while reducing the receiver-side MAC count by $10 \times$.

### Key Technologies

- **Block-Based Motion Compensation**: Compared with pixel-dense warping, block-based motion compensation can be implemented more efficiently on mobile hardware and reduces the amount of computation.
- **Quantization**: Weights and activations are quantized to 8-bit integers, which further improves inference efficiency. For the mean parameters in particular, a low-precision quantization scheme is proposed that avoids performance loss.
- **Parallel Entropy Coding**: Entropy coding is implemented on the GPU, which greatly increases parallelism and therefore throughput.

### Experimental Results

- **Compression Performance**: MobileNVC achieves BD-rate savings of 45% and 48% relative to MobileCodec on the HEVC-B and MCL-JCV datasets, respectively.
- **Computational Complexity**: The receiver-side complexity of MobileNVC is only 24.5 kMACs/pixel, much lower than that of other neural codecs.
- **Inference Speed**: On a mobile device with a Snapdragon 8 Gen 2 chip, the average receiver-side inference speed of MobileNVC reaches 38.9 FPS.

In conclusion, by combining block-based motion compensation, integer quantization, and a concurrent decoder pipeline, the paper achieves real-time 1080p neural video decoding on a mobile device while improving compression performance over the previous on-device codec.
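
To make the block-based motion compensation and 8-bit quantization ideas from Key Technologies more concrete, here is a minimal pure-Python sketch. It is not the paper's implementation. The 16x16 block size, integer-only motion vectors, border clamping, and symmetric per-tensor quantization are all assumptions chosen for illustration, and the function names `block_motion_compensate`, `quantize_int8`, and `dequantize` are hypothetical.

```python
def block_motion_compensate(ref, motion, block=16):
    """Predict a frame by copying one shifted block of `ref` per motion vector.

    ref:    list of H rows, each a list of W samples (H, W multiples of `block`).
    motion: motion[by][bx] = (dy, dx), an integer offset for each block.
    The block size of 16 is an assumption, not a value taken from the paper.
    """
    h, w = len(ref), len(ref[0])
    pred = [[0] * w for _ in range(h)]
    for by in range(h // block):
        for bx in range(w // block):
            dy, dx = motion[by][bx]
            # Clamp the source block so it stays inside the reference frame.
            y0 = min(max(by * block + dy, 0), h - block)
            x0 = min(max(bx * block + dx, 0), w - block)
            for r in range(block):
                src = ref[y0 + r][x0:x0 + block]
                pred[by * block + r][bx * block:(bx + 1) * block] = src
    return pred


def quantize_int8(values):
    """Symmetric per-tensor quantization of a non-empty list of floats.

    Returns integers in [-127, 127] plus the scale needed to dequantize.
    This generic scheme is an assumption; the paper's scheme may differ.
    """
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(v / scale))) for v in values]
    return q, scale


def dequantize(q, scale):
    """Map quantized integers back to approximate float values."""
    return [qi * scale for qi in q]


# Usage: a 32x32 frame split into 16x16 blocks, one block shifted by (dy=4, dx=2).
ref = [[y * 32 + x for x in range(32)] for y in range(32)]
motion = [[(4, 2), (0, 0)], [(0, 0), (0, 0)]]
pred = block_motion_compensate(ref, motion)

weights = [0.5, -1.0, 0.25, 0.0]
q, scale = quantize_int8(weights)
approx = dequantize(q, scale)
```

The sketch stores one motion vector per block, so each block is a contiguous copy from the reference frame. Pixel-dense warping needs a separate vector and a separate lookup for every pixel, which is the cost the summary says the block-based scheme avoids. In the quantization half, each dequantized value differs from the original by at most half of `scale`.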