Environmental Condition Aware Super-Resolution Acceleration Framework in Server-Client Hierarchies
Zhuoran Song, Zhongkai Yu, Xinkai Song, Yifan Hao, Li Jiang, Naifeng Jing, Xiaoyao Liang
DOI: https://doi.org/10.1145/3678008
IF: 1.444
2024-01-01
ACM Transactions on Architecture and Code Optimization
Abstract: High-resolution (HR) videos have become widely popular because they promise a better viewing experience. Recent research has demonstrated that video super-resolution (SR) algorithms, powered by deep neural networks (DNNs), can substantially improve the quality of HR videos reconstructed from low-resolution (LR) frames. However, existing DNN models demand significant computational resources, which makes it challenging to deploy SR algorithms on client devices. Numerous accelerators have been proposed, but they focus primarily on client-side optimization. In contrast, we observe that the HR video is originally stored in the cloud server, which presents an untapped opportunity to improve both accuracy and performance. Building on this insight, this paper introduces an end-to-end video CODEC-assisted super-resolution (E2SR+) algorithm, which tightly integrates the cloud server with the client device to deliver a seamless, real-time video viewing experience. We propose a motion vector search algorithm, executed in the cloud server, that searches the motion vectors and residuals for a subset of the HR video frames and packs them as addons. We also design an auto-encoder algorithm that down-samples the residuals to reduce the bitstream cost while preserving the quality of the residuals. Lastly, we propose a reconstruction algorithm, performed in the client, that uses the addons to quickly reconstruct the corresponding HR frames and thereby skip part of the DNN computation. To implement the E2SR+ algorithm, we design a corresponding E2SR+ architecture in the client, which achieves significant speedup with minimal hardware overhead. Because environmental conditions vary across server-client hierarchies, we believe that simply applying E2SR+ to all frames is unreasonable.
Accordingly, we offer an environmental condition aware system that pursues the best performance while adapting to diverse environments. In this system, we design a linear programming (LP) model that models the environment and allocates frames to three existing mechanisms. Our experimental results demonstrate that the E2SR+ algorithm improves the PSNR by 1.2 dB, 2.5 dB, and 2.3 dB compared to the SOTA methods “EDVR”, “BasicVSR”, and “BasicVSR++”, respectively. In terms of performance, the E2SR+ architecture also improves on existing SOTA methods. For instance, while BasicVSR++ requires 98 ms on an NVIDIA V100 GPU to generate a 1280 × 720 HR frame, the E2SR+ architecture reduces the execution time to 39 ms. Overall, the E2SR+ architecture achieves 1.4×, 2.2×, 4.6×, and 442.0× performance improvement compared to ADAS, ISRAcc, the NVIDIA V100 GPU, and the CPU, respectively. Lastly, the proposed system surpasses all the existing mechanisms in terms of execution time under varying environmental conditions.
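To illustrate the idea behind the server-side motion vector search and the client-side reconstruction described in the abstract, the following is a minimal sketch of generic block-based motion compensation. It is not the E2SR+ implementation: the function names (`search_block`, `encode_addons`, `reconstruct`), the exhaustive search, the sum-of-absolute-differences cost, and the tiny block size are all assumptions chosen for illustration, and the residuals are kept uncompressed rather than down-sampled by an auto-encoder.

```python
from typing import List, Tuple

Frame = List[List[int]]                      # grayscale frame as rows of pixels
Addon = Tuple[int, int, Tuple[int, int], Frame]  # (top, left, motion vector, residual)


def get_block(frame: Frame, top: int, left: int, size: int) -> Frame:
    """Copy a size x size block whose top-left corner is (top, left)."""
    return [row[left:left + size] for row in frame[top:top + size]]


def sad(a: Frame, b: Frame) -> int:
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def search_block(ref: Frame, cur: Frame, top: int, left: int,
                 size: int, radius: int) -> Tuple[Tuple[int, int], Frame]:
    """Hypothetical server-side step: exhaustive search within `radius`
    for the reference block that best predicts the current block."""
    height, width = len(ref), len(ref[0])
    target = get_block(cur, top, left, size)
    best = None  # (cost, dy, dx)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            t, l = top + dy, left + dx
            if t < 0 or l < 0 or t + size > height or l + size > width:
                continue  # candidate block falls outside the frame
            cost = sad(get_block(ref, t, l, size), target)
            if best is None or cost < best[0]:
                best = (cost, dy, dx)
    _, dy, dx = best  # (0, 0) is always a valid candidate, so best is set
    pred = get_block(ref, top + dy, left + dx, size)
    residual = [[c - p for c, p in zip(rc, rp)] for rc, rp in zip(target, pred)]
    return (dy, dx), residual


def encode_addons(ref: Frame, cur: Frame, size: int = 2, radius: int = 1) -> List[Addon]:
    """Pack one (motion vector, residual) pair per block.
    Assumes the frame dimensions are multiples of `size`."""
    addons = []
    for top in range(0, len(cur), size):
        for left in range(0, len(cur[0]), size):
            mv, residual = search_block(ref, cur, top, left, size, radius)
            addons.append((top, left, mv, residual))
    return addons


def reconstruct(ref: Frame, addons: List[Addon], size: int = 2) -> Frame:
    """Hypothetical client-side step: rebuild the frame from the reference
    frame plus the addons, with no DNN inference involved."""
    out = [[0] * len(ref[0]) for _ in ref]
    for top, left, (dy, dx), residual in addons:
        pred = get_block(ref, top + dy, left + dx, size)
        for i in range(size):
            for j in range(size):
                out[top + i][left + j] = pred[i][j] + residual[i][j]
    return out
```

Because the residuals in this sketch are stored exactly, `reconstruct(ref, encode_addons(ref, cur))` returns `cur` unchanged. A lossy residual codec, such as the auto-encoder the abstract mentions, would trade some of that exactness for a smaller bitstream.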