ParaLoupe: Real-time Video Analytics on Edge Cluster Via Mini Model Parallelization

Hanling Wang, Qing Li, Haidong Kang, Dieli Hu, Lianbo Ma, Gareth Tyson, Zhenhui Yuan, Yong Jiang
DOI: https://doi.org/10.1109/tmc.2024.3438155
IF: 6.075
2024-01-01
IEEE Transactions on Mobile Computing
Abstract: Real-time video analytics on edge devices has gained increasing attention across a wide range of business areas. However, edge devices usually have limited computing resources. Consequently, conventional approaches to video analytics either deploy simplified models on the edge (resulting in low accuracy) or transmit video content to the cloud (resulting in high latency and network overheads) to enable deep learning inference (e.g., object detection). In this paper, we introduce ParaLoupe, a novel real-time video analytics system that parallelizes deep learning inference in the edge cluster with task-oriented mini models. These mini models do not attain state-of-the-art accuracy individually, but collectively they can achieve a much better accuracy-latency tradeoff than state-of-the-art models. To achieve this, ParaLoupe crops multiple single-object patches from a given video frame. These single-object patches are then sent to multiple edge devices for parallel inference with specifically designed mini models. A patch-based task scheduling algorithm is further proposed to leverage the computing resources of the edge cluster to meet the service-level objectives. Our experimental results on real-world datasets show that ParaLoupe significantly outperforms baseline methods, achieving up to 14.1× inference speedup with accuracy on par with state-of-the-art models, or improving accuracy by up to 45.1% under the same latency constraints.
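The crop-then-distribute pipeline described in the abstract can be illustrated with a minimal Python sketch. This is not the paper's scheduling algorithm, which the abstract does not specify. It swaps in a generic greedy heuristic (longest-processing-time-first onto the least-loaded device) purely for illustration. The names Patch, crop_patches, schedule_patches, est_cost_ms, and slo_ms are all hypothetical, and per-patch cost estimates are assumed to be given.

```python
import heapq
from dataclasses import dataclass


@dataclass
class Patch:
    """A single-object patch (hypothetical structure, not from the paper)."""
    patch_id: int
    box: tuple          # (x, y, width, height) in frame coordinates
    est_cost_ms: float  # assumed estimate of mini-model inference time


def crop_patches(frame, boxes):
    """Crop one patch per box from a frame given as a list of pixel rows."""
    return [[row[x:x + w] for row in frame[y:y + h]] for (x, y, w, h) in boxes]


def schedule_patches(patches, num_devices, slo_ms):
    """Greedy sketch: assign the costliest patches first, each to the
    currently least-loaded device, then check the slowest device
    against the latency objective."""
    # Min-heap of (accumulated load in ms, device index).
    heap = [(0.0, d) for d in range(num_devices)]
    heapq.heapify(heap)
    assignment = {d: [] for d in range(num_devices)}
    for p in sorted(patches, key=lambda p: p.est_cost_ms, reverse=True):
        load, d = heapq.heappop(heap)
        assignment[d].append(p.patch_id)
        heapq.heappush(heap, (load + p.est_cost_ms, d))
    # Devices run in parallel, so frame latency is the largest device load.
    makespan = max(load for load, _ in heap)
    return assignment, makespan, makespan <= slo_ms


# Usage: four patches with assumed costs, two edge devices, 12 ms objective.
patches = [Patch(i, (0, 0, 1, 1), c) for i, c in enumerate([8.0, 5.0, 4.0, 3.0])]
assignment, makespan, meets_slo = schedule_patches(patches, num_devices=2, slo_ms=12.0)
```

A real system would also need to account for transmission time and heterogeneous device speeds, neither of which this sketch models.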