CloudEye: A New Paradigm of Video Analysis System for Mobile Visual Scenarios

Huan Cui, Qing Li, Hanling Wang, Yong Jiang
2024-10-24
Abstract: Mobile deep vision systems play a vital role in numerous scenarios. However, deep learning applications in mobile vision scenarios face problems such as tight computing resources. With the development of edge computing, edge-cloud architectures have mitigated some of the issues related to limited computing resources, but they have also introduced increased latency. To address these challenges, we designed CloudEye, which consists of a Fast Inference Module, a Feature Mining Module, and a Quality Encode Module. CloudEye is a real-time, efficient mobile visual perception system that leverages content information mining on edge servers, in a mobile vision environment equipped with edge servers and coordinated with cloud servers. We developed a prototype system; in our experiments it reduces network bandwidth usage by 69.50%, increases inference speed by 24.55%, and improves detection accuracy by 67.30%.
Computer Vision and Pattern Recognition; Distributed, Parallel, and Cluster Computing
What problem does this paper attempt to address?
The paper attempts to address the challenges faced by deep learning applications in mobile visual scenarios, such as limited computational resources and increased latency. Specifically:

1. **Limited Computational Resources**: The computational power on mobile devices is limited, making it difficult to efficiently run complex deep learning models.
2. **High Latency**: Although edge computing architecture alleviates some computational resource constraints, it introduces additional latency, affecting real-time performance.
3. **High Bandwidth Consumption**: Offloading computational tasks entirely to the cloud consumes a large amount of network bandwidth, which can lead to significant performance degradation, especially under unstable network conditions.

To address these issues, the paper proposes the CloudEye system, which achieves efficient and real-time mobile visual perception through the following three modules:

1. **Fast Inference Module**:
   - Utilizes historical information in the time series to predict the target distribution of the current frame.
   - Uses a Kalman filter for target tracking, generating reliable proposals and reducing redundant computations.
   - Dynamically compresses video frames to reduce transmission bandwidth.
2. **Feature Mining Module**:
   - Leverages high-precision inference results from cloud servers to extract target features of the current frame.
   - Combines lightweight models from edge servers to extract key features from video frames.
   - Dynamically selects reference frames and extraction methods to ensure the target trajectory conforms to motion patterns.
3. **Quality Encode Module**:
   - Dynamically compresses video frames based on regions of interest (ROI) to ensure the accuracy of key information while reducing the entropy of background information.
   - Optimizes the transmission efficiency of video frames, reducing bandwidth consumption.
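
The ROI-based compression attributed to the Quality Encode Module can be illustrated with a per-block quantization map: blocks that overlap a region of interest get a low quantization parameter (higher quality), and background blocks get a high one (fewer bits). This is only a minimal sketch of that general idea, not the paper's encoder. The function name `qp_map`, the 16-pixel block size, and the QP values 22 and 40 are all assumptions chosen for illustration.

```python
def qp_map(frame_w, frame_h, rois, block=16, qp_roi=22, qp_bg=40):
    """Build a per-block quantization-parameter grid for one frame.

    rois is a list of (x0, y0, x1, y1) pixel boxes with x1/y1 exclusive.
    All names and default values here are hypothetical, not from the paper.
    """
    cols = (frame_w + block - 1) // block   # blocks per row (ceiling division)
    rows = (frame_h + block - 1) // block   # blocks per column
    grid = [[qp_bg] * cols for _ in range(rows)]

    for x0, y0, x1, y1 in rois:
        # Convert the pixel box to an inclusive block range, clamped to the frame.
        c0 = max(0, x0 // block)
        c1 = min(cols - 1, (x1 - 1) // block)
        r0 = max(0, y0 // block)
        r1 = min(rows - 1, (y1 - 1) // block)
        for r in range(r0, r1 + 1):
            for c in range(c0, c1 + 1):
                grid[r][c] = qp_roi         # keep detail where targets are
    return grid


if __name__ == "__main__":
    # A 64x32 frame with one ROI covering the second 16x16 block of the top row.
    for row in qp_map(64, 32, [(16, 0, 32, 16)]):
        print(row)
```

A map like this would then be handed to whatever encoder is in use; how CloudEye chooses its compression levels dynamically is not specified in this summary.
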
Through these modules, the CloudEye system demonstrated significant results in experiments, reducing network bandwidth usage by 69.50%, increasing inference speed by 24.55%, and improving detection accuracy by 67.30%.
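
To make the Kalman-filter tracking step of the Fast Inference Module more concrete, here is a minimal sketch of a constant-velocity Kalman filter applied to a single coordinate of a target (for example, the x position of a bounding-box centre). It is an illustration under stated assumptions rather than the paper's tracker: the class name `Kalman1D`, the constant-velocity motion model, and the noise values `q` and `r` are hypothetical choices.

```python
from dataclasses import dataclass


@dataclass
class Kalman1D:
    """Constant-velocity Kalman filter for one coordinate (hypothetical sketch)."""
    pos: float              # estimated position (pixels)
    vel: float = 0.0        # estimated velocity (pixels per frame)
    # 2x2 state covariance, stored element by element
    p00: float = 1.0
    p01: float = 0.0
    p10: float = 0.0
    p11: float = 1.0
    q: float = 1e-2         # process noise added to the diagonal (assumed)
    r: float = 1.0          # measurement noise variance (assumed)

    def predict(self, dt=1.0):
        """Advance the state one step; the result can seed a proposal."""
        self.pos += self.vel * dt
        # P = F P F^T + Q with F = [[1, dt], [0, 1]]
        p00 = self.p00 + dt * (self.p01 + self.p10) + dt * dt * self.p11 + self.q
        p01 = self.p01 + dt * self.p11
        p10 = self.p10 + dt * self.p11
        p11 = self.p11 + self.q
        self.p00, self.p01, self.p10, self.p11 = p00, p01, p10, p11
        return self.pos

    def update(self, z):
        """Correct the state with a measured position z (e.g. a detection)."""
        s = self.p00 + self.r           # innovation variance, H = [1, 0]
        k0 = self.p00 / s               # Kalman gain for position
        k1 = self.p10 / s               # Kalman gain for velocity
        y = z - self.pos                # innovation
        self.pos += k0 * y
        self.vel += k1 * y
        # P = (I - K H) P
        p00 = (1 - k0) * self.p00
        p01 = (1 - k0) * self.p01
        p10 = self.p10 - k1 * self.p00
        p11 = self.p11 - k1 * self.p01
        self.p00, self.p01, self.p10, self.p11 = p00, p01, p10, p11


if __name__ == "__main__":
    # Target moving 2 pixels per frame; detections arrive every frame.
    kf = Kalman1D(pos=0.0)
    for t in range(1, 50):
        kf.predict()
        kf.update(2.0 * t)
    print("predicted position for frame 50:", round(kf.predict(), 2))
```

In a tracker along these lines, the predicted position would be used to place a proposal in the next frame, so that a detector only needs to examine the neighbourhood of the prediction instead of the whole image.
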