Spatio-Temporal Fusion Based Low-Loss Video Compression Algorithm for UAVs with Limited Processing Capability

Qianyuan Zhang, Desheng Wan, Hao Chen, Lianghua Cheng, Jiayi Chen, Chaocan Xiang
DOI: https://doi.org/10.1007/978-981-97-0798-0_12
2024-01-01
Abstract: Real-time urban crowd surveillance is essential for riot supervision, epidemic prevention, and urban emergency management. Unmanned aerial vehicles (UAVs) are a promising platform for real-time crowd surveillance because they are easy to deploy and highly mobile. However, limited wireless transmission bandwidth and the large size of high-definition video make real-time transmission of UAV-captured video challenging. Existing edge computing-based video compression algorithms partially address this problem, but their complexity makes them unsuitable for edge devices with limited processing capacity. To this end, we propose a lightweight spatio-temporal fusion based low-loss video compression algorithm consisting of two modules: feature clustering-based temporal sampling and dynamic encoding-based spatial sampling. The first module trims the video along the temporal dimension by identifying inter-frame redundancy. The second module compresses the video along the spatial dimension by examining regions of interest (RoIs) within each frame and applying background filtering to guide intra-frame encoding. The algorithm reduces video file size while maintaining high-quality output, and it is light enough to run within the constrained processing power of edge devices. Experimental results show that the proposed algorithm incurs minimal loss in crowd detection accuracy while reducing transmission latency by 31.3%.
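The two modules described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the features, the clustering method, or the encoder, so everything below is an assumption made for illustration. Frames are assumed to be non-empty grayscale grids, the per-frame feature is an intensity histogram, the "clustering" is a simple greedy grouping of consecutive similar frames, and the function names (`frame_feature`, `temporal_sample`, `filter_background`), the `threshold` value, and the RoI rectangle format are all hypothetical.

```python
from typing import List, Sequence, Tuple

Frame = List[List[int]]  # grayscale frame: rows of pixel intensities in 0..255


def frame_feature(frame: Frame, bins: int = 8) -> List[float]:
    """Normalized intensity histogram, used here as a cheap per-frame feature.

    Assumes the frame contains at least one pixel.
    """
    hist = [0] * bins
    count = 0
    for row in frame:
        for px in row:
            hist[min(px * bins // 256, bins - 1)] += 1
            count += 1
    return [h / count for h in hist]


def l1_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Sum of absolute differences between two feature vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def temporal_sample(frames: List[Frame], threshold: float = 0.2) -> List[int]:
    """Temporal sampling: keep the first frame of each run of similar frames.

    A frame starts a new group when its feature differs from the last kept
    frame's feature by more than `threshold` (an assumed tuning parameter).
    """
    kept: List[int] = []
    anchor = None  # feature of the most recently kept frame
    for i, frame in enumerate(frames):
        feat = frame_feature(frame)
        if anchor is None or l1_distance(feat, anchor) > threshold:
            kept.append(i)
            anchor = feat
    return kept


def filter_background(
    frame: Frame, roi: Tuple[int, int, int, int], fill: int = 0
) -> Frame:
    """Spatial sampling: keep pixels inside the RoI and flatten the rest.

    `roi` is (top, left, bottom, right) with bottom and right exclusive.
    A uniform background is cheaper for a downstream encoder to represent.
    """
    top, left, bottom, right = roi
    return [
        [px if top <= r < bottom and left <= c < right else fill
         for c, px in enumerate(row)]
        for r, row in enumerate(frame)
    ]
```

In this sketch, `temporal_sample` decides which frames are transmitted at all, and `filter_background` would then be applied to each kept frame using RoIs from a crowd detector, which is outside the scope of the example.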