Data-Model-Circuit Tri-Design for Ultra-Light Video Intelligence on Edge Devices

Yimeng Zhang,Akshay Karkal Kamath,Qiucheng Wu,Zhiwen Fan,Wuyang Chen,Zhangyang Wang,Shiyu Chang,Sijia Liu,Cong Hao
DOI: https://doi.org/10.1145/3566097.3567864
2022-10-18
Abstract:In this paper, we propose a data-model-hardware tri-design framework for high-throughput, low-cost, and high-accuracy multi-object tracking (MOT) on High-Definition (HD) video stream. First, to enable ultra-light video intelligence, we propose temporal frame-filtering and spatial saliency-focusing approaches to reduce the complexity of massive video data. Second, we exploit structure-aware weight sparsity to design a hardware-friendly model compression method. Third, assisted with data and model complexity reduction, we propose a sparsity-aware, scalable, and low-power accelerator design, aiming to deliver real-time performance with high energy efficiency. Different from existing works, we make a solid step towards the synergized software/hardware co-optimization for realistic MOT model implementation. Compared to the state-of-the-art MOT baseline, our tri-design approach can achieve 12.5x latency reduction, 20.9x effective frame rate improvement, 5.83x lower power, and 9.78x better energy efficiency, without much accuracy drop.
Computer Vision and Pattern Recognition,Hardware Architecture
What problem does this paper attempt to address?