Computation-efficient Deep Learning for Computer Vision: A Survey

Yulin Wang, Yizeng Han, Chaofei Wang, Shiji Song, Qi Tian, Gao Huang
2023-08-27
Abstract: Over the past decade, deep learning models have exhibited considerable advancements, reaching or even exceeding human-level performance in a range of visual perception tasks. This remarkable progress has sparked interest in applying deep networks to real-world applications, such as autonomous vehicles, mobile devices, robotics, and edge computing. However, the challenge remains that state-of-the-art models usually demand significant computational resources, leading to impractical power consumption, latency, or carbon emissions in real-world scenarios. This trade-off between effectiveness and efficiency has catalyzed the emergence of a new research focus: computationally efficient deep learning, which strives to achieve satisfactory performance while minimizing the computational cost during inference. This review offers an extensive analysis of this rapidly evolving field by examining four key areas: 1) the development of static or dynamic light-weighted backbone models for the efficient extraction of discriminative deep representations; 2) the specialized network architectures or algorithms tailored for specific computer vision tasks; 3) the techniques employed for compressing deep learning models; and 4) the strategies for deploying efficient deep networks on hardware platforms. Additionally, we provide a systematic discussion on the critical challenges faced in this domain, such as network architecture design, training schemes, practical efficiency, and more realistic model compression approaches, as well as potential future research directions.
Computer Vision and Pattern Recognition, Artificial Intelligence, Machine Learning, Multimedia
What problem does this paper attempt to address?
The paper primarily addresses the computational efficiency issues faced by deep learning models in computer vision. It focuses on the following core problems:

1. **Balancing Computational Efficiency and Performance**: Although deep learning models have achieved significant success in various visual tasks, even reaching or surpassing human-level performance, they often require substantial computational resources. In practical applications (such as autonomous driving, mobile devices, and robotics), this can lead to impractical power consumption, latency, or carbon emissions.
2. **Methods for Designing Efficient Deep Learning Models**: To tackle these challenges, researchers are developing computationally efficient deep learning models that minimize the computational cost during inference while maintaining satisfactory performance. This includes, but is not limited to:
   - Developing lightweight backbone networks for efficiently extracting discriminative feature representations from images, videos, or 3D scenes;
   - Designing specialized network architectures or algorithms for specific computer vision tasks;
   - Investigating deep learning model compression techniques;
   - Exploring strategies for deploying efficient deep networks on hardware platforms.
3. **Organization of the Review**: The paper provides an extensive and in-depth analysis of the rapidly evolving field of computationally efficient deep learning through four key areas:
   1. The design of static or dynamic lightweight backbone networks for efficient and discriminative deep representation extraction;
   2. Custom network architectures or algorithms tailored for specific computer vision tasks;
   3. Deep learning model compression techniques;
   4. Strategies for deploying efficient deep networks on hardware platforms.

Through these studies, the paper aims to offer researchers in this rapidly developing field a comprehensive overview, summarize recent advancements, and highlight important challenges and future research directions.