A Comprehensive Survey of Convolutions in Deep Learning: Applications, Challenges, and Future Trends

Abolfazl Younesi, Mohsen Ansari, MohammadAmin Fazli, Alireza Ejlali, Muhammad Shafique, Jörg Henkel
2024-02-24
Abstract: In today's digital age, Convolutional Neural Networks (CNNs), a subset of Deep Learning (DL), are widely used for various computer vision tasks such as image classification, object detection, and image segmentation. There are numerous types of CNNs designed to meet specific needs and requirements, including 1D, 2D, and 3D CNNs, as well as dilated, grouped, attention, depthwise convolutions, and NAS, among others. Each type of CNN has its unique structure and characteristics, making it suitable for specific tasks. It's crucial to gain a thorough understanding and perform a comparative analysis of these different CNN types to understand their strengths and weaknesses. Furthermore, studying the performance, limitations, and practical applications of each type of CNN can aid in the development of new and improved architectures in the future. We also dive into the platforms and frameworks that researchers utilize for their research or development from various perspectives. Additionally, we explore the main research fields of CNN like 6D vision, generative models, and meta-learning. This survey paper provides a comprehensive examination and comparison of various CNN architectures, highlighting their architectural differences and emphasizing their respective advantages, disadvantages, applications, challenges, and future trends.
Neural and Evolutionary Computing, Machine Learning
What problem does this paper attempt to address?
This paper is a comprehensive survey of convolution techniques in deep learning, focusing primarily on the applications, challenges, and future trends of convolutional neural networks (CNNs) in computer vision tasks. CNNs are widely used for tasks such as image classification, object detection, and image segmentation because of their strong performance. However, different types of convolution (such as 1D, 2D, and 3D convolutions, dilated convolutions, etc.) have their own characteristics and suit different scenarios. The paper discusses the performance and limitations of CNNs on resource-constrained devices, along with optimization methods such as lightweight architectures and compression techniques. Furthermore, it discusses how various convolution techniques adapt to different AI applications and compares the advantages and disadvantages of existing CNN architectures, proposing a classification system based on design patterns rather than publication years. The paper also covers recent trends such as attention mechanisms, generative adversarial networks (GANs), and meta-learning, and analyzes how different CNN models behave on hardware in terms of accuracy, latency, and memory usage. Finally, the paper provides insights into future research directions to accelerate advancements in this field.
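
To make the differences between the convolution types named above more concrete, the following is a minimal sketch, not taken from the paper, that counts weights for standard, grouped, and depthwise convolutions and computes how dilation enlarges a kernel's footprint. The function names (conv_params, effective_kernel, output_size) and the channel sizes used in the usage note are illustrative assumptions; the formulas are the standard ones for these layer types.

```python
from math import prod


def conv_params(c_in, c_out, kernel, groups=1, bias=False):
    """Count learnable parameters of a convolution layer.

    `kernel` is a tuple with one entry per spatial dimension, so
    (3,) describes a 1D, (3, 3) a 2D, and (3, 3, 3) a 3D convolution.
    Hypothetical helper written for illustration only.
    """
    if c_in % groups or c_out % groups:
        raise ValueError("c_in and c_out must both be divisible by groups")
    # Each output channel only sees c_in / groups input channels.
    weights = (c_in // groups) * c_out * prod(kernel)
    return weights + (c_out if bias else 0)


def effective_kernel(kernel_size, dilation=1):
    """Span covered by a dilated kernel along one dimension."""
    return dilation * (kernel_size - 1) + 1


def output_size(n, kernel_size, stride=1, padding=0, dilation=1):
    """Output length along one dimension for an input of length n."""
    span = effective_kernel(kernel_size, dilation)
    return (n + 2 * padding - span) // stride + 1


def depthwise_separable_params(c_in, c_out, kernel):
    """Depthwise convolution (groups == c_in) followed by a 1x1 pointwise one."""
    depthwise = conv_params(c_in, c_in, kernel, groups=c_in)
    pointwise = conv_params(c_in, c_out, (1,) * len(kernel))
    return depthwise + pointwise
```

With 64 input and 128 output channels (arbitrary example values), a standard 3x3 convolution has 73,728 weights, the same layer with 4 groups has 18,432, and a depthwise separable replacement has 8,768. A 3-wide kernel with dilation 2 spans 5 input positions without adding any weights, which is the usual motivation for dilated convolutions.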