Abstract:Due to object detection's close relationship with video analysis and image understanding, it has attracted much research attention in recent years. Traditional object detection methods are built on handcrafted features and shallow trainable architectures. Their performance easily stagnates by constructing complex ensembles which combine multiple low-level image features with high-level context from object detectors and scene classifiers. With the rapid development in deep learning, more powerful tools, which are able to learn semantic, high-level, deeper features, are introduced to address the problems existing in traditional architectures. These models behave differently in network architecture, training strategy and optimization function, etc. In this paper, we provide a review on deep learning based object detection frameworks. Our review begins with a brief introduction on the history of deep learning and its representative tool, namely Convolutional Neural Network (CNN). Then we focus on typical generic object detection architectures along with some modifications and useful tricks to improve detection performance further. As distinct specific detection tasks exhibit different characteristics, we also briefly survey several specific tasks, including salient object detection, face detection and pedestrian detection. Experimental analyses are also provided to compare various methods and draw some meaningful conclusions. Finally, several promising directions and tasks are provided to serve as guidelines for future work in both object detection and relevant neural network based learning systems.

What problem does this paper attempt to address?

The paper primarily addresses several key issues in the field of object detection and provides a comprehensive review of deep learning-based object detection methods. Specifically, the paper attempts to solve the following core problems: 1. **Limitations of traditional object detection methods**: Traditional methods rely on manually designed features (such as SIFT, HOG, etc.) and shallow learning architectures (such as SVM). These methods are limited in performance in complex scenarios, especially when dealing with changes in viewpoint, pose, occlusion, and lighting conditions. 2. **Introducing deep learning to enhance performance**: With the development of deep learning technology, particularly the application of Convolutional Neural Networks (CNNs), it is possible to learn more semantic high-level features, thereby overcoming the limitations of traditional methods. 3. **Improvements in object detection frameworks**: The paper details the development of models from early R-CNN to later Fast R-CNN, Faster R-CNN, and so on, and how the introduction of Region Proposal Networks (RPN), multi-task learning, and other methods have improved detection speed and accuracy. 4. **Object detection for specific tasks**: In addition to general object detection, the paper also discusses methods and techniques in specific tasks such as salient object detection, face detection, and pedestrian detection. 5. **Experimental analysis and future directions**: The paper provides experimental comparisons of various methods to evaluate their performance and proposes some potential directions for future research. In summary, this paper aims to provide a systematic review and summary of deep learning-based object detection methods, offering a comprehensive guide for researchers in the field of object detection and pointing out future research directions.

Object Detection with Deep Learning: A Review

A review of object detection based on deep learning

A Survey of Deep Learning-based Object Detection

Review of Deep Learning Based Object Detection Methods and Their Mainstream Frameworks

A review of object detection: Datasets, performance evaluation, architecture, applications and current trends

Recent advances in deep learning for object detection

Visual Object Detection: A Review

From classical techniques to convolution-based models: A review of object detection algorithms

A Survey of Object Detection Models Based on Deep Learning

A Survey of Image Object Detection Algorithm Based on Deep Learning

Convolutional neural network: a review of models, methodologies and applications to object detection

Survey of Object Detection Based on Deep Convolutional Network

A Comprehensive Review of One-stage Networks for Object Detection

Investigations of Object Detection in Images/Videos Using Various Deep Learning Techniques and Embedded Platforms—A Comprehensive Review

Recent advances in small object detection based on deep learning: A review

A Review of the Application of Convolutional Neural Networks in Object Detection

A review of small object detection based on deep learning

A Review of Remote Sensing Image Object Detection Algorithms Based on Deep Learning

Review of Object Detection Techniques

Deep Learning for Generic Object Detection: A Survey

Object Detection Using Deep Learning, CNNs and Vision Transformers: A Review