Object Detection with Deep Learning: A Review

Zhong-Qiu Zhao, Peng Zheng, Shou-tao Xu, Xindong Wu
2019-04-16
Abstract: Due to object detection's close relationship with video analysis and image understanding, it has attracted much research attention in recent years. Traditional object detection methods are built on handcrafted features and shallow trainable architectures. Their performance easily stagnates by constructing complex ensembles which combine multiple low-level image features with high-level context from object detectors and scene classifiers. With the rapid development in deep learning, more powerful tools, which are able to learn semantic, high-level, deeper features, are introduced to address the problems existing in traditional architectures. These models behave differently in network architecture, training strategy and optimization function, etc. In this paper, we provide a review on deep learning based object detection frameworks. Our review begins with a brief introduction on the history of deep learning and its representative tool, namely Convolutional Neural Network (CNN). Then we focus on typical generic object detection architectures along with some modifications and useful tricks to improve detection performance further. As distinct specific detection tasks exhibit different characteristics, we also briefly survey several specific tasks, including salient object detection, face detection and pedestrian detection. Experimental analyses are also provided to compare various methods and draw some meaningful conclusions. Finally, several promising directions and tasks are provided to serve as guidelines for future work in both object detection and relevant neural network based learning systems.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
The paper reviews deep learning-based object detection methods and the problems that motivated them. Specifically, it covers the following core points:

1. **Limitations of traditional object detection methods**: Traditional methods rely on manually designed features (such as SIFT and HOG) and shallow learning architectures (such as SVM). Their performance is limited in complex scenarios, especially under changes in viewpoint, pose, occlusion, and lighting conditions.
2. **Introducing deep learning to enhance performance**: With the development of deep learning, particularly Convolutional Neural Networks (CNNs), models can learn semantic, high-level features and thereby overcome the limitations of traditional methods.
3. **Improvements in object detection frameworks**: The paper details the development of models from the early R-CNN to the later Fast R-CNN, Faster R-CNN, and so on, and how Region Proposal Networks (RPN), multi-task learning, and other methods have improved detection speed and accuracy.
4. **Object detection for specific tasks**: In addition to generic object detection, the paper discusses methods and techniques for specific tasks such as salient object detection, face detection, and pedestrian detection.
5. **Experimental analysis and future directions**: The paper provides experimental comparisons of various methods to evaluate their performance and proposes potential directions for future research.

In summary, this paper provides a systematic review of deep learning-based object detection methods, offering a guide for researchers in the field and pointing out future research directions.
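
To make the "multi-task learning" mentioned in point 3 more concrete, here is a minimal sketch of a Fast R-CNN-style loss for a single region of interest, which adds a classification term and a box-regression term. The function names, the `lam` weight, and the plain-list inputs are illustrative assumptions; this is not code from the paper.

```python
import math


def smooth_l1(x):
    """Smooth L1 penalty: quadratic near zero, linear for larger errors."""
    ax = abs(x)
    return 0.5 * x * x if ax < 1.0 else ax - 0.5


def multitask_loss(class_probs, true_class, pred_box, target_box, lam=1.0):
    """Hypothetical per-RoI loss: classification plus box regression.

    class_probs : softmax probabilities, index 0 assumed to be background
    true_class  : ground-truth class index for this RoI
    pred_box    : predicted box offsets (4 numbers) for the true class
    target_box  : regression targets (4 numbers)
    lam         : assumed weight balancing the two terms
    """
    # Classification term: negative log-likelihood of the true class.
    cls_loss = -math.log(class_probs[true_class])
    # Background RoIs have no ground-truth box, so regression is skipped.
    if true_class == 0:
        return cls_loss
    # Localization term: smooth L1 summed over the four box coordinates.
    loc_loss = sum(smooth_l1(p - t) for p, t in zip(pred_box, target_box))
    return cls_loss + lam * loc_loss


if __name__ == "__main__":
    probs = [0.1, 0.7, 0.2]  # background, class 1, class 2
    loss = multitask_loss(probs, 1, [0.1, 0.2, 0.0, 0.3], [0.0, 0.0, 0.0, 0.0])
    print(f"loss for a foreground RoI: {loss:.4f}")
```

Training both terms through one shared network is what allows a single forward pass to produce class scores and refined boxes together, instead of fitting separate classifiers and regressors in stages.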