Abstract: Deep learning is currently the mainstream approach to object detection, and the faster region-based convolutional neural network (Faster R-CNN) holds a pivotal position within it. Faster R-CNN detects objects well in ordinary scenes, but its performance can still be unsatisfactory under special conditions, such as when objects are occluded, deformed, or small. This paper proposes an improved algorithm that extends the Faster R-CNN framework with skip pooling and fusion of contextual information, and that improves on the detection performance of Faster R-CNN under these conditions. The improvement has three parts. The first adds a context information feature extraction model after conv5_3 of the convolutional layers, so that the contextual information of the object can be fully obtained, especially when the object is occluded or deformed. The second adds skip pooling, which obtains more detailed information from different feature layers of the deep neural network and is aimed especially at scenes with small objects. The third replaces the region proposal network (RPN) with a more efficient guided anchor RPN (GA-RPN), which maintains the recall rate while improving detection performance. Compared with Faster R-CNN, the you only look once series (such as YOLOv3), the single shot detector (such as SSD512), and other object detection algorithms, the proposed algorithm improves the mean average precision (mAP) by 6.857% on average while maintaining a certain recall rate. This result indicates that the proposed method achieves a higher detection rate and detection efficiency under these conditions.
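
The idea of combining skip pooling with contextual information can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation, and everything in it is an assumption made for illustration: the function names (`roi_max_pool`, `l2_normalize`, `enlarge_box`, `skip_pool_with_context`), the 2 x 2 pooling grid, the L2 normalization before concatenation, and the `context_scale` of 1.5 are all hypothetical. In particular, the abstract does not specify the context information feature extraction model, so the sketch substitutes a simple stand-in: pooling an enlarged box on the deepest layer.

```python
import math


def roi_max_pool(fmap, box, stride, out_size=2):
    """Max-pool the region of a [C][H][W] feature map covered by an image-space box."""
    height, width = len(fmap[0]), len(fmap[0][0])
    # Project the box (x1, y1, x2, y2) onto this layer's grid and clamp it.
    x1 = min(max(math.floor(box[0] / stride), 0), width - 1)
    y1 = min(max(math.floor(box[1] / stride), 0), height - 1)
    x2 = min(max(math.ceil(box[2] / stride), x1 + 1), width)
    y2 = min(max(math.ceil(box[3] / stride), y1 + 1), height)
    w, h = x2 - x1, y2 - y1
    pooled = []
    for channel in fmap:
        for by in range(out_size):
            ys = y1 + (by * h) // out_size
            ye = y1 + math.ceil((by + 1) * h / out_size)
            for bx in range(out_size):
                xs = x1 + (bx * w) // out_size
                xe = x1 + math.ceil((bx + 1) * w / out_size)
                pooled.append(max(channel[y][x]
                                  for y in range(ys, ye)
                                  for x in range(xs, xe)))
    return pooled


def l2_normalize(vec, eps=1e-12):
    """Scale a vector to unit L2 norm so layers with different magnitudes can be mixed."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / (norm + eps) for v in vec]


def enlarge_box(box, scale):
    """Grow a box around its centre (assumed stand-in for a context region)."""
    cx, cy = (box[0] + box[2]) / 2, (box[1] + box[3]) / 2
    half_w = (box[2] - box[0]) * scale / 2
    half_h = (box[3] - box[1]) * scale / 2
    return (cx - half_w, cy - half_h, cx + half_w, cy + half_h)


def skip_pool_with_context(layers, box, context_scale=1.5, out_size=2):
    """Concatenate normalized ROI features from every layer plus a context feature.

    layers: list of (feature_map, stride) pairs, ordered shallow to deep.
    """
    descriptor = []
    # Skip pooling: pool the same region of interest from each layer.
    for fmap, stride in layers:
        descriptor += l2_normalize(roi_max_pool(fmap, box, stride, out_size))
    # Context stand-in (assumption): pool an enlarged box on the deepest layer.
    deep_map, deep_stride = layers[-1]
    context_box = enlarge_box(box, context_scale)
    descriptor += l2_normalize(
        roi_max_pool(deep_map, context_box, deep_stride, out_size))
    return descriptor
```

Called with a shallow map at stride 4 and a deep map at stride 8, the function returns one flat descriptor whose length is the sum of the per-layer pooled sizes plus the context feature. The shallow layer contributes finer spatial detail for small objects, while the enlarged box contributes surroundings that may help when the object itself is occluded or deformed.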

Supplemental Material: Inside-Outside Net: Detecting Objects in Context with Skip Pooling and Recurrent Neural Networks

Inside-Outside Net: Detecting Objects in Context with Skip Pooling and Recurrent Neural Networks

Object Detection Based on Faster R-CNN Algorithm with Skip Pooling and Fusion of Contextual Information

Tiny-RetinaNet: A One-Stage Detector for Real-Time Object Detection

Multi-scale Global Context Feature Pyramid Network for Object Detector

D-NMS: A Dynamic NMS Network for General Object Detection

PoolNet+: Exploring the Potential of Pooling for Salient Object Detection

A MultiPath Network for Object Detection

SSD: Single Shot MultiBox Detector

SFGNet Detecting Objects Via Spatial Fine-Grained Feature and Enhanced RPN with Spatial Context

Image Processing: Facilitating Retinanet for Detecting Small Objects

R-FCN++: Towards Accurate Region-Based Fully Convolutional Networks for Object Detection

Efficient Selective Context Network for Accurate Object Detection

CenterNet: Keypoint Triplets for Object Detection

MSF-CSPNet: A Specially Designed Backbone Network for Faster R-CNN

DIGCN: A Dynamic Interaction Graph Convolutional Network Based on Learnable Proposals for Object Detection

SGCCNet: Single-Stage 3D Object Detector With Saliency-Guided Data Augmentation and Confidence Correction Mechanism

CoupleNet: Coupling Global Structure with Local Parts for Object Detection