Abstract: Two problems in SSD still cause inaccurate results: (1) during feature extraction, as semantic information is acquired layer by layer, local information is gradually lost, resulting in less representative feature maps; (2) in the Non-Maximum Suppression (NMS) algorithm, because the classification and regression tasks are inconsistent, the classification confidence and predicted detection position cannot accurately indicate the position of the prediction boxes. Methods: To address these issues, we propose a new architecture, a modified version of the Single Shot Multibox Detector (SSD), named Precise Single Stage Detector (PSSD). First, we improve the features by adding extra layers to SSD. Second, we construct a simple and effective feature enhancement module that expands the receptive field step by step for each layer and enhances its local and semantic information. Finally, we design a more efficient loss function to predict the IoU between the prediction boxes and the ground truth boxes; the threshold IoU guides classification training and attenuates the scores used by the NMS algorithm. Main Results: Benefiting from these optimizations, PSSD achieves strong real-time performance. Specifically, on a Titan Xp with an input size of 320 pixels, PSSD achieves 33.8 mAP at 45 FPS on the MS COCO benchmark and 81.28 mAP at 66 FPS on Pascal VOC 2007, outperforming state-of-the-art object detection models. The proposed model also performs well with a larger input size: at 512 pixels, PSSD obtains 37.2 mAP at 27 FPS on MS COCO and 82.82 mAP at 40 FPS on Pascal VOC 2007. The experimental results show that the proposed model offers a better trade-off between speed and accuracy.
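The idea of attenuating classification scores with a predicted IoU before NMS can be illustrated with a minimal sketch. The abstract does not give the attenuation rule, so the product of classification score and predicted IoU used here is an assumption, and the names `attenuate_scores`, `nms`, and `iou` are hypothetical helpers rather than the authors' implementation.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def attenuate_scores(cls_scores: List[float], pred_ious: List[float]) -> List[float]:
    """Assumed rule: scale each classification score by the predicted IoU."""
    return [s * q for s, q in zip(cls_scores, pred_ious)]


def nms(boxes: List[Box], scores: List[float], iou_threshold: float = 0.5) -> List[int]:
    """Greedy NMS: keep a box unless it overlaps an already kept, higher-scoring box."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
cls_scores = [0.9, 0.8, 0.7]   # classification confidence only
pred_ious = [0.5, 0.9, 0.8]    # hypothetical predicted localization quality

plain_keep = nms(boxes, cls_scores)
attenuated_keep = nms(boxes, attenuate_scores(cls_scores, pred_ious))
```

In this toy case the first two boxes overlap heavily. Ranking by classification score alone keeps the first box, whereas ranking by the attenuated score keeps the second, which has the higher predicted IoU. This is the kind of mismatch between confidence and localization that the abstract describes.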

TDFSSD: Top-Down Feature Fusion Single Shot MultiBox Detector

FSSD: Feature Fusion Single Shot Multibox Detector

SSD: Single Shot MultiBox Detector

FFR-SSD: feature fusion and reconstruction single shot detector for multi-scale object detection

A Rich Feature Fusion Single-Stage Object Detector.

Single-Shot Object Detection via Feature Enhancement and Channel Attention

DSSD : Deconvolutional Single Shot Detector

Extend the Shallow Part of Single Shot MultiBox Detector Via Convolutional Neural Network

FESSD:SSD target detection based on feature fusion and feature enhancement

MDFN: Multi-scale deep feature learning network for object detection

Comprehensive Feature Enhancement Module For Single-Shot Object Detector

Precise Single-stage Detector

Small Object Detection Algorithm Based on Feature Pyramid-Enhanced Fusion SSD

Multi-Source Features Fusion Single Stage 3D Object Detection with Transformer.

Detecting Small Objects in Thermal Images Using Single-Shot Detector

DPSSD: Dual-Path Single-Shot Detector

ASFD: Automatic and Scalable Face Detector

Cross-scale information enhancement for object detection

CvT-ASSD: Convolutional vision-Transformer Based Attentive Single Shot MultiBox Detector

3DSSD: Point-based 3D Single Stage Object Detector

Single Shot Object Detection with Top-Down Refinement.