Abstract:<p>Small object detection is highly challenging because small objects offer limited resolution and information. Current state-of-the-art detectors rely only on appearance features to locate and classify objects. Such detectors are prone to failure on small objects, especially under heavy appearance changes and background distractors, where appearance features alone are not sufficient for robust detection. Exploiting context information from the surrounding scene can be highly beneficial in such cases. In this paper, we propose a novel detector, the Internal-External Network (IENet), which uses both the appearance and the context information of an object for robust detection. The proposed approach improves small object detection at three stages: feature extraction, proposal localization, and classification. Specifically, three customized modules are designed: the Bidirectional Feature Fusion Module (Bi-FFM), the Context Reasoning Module (CRM), and the Context Feature Augmentation Module (CFAM). Bi-FFM is designed to capture the internal features of objects by transferring the semantic features of deeper-level layers to lower-level layers and the detailed features of lower-level layers to deeper-level layers. The proposed approach thus not only utilizes the hierarchy of convolutional features but also improves its predictions through context relationships. CRM is designed to improve the quality of region proposals through context reasoning, which uses easily detected objects to help understand hard ones. Furthermore, CFAM is designed to learn pair-wise relations between the region proposals produced by CRM; these relations are used to produce global feature information associated with the region proposals for accurate classification. Extensive experiments are conducted on the challenging COCO and WIDER FACE datasets to demonstrate the effectiveness of the proposed approach.
Experimental results show that detection performance on small objects is greatly improved over state-of-the-art detectors.</p>
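
The two ideas the abstract describes in most detail, bidirectional feature fusion and pair-wise relations between region proposals, can be illustrated with a minimal NumPy sketch. This is not the IENet implementation: the function names (`bidirectional_fuse`, `pairwise_context`), the additive fusion, the nearest-neighbour upsampling, the average-pooling downsampling, and the scaled dot-product affinity are all assumptions chosen for illustration, and the abstract does not specify any of them.

```python
import numpy as np

def upsample2x(x):
    """Nearest-neighbour 2x upsampling of a (C, H, W) feature map."""
    return x.repeat(2, axis=1).repeat(2, axis=2)

def downsample2x(x):
    """2x2 average pooling of a (C, H, W) feature map (H and W must be even)."""
    c, h, w = x.shape
    return x.reshape(c, h // 2, 2, w // 2, 2).mean(axis=(2, 4))

def bidirectional_fuse(pyramid):
    """Fuse a feature pyramid in both directions (hypothetical helper).

    pyramid: list of (C, H, W) arrays ordered fine -> coarse, where each
    level has half the spatial size of the previous one.
    """
    # Top-down pass: carry semantic features from deeper levels to shallower ones.
    top_down = list(pyramid)
    for i in range(len(pyramid) - 2, -1, -1):
        top_down[i] = pyramid[i] + upsample2x(top_down[i + 1])
    # Bottom-up pass: carry detailed features from shallower levels to deeper ones.
    fused = list(top_down)
    for i in range(1, len(pyramid)):
        fused[i] = top_down[i] + downsample2x(fused[i - 1])
    return fused

def pairwise_context(proposal_feats):
    """Augment each proposal with relation-weighted context (hypothetical helper).

    proposal_feats: (N, D) array, one feature vector per region proposal.
    Returns an (N, 2 * D) array: the original feature concatenated with a
    softmax-weighted sum over all proposals.
    """
    n, d = proposal_feats.shape
    scores = proposal_feats @ proposal_feats.T / np.sqrt(d)  # pair-wise affinities
    scores = scores - scores.max(axis=1, keepdims=True)      # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=1, keepdims=True)   # row-wise softmax
    context = weights @ proposal_feats                       # global context per proposal
    return np.concatenate([proposal_feats, context], axis=1)

if __name__ == "__main__":
    levels = [np.ones((4, 8, 8)), np.ones((4, 4, 4)), np.ones((4, 2, 2))]
    print([f.shape for f in bidirectional_fuse(levels)])
    feats = np.random.default_rng(0).normal(size=(5, 16))
    print(pairwise_context(feats).shape)
```

In this sketch every pyramid level ends up receiving information from both finer and coarser levels, and every proposal feature is extended with a summary of all other proposals. A learned version would replace the plain additions and dot products with trainable layers.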
