Abstract: Deep-learning-based object detectors have substantially improved state-of-the-art object detection in remote sensing images in terms of precision and degree of automation. Nevertheless, the large variation in object scales makes it difficult to achieve high-quality detection across multiresolution remote sensing images, where quality is defined by the Intersection over Union (IoU) threshold used in training. In addition, the imbalance between positive and negative samples across multiresolution images degrades detection precision. Recently, it was found that a Cascade region-based convolutional neural network (R-CNN) can potentially achieve higher-quality detection by introducing a cascaded three-stage structure with progressively increasing IoU thresholds. However, the performance of Cascade R-CNN degraded when a fourth stage was added. We investigated the cause and found that a mismatch between the region of interest (RoI) features and the classifier could be responsible for the degradation. Herein, we propose a Cascade R-CNN++ structure to address this issue and extend the three-stage architecture to multiple stages for general use. Specifically, for cascaded classification, we propose a new ensemble strategy for the classifier and RoI features to improve classification accuracy at inference. For localization, we modified the loss function of the bounding box regressor to obtain higher sensitivity around zero. Experiments on the DOTA dataset demonstrated that Cascade R-CNN++ outperforms Cascade R-CNN in terms of precision and detection quality. We conducted further analysis on multiresolution remote sensing images to verify model transferability across different object scales.
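
To make the notion of detection quality concrete, the following minimal Python sketch computes the IoU of two axis-aligned boxes and checks whether a proposal would count as a positive sample at each stage of a cascade. It is an illustration only, not the authors' implementation: the function names `iou` and `positive_per_stage` are hypothetical, and the thresholds 0.5, 0.6, and 0.7 are assumed here because they are the values commonly reported for the three-stage Cascade R-CNN; the abstract itself does not state them.

```python
from typing import List, Sequence, Tuple

# A box is (x1, y1, x2, y2) with x1 < x2 and y1 < y2.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over Union of two axis-aligned boxes."""
    # Corners of the intersection rectangle (may be empty).
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def positive_per_stage(
    proposal: Box,
    ground_truth: Box,
    thresholds: Sequence[float] = (0.5, 0.6, 0.7),  # assumed values
) -> List[bool]:
    """For each cascade stage, report whether the proposal is a positive."""
    overlap = iou(proposal, ground_truth)
    return [overlap >= t for t in thresholds]


if __name__ == "__main__":
    gt = (0.0, 0.0, 10.0, 10.0)
    proposal = (0.0, 2.0, 10.0, 12.0)  # shifted by 2 units, IoU = 80 / 120
    print(iou(proposal, gt))
    print(positive_per_stage(proposal, gt))
```

In this example the proposal overlaps the ground truth with an IoU of about 0.67, so it is a positive for the stages with thresholds 0.5 and 0.6 but a negative at 0.7. This shows how later stages train on stricter, higher-quality samples.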

A Multi-object Detection Sampling Algorithm for Large Scenes

A Lightweight SE-YOLOv3 Network for Multi-Scale Object Detection in Remote Sensing Imagery

Multi-scene small object detection with modified YOLOv4

An improved YOLOv5 method for large objects detection with multi-scale feature cross-layer fusion network

A fast self-attention cascaded network for object detection in large scene remote sensing images

A Multi-Scale Traffic Object Detection Algorithm for Road Scenes Based on Improved YOLOv5

Parallel Cascade R-CNN for Object Detection in Remote Sensing Imagery

Road Scene Multi-Object Detection Algorithm Based on CMS-YOLO

Object Detection in Aerial Remote Sensing Images with Multi-scale Feature Enhancement

CF-YOLOX: An Autonomous Driving Detection Model for Multi-Scale Object Detection

AdaZoom: Towards Scale-Aware Large Scene Object Detection

Sampling Techniques for Large-Scale Object Detection From Sparsely Annotated Objects

A Lightweight Object Detection Algorithm for Remote Sensing Images Based on Attention Mechanism and YOLOv5s

High Quality Object Detection for Multiresolution Remote Sensing Imagery Using Cascaded Multi-Stage Detectors

Adaptively Attentional Feature Fusion Oriented to Multiscale Object Detection in Remote Sensing Images

A YOLO-Based Method for Head Detection in Complex Scenes

Multi-scale Object Detection Algorithm in Smart City Based on Mixed Dilated Convolution Pyramid

Small object intelligent detection method based on adaptive recursive feature pyramid

Convolutional Neural Networks-Based Object Detection Algorithm by Jointing Semantic Segmentation for Images

SDSDet: A real-time object detector for small, dense, multi-scale remote sensing objects

SSN: Scale Selection Network for Multi-Scale Object Detection in Remote Sensing Images