Abstract: Ensemble methods are a reliable way to combine several models to achieve superior performance. However, the application of ensemble methods to remote sensing object detection has been mostly overlooked. Two problems arise. First, one unique characteristic of remote sensing object detection is the Oriented Bounding Boxes (OBBs) of the objects, and the fusion of multiple OBBs requires further research attention. Second, the widely used deep learning object detectors provide a score for each detected object as an indicator of confidence, but how to use these indicators effectively in an ensemble method remains a problem. To address these problems, this paper proposes OBBStacking, an ensemble method that is compatible with OBBs and combines the detection results in a learned fashion. This ensemble method helped take 1st place in the Challenge Track *Fine-grained Object Recognition in High-Resolution Optical Images*, which was featured in the *2021 Gaofen Challenge on Automated High-Resolution Earth Observation Image Interpretation*. Experiments on the DOTA and FAIR1M datasets demonstrate the improved performance of OBBStacking, and the features of OBBStacking are analyzed.

What problem does this paper attempt to address?

The paper addresses two main problems:

1. **Fusion of Oriented Bounding Boxes (OBBs) in object detection**: In remote sensing images, objects are described by Oriented Bounding Boxes, which is a distinguishing characteristic of the task. Unlike a traditional horizontal bounding box, an OBB can represent an object at an arbitrary angle. How to effectively fuse the OBBs generated by multiple models therefore requires further research.
2. **Effective use of the confidence scores provided by deep-learning object detectors**: Deep-learning object detectors output a confidence score for each detected object as an indicator of its correctness. How to use these scores effectively in an ensemble method remains a challenge.

To address these problems, the paper proposes **OBBStacking**, an OBB-compatible ensemble method that combines multiple detection results in a learned manner. Specifically, OBBStacking handles the following two key issues:

- **Model calibration, redundancy, and performance gap**: OBBStacking trains a meta-learner to combine the results of multiple models, taking model calibration, redundancy, and performance differences into account.
- **Fusion of Oriented Bounding Boxes**: OBBStacking proposes a new bounding-box fusion method suitable for Oriented Bounding Boxes. The method parameterizes each bounding box by position, width, height, and orientation, and fuses each parameter separately.

With these improvements, OBBStacking helped win first place in the Challenge Track Fine-grained Object Recognition in High-Resolution Optical Images, and its performance improvement was verified by experiments on the DOTA and FAIR1M datasets.
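
The meta-learner idea described above can be sketched as a logistic regression over the member models' logits, fitted by minimizing the negative log-likelihood. This is only a minimal illustration, not the authors' implementation: the function names `fit_meta_learner` and `nll`, the plain gradient-descent optimizer, its hyperparameters, and the assumption that every candidate detection has exactly one logit per member model are all choices made for this sketch.

```python
import numpy as np


def sigmoid(x):
    """Logistic function 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + np.exp(-x))


def nll(Z, y, w, b):
    """Negative log-likelihood of binary labels y under sigmoid(Z @ w + b)."""
    p = sigmoid(Z @ w + b)
    eps = 1e-12  # guards against log(0)
    return -np.sum(y * np.log(p + eps) + (1.0 - y) * np.log(1.0 - p + eps))


def fit_meta_learner(Z, y, lr=0.1, steps=2000):
    """Fit weights w (one per member model) and intercept b.

    Z: (n, M) array, one row of member-model logits per candidate detection
       (assumed layout for this sketch).
    y: (n,) array of 0/1 labels (1 = the detection is correct).
    """
    n, M = Z.shape
    w = np.zeros(M)
    b = 0.0
    for _ in range(steps):
        p = sigmoid(Z @ w + b)
        grad = p - y                    # gradient of the NLL w.r.t. the logit
        w -= lr * (Z.T @ grad) / n      # averaged gradient step on the weights
        b -= lr * grad.mean()           # averaged gradient step on the intercept
    return w, b


if __name__ == "__main__":
    # Toy data: 3 hypothetical member models, the first being most informative.
    rng = np.random.default_rng(0)
    Z = rng.normal(size=(200, 3))
    y = (Z[:, 0] + 0.5 * Z[:, 1] > 0).astype(float)
    w, b = fit_meta_learner(Z, y)
    print("weights:", w, "intercept:", b)
```

A learned weight per model is what lets such a combiner down-weight a redundant or poorly calibrated member rather than averaging all scores equally.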

### Formula summary

- **Form of the meta-learner**:

  \[ \sigma_{\text{WA}}(z)=\sigma(zw + b) \]

  where \(z = [z_1,z_2,\dots,z_M]\in\mathbb{R}^{2\times M}\) is the logit output from \(M\) member models, \(\sigma(z)=\frac{1}{1+\exp(-z)}\) is the logistic function, and \(w\in\mathbb{R}^M\) and \(b\in\mathbb{R}\) are the weight and intercept parameters of the meta-learner, respectively.

- **Negative log-likelihood loss function**:

  \[ L=-\sum_{i = 1}^{n}\log(\sigma_{\text{WA}}(z_i)(y_i))=-\sum_{i = 1}^{n}\log(\sigma(z_iw + b)(y_i)) \]

- **Formula for Oriented Bounding Box fusion**:

  \[ o_{\text{fused}}^{(j)}=\frac{\sum_{p = 1}^{n}o_p^{(j)}s_p^*}{\sum_{p = 1}^{n}s_p^*},\quad j = 1,2,3,4 \]

  where \(s_p^*=\sigma(z_p^{(1)}w(l_p)+b)\), and \(l_p\) is the index of the source model.

- **Orientation parameter fusion**:

  \[ \theta_f=\frac{\sum_{p = 1}^{n}r(\theta_p,\theta_{MJ})s_p^*}{\sum_{p = 1}^{n}s_p^*}+\theta_{MJ} \]

  where \(r(\theta_1,\theta_{MJ})\) (the formula seems incomplete here).
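
The two fusion formulas can be illustrated with a short NumPy sketch. Several parts are assumptions rather than the paper's definitions: the source does not give the form of \(r\) or how \(\theta_{MJ}\) is chosen, so the caller passes `theta_ref` explicitly and `wrapped_residual` is a hypothetical stand-in for \(r\) (a signed difference wrapped to half a period). The box layout `[cx, cy, width, height, theta]`, the use of a single scalar logit per box, and the function name `fuse_obbs` are likewise illustrative.

```python
import numpy as np


def sigmoid(x):
    """Logistic function 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + np.exp(-x))


def wrapped_residual(theta, theta_ref, period=np.pi):
    """HYPOTHETICAL stand-in for r: signed angle difference wrapped to
    [-period/2, period/2). The paper's actual r is not given in the source."""
    return (theta - theta_ref + period / 2.0) % period - period / 2.0


def fuse_obbs(boxes, logits, src, w, b, theta_ref, residual=wrapped_residual):
    """Fuse n oriented boxes that are assumed to cover the same object.

    boxes:     (n, 5) rows of [cx, cy, width, height, theta] (assumed layout).
    logits:    (n,) one logit per box (assumed scalar per box).
    src:       (n,) integer index l_p of the model that produced each box.
    w, b:      meta-learner weights (one per model) and intercept.
    theta_ref: reference angle playing the role of theta_MJ (supplied by caller).
    """
    boxes = np.asarray(boxes, dtype=float)
    logits = np.asarray(logits, dtype=float)
    src = np.asarray(src, dtype=int)
    # s_p* = sigma(z_p * w(l_p) + b): calibrated weight of each box
    s = sigmoid(logits * np.asarray(w, dtype=float)[src] + b)
    # Weighted average of the four position/size parameters (j = 1..4)
    fused = (boxes[:, :4] * s[:, None]).sum(axis=0) / s.sum()
    # Weighted average of residuals around the reference angle, then shift back
    theta = (residual(boxes[:, 4], theta_ref) * s).sum() / s.sum() + theta_ref
    return np.append(fused, theta)


if __name__ == "__main__":
    boxes = [[0.0, 0.0, 2.0, 4.0, 0.1],
             [2.0, 2.0, 4.0, 6.0, 0.3]]
    print(fuse_obbs(boxes, logits=[0.0, 0.0], src=[0, 1],
                    w=[1.0, 1.0], b=0.0, theta_ref=0.1))
```

Averaging residuals around a reference angle, instead of raw angles, avoids the failure case where two nearly parallel boxes whose angles sit on opposite sides of the periodic boundary would average to a perpendicular orientation.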

OBBStacking: An Ensemble Method for Remote Sensing Object Detection
