Abstract: For object detectors, improving model performance hinges on the ability to simultaneously consider inconsistencies across tasks and focus on difficult-to-train samples. Achieving this requires incorporating information from both the classification and regression tasks. However, prior work tends either to emphasize difficult-to-train samples within their respective tasks or simply to compute classification scores with IoU, often leading to suboptimal model performance. In this paper, we propose a Hybrid Classification-Regression Adaptive Loss, termed HCRAL. Specifically, we introduce the Residual of Classification and IoU (RCI) module for cross-task supervision, addressing task inconsistencies, and the Conditioning Factor (CF) to focus on difficult-to-train samples within each task. Furthermore, we introduce a new strategy named Expanded Adaptive Training Sample Selection (EATSS) to provide additional samples that exhibit classification and regression inconsistencies. To validate the effectiveness of the proposed method, we conduct extensive experiments on COCO test-dev. Experimental evaluations demonstrate the superiority of our approach. Additionally, we design experiments that separately combine the classification and regression losses with regular loss functions in popular one-stage models, demonstrating improved performance.

What problem does this paper attempt to address?

The paper addresses two main problems in existing object detection models: the inconsistency between the classification and regression tasks, and the handling of difficult-to-train samples. Specifically:

1. **Inconsistency between classification and regression tasks**: Existing loss functions usually treat classification and regression separately, so the two tasks lack effective synergy. For example, some methods improve performance by using IoU (Intersection over Union) as a classification label, but they fail to focus on truly difficult-to-train samples, especially when samples have similar IoU values.
2. **Problem of difficult-to-train samples**: Existing methods often emphasize difficult samples in only one of the two tasks, ignoring information across tasks. For example, Focal Loss and GHM Loss mainly focus on difficult samples in the classification task, while Focal EIoU and Alpha IoU focus on difficult samples in the regression task. None of these methods fully considers the consistency between classification and regression.

To solve these problems, the paper proposes a new loss function, **Hybrid Classification-Regression Adaptive Loss (HCRAL)**, which improves model performance through the following modules:

- **RCI (Residual of Classification and IoU) module**: Used for cross-task supervision to address the inconsistency between classification and regression tasks. The formula is:
  \[ RCI = s - iou + \alpha \]
  where \(s\) is the predicted classification score, \(iou\) is the IoU between the predicted bounding box and the ground-truth bounding box, and \(\alpha\) is an adjustment parameter.
- **CF (Conditioning Factor) module**: Used to focus on difficult-to-train samples in each task. For the classification loss function, CF is defined as:
  \[ CF_{cls}(i) = \omega_i \times \beta_i \]
  where \(\omega_i\) is an adaptive matrix and \(\beta_i\) is the sample weight. For the regression loss function, CF is defined as:
  \[ CF_{reg} = t \cdot e^{R} \cdot IoU \]

In addition, the paper proposes a new positive and negative sample selection strategy, **Expanded Adaptive Training Sample Selection (EATSS)**, to provide additional training samples, especially those with a high IoU or a high classification score.

Through these improvements, HCRAL can better balance the classification and regression tasks and focus on difficult-to-train samples, thereby improving the overall performance of the model. Experimental results show that HCRAL outperforms other existing loss functions on the COCO dataset.
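As a rough illustration of the RCI and CF formulas quoted above, the following Python sketch evaluates them for a single prediction. It is not the paper's implementation: the function names, the `(x1, y1, x2, y2)` box format, and the example value of `alpha` are assumptions made here, `omega_i` and `beta_i` are treated as plain scalars, and `t` and `R` are passed in as ordinary numbers because the summary does not define them.

```python
import math


def iou(box_a, box_b):
    """IoU of two axis-aligned boxes in (x1, y1, x2, y2) format (assumed format)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap region, clamped at zero for disjoint boxes.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def rci(score, iou_value, alpha):
    """RCI = s - iou + alpha, as written in the summary."""
    return score - iou_value + alpha


def cf_cls(omega_i, beta_i):
    """CF_cls(i) = omega_i * beta_i (both treated as scalars here, an assumption)."""
    return omega_i * beta_i


def cf_reg(t, r, iou_value):
    """CF_reg = t * e^R * IoU; t and R are undefined in the summary, so they are inputs."""
    return t * math.exp(r) * iou_value


if __name__ == "__main__":
    pred_box = (0.0, 0.0, 2.0, 2.0)   # hypothetical predicted box
    gt_box = (0.0, 0.0, 2.0, 4.0)     # hypothetical ground-truth box
    overlap = iou(pred_box, gt_box)
    # A confident score paired with a mediocre IoU yields a large residual,
    # which is the kind of classification/regression inconsistency RCI targets.
    print("IoU:", overlap)
    print("RCI:", rci(score=0.9, iou_value=overlap, alpha=0.1))  # alpha is illustrative
    print("CF_reg:", cf_reg(t=1.0, r=0.0, iou_value=overlap))
```

A residual near `alpha` means the classification score and the localization quality agree, while a residual far from `alpha` in either direction flags a sample where the two tasks disagree.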

Hybrid Classification-Regression Adaptive Loss for Dense Object Detection

Searching Parameterized AP Loss for Object Detection

Decouple and Align Classification and Regression in One-Stage Object Detection

Revisiting the Loss Weight Adjustment in Object Detection

Focal Loss for Dense Object Detection

IoU-Adaptive Deformable R-CNN: Make Full Use of IoU for Multi-Class Object Detection in Remote Sensing Imagery

Adaptive Class Suppression Loss for Long-Tail Object Detection

Loss Function Discovery for Object Detection Via Convergence-Simulation Driven Search

Decoupled Classification Refinement: Hard False Positive Suppression for Object Detection

Feature Pyramid Reconfiguration with Consistent Loss for Object Detection

Near-duplicated Loss for Accurate Object Localization

Generalized Focal Loss: Learning Qualified and Distributed Bounding Boxes for Dense Object Detection

An Effective and Lightweight Hybrid Network for Object Detection in Remote Sensing Images

Improving Oriented Object Detection by Scene Classification and Task-Aligned Focal Loss

RESC: REfine the SCore with Adaptive Transformer Head for End-to-end Object Detection

Dynamic R-CNN: Towards High Quality Object Detection via Dynamic Training

A Refined Hybrid Network for Object Detection in Aerial Images

A Semantic Consistency Feature Alignment Object Detection Model Based on Mixed-Class Distribution Metrics

Disentangle Your Dense Object Detector

Dense Object Detection Based on De-Homogenized Queries