Hybrid Classification-Regression Adaptive Loss for Dense Object Detection

Yanquan Huang, Liu Wei Zhen, Yun Hao, Mengyuan Zhang, Qingyao Wu, Zikun Deng, Xueming Liu, Hong Deng
2024-08-30
Abstract: For object detectors, enhancing model performance hinges on the ability to simultaneously consider inconsistencies across tasks and focus on difficult-to-train samples. Achieving this necessitates incorporating information from both the classification and regression tasks. However, prior work tends to either emphasize difficult-to-train samples within their respective tasks or simply compute classification scores with IoU, often leading to suboptimal model performance. In this paper, we propose a Hybrid Classification-Regression Adaptive Loss, termed HCRAL. Specifically, we introduce the Residual of Classification and IoU (RCI) module for cross-task supervision, addressing task inconsistencies, and the Conditioning Factor (CF) to focus on difficult-to-train samples within each task. Furthermore, we introduce a new strategy named Expanded Adaptive Training Sample Selection (EATSS) to provide additional samples that exhibit classification and regression inconsistencies. To validate the effectiveness of the proposed method, we conduct extensive experiments on COCO test-dev. Experimental evaluations demonstrate the superiority of our approach. Additionally, we design experiments that separately combine the classification and regression losses with regular loss functions in popular one-stage models, demonstrating improved performance.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
The paper addresses two problems in existing object detection models: the inconsistency between the classification and regression tasks, and the handling of difficult-to-train samples. Specifically:

1. **Inconsistency between classification and regression tasks**: Existing loss functions usually treat classification and regression separately, so the two tasks lack effective synergy. For example, some methods improve performance by using IoU (Intersection over Union) as a classification label, but they fail to focus on truly difficult-to-train samples, especially among samples with similar IoU values.
2. **Difficult-to-train samples**: Existing methods usually emphasize difficult samples in either the classification task or the regression task, ignoring information across tasks. For example, Focal Loss and GHM Loss focus mainly on difficult samples in classification, while Focal EIoU and Alpha IoU focus on difficult samples in regression. None of these methods fully considers the consistency between the two tasks.

To solve these problems, the paper proposes a new loss function, **Hybrid Classification-Regression Adaptive Loss (HCRAL)**, which improves model performance through the following modules:

- **RCI (Residual of Classification and IoU) module**: Used for cross-task supervision to address the inconsistency between classification and regression tasks. It is defined as: \[ RCI = s - iou + \alpha \] where \(s\) is the predicted classification score, \(iou\) is the Intersection over Union between the predicted bounding box and the ground-truth bounding box, and \(\alpha\) is an adjustment parameter.
- **CF (Conditioning Factor) module**: Used to focus on difficult-to-train samples in each task.
For the classification loss function, CF is defined as: \[ CF_{cls}(i)=\omega_i\times\beta_i \] where \(\omega_i\) is an adaptive matrix and \(\beta_i\) is the sample weight. For the regression loss function, CF is defined as: \[ CF_{reg}=t\cdot e^{R}\cdot IoU \]

In addition, the paper proposes a new positive and negative sample selection strategy, **Expanded Adaptive Training Sample Selection (EATSS)**, which provides additional training samples, especially those with a high IoU or a high classification score.

Through these improvements, HCRAL can better balance the classification and regression tasks and focus on difficult-to-train samples, thereby improving overall model performance. Experimental results show that HCRAL outperforms other existing loss functions on the COCO dataset.
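
As a rough illustration of the RCI formula above, the following Python sketch computes the IoU of one predicted box against one ground-truth box and then evaluates \(RCI = s - iou + \alpha\). It is not the authors' implementation: the function names `box_iou` and `rci`, the `(x1, y1, x2, y2)` box layout, and the value `alpha=0.5` are assumptions made for this example, and the summary does not specify how RCI enters the final loss.

```python
from typing import Sequence

# Assumed box layout for this sketch: (x1, y1, x2, y2) with x1 <= x2, y1 <= y2.
Box = Sequence[float]


def box_iou(pred: Box, gt: Box) -> float:
    """Intersection over Union of two axis-aligned boxes."""
    # Corners of the overlapping rectangle (may be empty).
    ix1, iy1 = max(pred[0], gt[0]), max(pred[1], gt[1])
    ix2, iy2 = min(pred[2], gt[2]), min(pred[3], gt[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_pred = (pred[2] - pred[0]) * (pred[3] - pred[1])
    area_gt = (gt[2] - gt[0]) * (gt[3] - gt[1])
    union = area_pred + area_gt - inter
    return inter / union if union > 0 else 0.0


def rci(score: float, iou: float, alpha: float) -> float:
    """Hypothetical helper: RCI = s - iou + alpha, as stated in the text."""
    return score - iou + alpha


# A confident classification (s = 0.9) paired with a poorly localized box.
pred_box = (0.0, 0.0, 2.0, 2.0)
gt_box = (1.0, 0.0, 3.0, 2.0)
iou = box_iou(pred_box, gt_box)  # overlap area 2, union area 6
value = rci(score=0.9, iou=iou, alpha=0.5)  # alpha chosen only for illustration
print(f"iou={iou:.4f}, rci={value:.4f}")
```

In this example the classification score is much higher than the IoU, so the residual \(s - iou\) is large; a sample whose score and IoU agree would give a residual near zero and an RCI near \(\alpha\).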