Abstract: Bounding box regression is a crucial step in object detection, directly affecting the localization performance of the detected objects. Especially in small object detection, a well-designed bounding box regression loss can significantly alleviate the problem of missing small objects. However, the broad family of Intersection over Union (IoU) losses, referred to here as Broad IoU losses (BIoU losses), has two major problems in bounding box regression: (i) BIoU losses provide little additional fitting information as predicted boxes approach the target box, resulting in slow convergence and inaccurate regression results; (ii) most localization loss functions do not fully utilize the spatial information of the target, namely the target's foreground area, during the fitting process. To overcome these issues, this paper proposes the Corner-point and Foreground-area IoU loss (CFIoU loss) function. First, we use the normalized corner-point distance between the two boxes instead of the normalized center-point distance used in the BIoU losses, which effectively suppresses the problem of BIoU losses degrading to the IoU loss when the two boxes are close. Second, we add adaptive target information to the loss function to provide richer target information for optimizing the bounding box regression process, especially for small object detection. Finally, we conducted simulation experiments on bounding box regression to validate our hypothesis. We also quantitatively compared the current mainstream BIoU losses with our proposed CFIoU loss on the public small object datasets VisDrone2019 and SODA-D, using the latest anchor-based YOLOv5 and anchor-free YOLOv8 object detection algorithms.
The experimental results demonstrate that YOLOv5s (+3.12% Recall, +2.73% mAP@0.5, and +1.91% mAP@0.5:0.95) and YOLOv8s (+1.72% Recall and +0.60% mAP@0.5), both incorporating the CFIoU loss, achieved the highest performance improvement on the VisDrone2019 test set. Similarly, YOLOv5s (+6% Recall, +13.08% mAP@0.5, and +14.29% mAP@0.5:0.95) and YOLOv8s (+3.36% Recall, +3.66% mAP@0.5, and +4.05% mAP@0.5:0.95), both incorporating the CFIoU loss, achieved the highest performance improvement on the SODA-D test set. These results indicate the effectiveness and superiority of the CFIoU loss in small object detection. Additionally, we conducted comparative experiments by integrating the CFIoU loss and the BIoU losses into the SSD algorithm, which is not well suited to small object detection. The results show that SSD with the CFIoU loss achieved the highest improvement in the AP (+5.59%) and AP75 (+5.37%) metrics, indicating that the CFIoU loss can also improve the performance of algorithms that are not well suited to small object detection.
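
The degradation described in the abstract can be illustrated with a small numeric sketch. It is not the CFIoU formula, which the abstract does not give. The center-point term below follows the well-known DIoU penalty (squared center distance divided by the squared diagonal of the smallest enclosing box), while `corner_penalty` is a hypothetical stand-in that applies the same normalization to the top-left and bottom-right corners. Boxes are assumed to be `(x1, y1, x2, y2)` tuples.

```python
def iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


def enclosing_diag_sq(a, b):
    """Squared diagonal of the smallest box enclosing both a and b."""
    cw = max(a[2], b[2]) - min(a[0], b[0])
    ch = max(a[3], b[3]) - min(a[1], b[1])
    return cw * cw + ch * ch


def center_penalty(a, b):
    """DIoU-style term: normalized squared distance between box centers."""
    dx = (a[0] + a[2]) / 2 - (b[0] + b[2]) / 2
    dy = (a[1] + a[3]) / 2 - (b[1] + b[3]) / 2
    return (dx * dx + dy * dy) / enclosing_diag_sq(a, b)


def corner_penalty(a, b):
    """Hypothetical corner-point term (an assumption, not the paper's
    definition): mean normalized squared distance between the top-left
    corners and between the bottom-right corners."""
    tl = (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2
    br = (a[2] - b[2]) ** 2 + (a[3] - b[3]) ** 2
    return (tl + br) / (2 * enclosing_diag_sq(a, b))


target = (0.0, 0.0, 4.0, 4.0)
pred = (1.0, 1.0, 3.0, 3.0)  # same center as the target, but too small

print("IoU:           ", iou(pred, target))
print("center penalty:", center_penalty(pred, target))
print("corner penalty:", corner_penalty(pred, target))
```

For these concentric boxes the center-point term is exactly zero, so a loss of the form 1 - IoU + center penalty reduces to the plain IoU loss, whereas the corner-based term stays positive until the corners themselves coincide.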

DeIoU: Towards Distinguishable Box Prediction in Densely Packed Object Detection

Distance-IoU Loss: Faster and Better Learning for Bounding Box Regression

Unified-IoU: For High-Quality Object Detection

Dive Deeper into Box for Object Detection

Diag-IoU Loss for Object Detection

Robust Bounding Box Regression for Small Object Detection

IoU-aware Single-stage Object Detector for Accurate Localization

3DIoUMatch: Leveraging IoU Prediction for Semi-Supervised 3D Object Detection

3D IoU-Net: IoU Guided 3D Object Detector for Point Clouds

UnitBox: An Advanced Object Detection Network

Manhattan-distance IOU loss for fast and accurate bounding box regression and object detection

Fused-IoU Loss: Efficient Learning for Accurate Bounding Box Regression

N-IoU: better IoU-based bounding box regression loss for object detection

Rethinking IoU-based Optimization for Single-stage 3D Object Detection

MPDIoU: A Loss for Efficient and Accurate Bounding Box Regression

Optimization for Anchor-Free Object Detection via Scale-Independent GIoU Loss

Corner-Point and Foreground-Area IoU Loss: Better Localization of Small Objects in Bounding Box Regression

Autonomous intersection over union (IoU) loss: adaptive dynamic non-monotonic focal IoU loss

Inner-IoU: More Effective Intersection over Union Loss with Auxiliary Bounding Box

Enhancing Geometric Factors in Model Learning and Inference for Object Detection and Instance Segmentation