Abstract: This paper introduces the point-axis representation for oriented object detection, emphasizing its flexibility and geometrically intuitive nature with two key components: points and axes. 1) Points delineate the spatial extent and contours of objects, providing detailed shape descriptions. 2) Axes define the primary directionalities of objects, providing essential orientation cues crucial for precise detection. The point-axis representation decouples location and rotation, addressing the loss discontinuity issues commonly encountered in traditional bounding box-based approaches. For effective optimization without introducing additional annotations, we propose the max-projection loss to supervise point set learning and the cross-axis loss for robust axis representation learning. Further, leveraging this representation, we present the Oriented DETR model, seamlessly integrating the DETR framework for precise point-axis prediction and end-to-end detection. Experimental results demonstrate significant performance improvements in oriented object detection tasks.
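
The loss discontinuity mentioned in the abstract can be illustrated with a small sketch. It is not taken from the paper: the `(cx, cy, w, h, theta)` parameterization, the function names, and the corner-set distance are assumptions chosen only to show that two parameter vectors can describe the same rectangle while a parameter-wise L1 distance between them is large.

```python
import math

def obb_corners(cx, cy, w, h, theta_deg):
    """Corners of a rotated box given as (cx, cy, w, h, theta in degrees)."""
    t = math.radians(theta_deg)
    ux, uy = math.cos(t), math.sin(t)   # unit vector along the width side
    vx, vy = -uy, ux                    # unit vector along the height side
    return [
        (cx + sx * w / 2 * ux + sy * h / 2 * vx,
         cy + sx * w / 2 * uy + sy * h / 2 * vy)
        for sx, sy in ((1, 1), (1, -1), (-1, -1), (-1, 1))
    ]

def param_l1(a, b):
    """L1 distance between two raw parameter vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))

def corner_set_distance(a, b):
    """Mean nearest-corner distance; does not depend on corner ordering."""
    ca, cb = obb_corners(*a), obb_corners(*b)
    return sum(min(math.dist(p, q) for q in cb) for p in ca) / 4

# Two encodings of the same rectangle: width and height swapped, angle + 90.
box_a = (0.0, 0.0, 10.0, 4.0, 0.0)
box_b = (0.0, 0.0, 4.0, 10.0, 90.0)

print(param_l1(box_a, box_b))             # large, although the boxes coincide
print(corner_set_distance(box_a, box_b))  # effectively zero
```

A regression loss defined on the raw parameters penalizes `box_b` heavily even though it covers exactly the same region as `box_a`, which is the kind of jump a representation that separates location from rotation is meant to avoid.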

What problem does this paper attempt to address?

The paper addresses the loss discontinuity problem in **Oriented Object Detection**. In traditional methods based on rotated bounding boxes, abrupt changes in the rotation angle or in the width/height definition of non-axis-aligned objects make the loss function discontinuous, which harms training stability and detection performance.

### Problem Background

1. **Limitations of Traditional Methods**:
   - **Rotated Bounding Box Representation**: It can represent objects in any direction, but in some cases (such as when the length and width are nearly equal) the angle can flip between θ and θ ± 90°, which makes the loss function discontinuous.
   - **Quadrilateral Representation**: The rotated bounding box is defined by the circumscribed horizontal box and the offsets of four vertices. When the object is close to the horizontal direction, the vertex regression order becomes ambiguous.
   - **Point-Set Representation**: It can capture the detailed position of the target, but it often ignores the main directionality of the object, making it difficult to accurately detect objects with complex shapes.
2. **Challenges of Existing Methods**:
   - **Loss Discontinuity**: Because of angle periodicity, traditional methods exhibit angle jumps in some cases, which makes optimization difficult.
   - **Representation Ambiguity**: Some methods define boundaries ambiguously, especially for approximately circular or square objects.

### Solutions Proposed in the Paper

To solve the above problems, the paper introduces a new **Point-Axis Representation**. Its core idea is to decouple the position and direction of the object, thus avoiding the loss discontinuity of traditional methods. Specifically:

1. **Advantages of Point-Axis Representation**:
   - **Points for Shape Description**: Points describe the spatial extent and contour of the object, providing a detailed shape representation that is especially suitable for irregularly shaped objects.
   - **Axes for Direction Hints**: Axes define the main directionality of the object, providing key orientation information that supports accurate detection.
2. **Innovative Loss Functions**:
   - **Max-Projection Loss**: It supervises point-set learning and drives the points to converge on the object without explicit keypoint annotations.
   - **Cross-Axis Loss**: By discretizing the angle and applying smoothing, it generates a four-peak label encoding that makes the axis representation more robust.
3. **Model Architecture**:
   - **Oriented DETR Model**: Built on the DETR framework, it introduces conditional point queries and a point-detection decoder, captures the relationships between points through multi-layer self-attention, and performs iterative refinement.

### Experimental Results

The experimental results show that the method significantly improves oriented object detection performance on multiple datasets, especially for objects with complex shapes and directions.

In conclusion, the paper addresses the loss discontinuity problem in oriented object detection and improves detection accuracy and robustness by introducing the point-axis representation and the corresponding loss functions.
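
The four-peak label encoding described for the cross-axis loss can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: it assumes the four peaks sit 90° apart (the four directions of two perpendicular axes), and the function name `four_peak_label`, the 360-bin discretization, and the Gaussian smoothing width are hypothetical choices.

```python
import math

def four_peak_label(theta_deg, num_bins=360, sigma_deg=4.0):
    """Smoothed angle label with peaks at theta + k * 90 degrees, k = 0..3.

    num_bins and sigma_deg are illustrative values, not the paper's settings.
    """
    bin_width = 360.0 / num_bins
    label = []
    for i in range(num_bins):
        center = i * bin_width
        best = 0.0
        for k in range(4):
            peak = (theta_deg + 90.0 * k) % 360.0
            diff = abs(center - peak)
            diff = min(diff, 360.0 - diff)  # circular distance in degrees
            # Gaussian smoothing around each peak; keep the strongest response.
            best = max(best, math.exp(-0.5 * (diff / sigma_deg) ** 2))
        label.append(best)
    return label

label = four_peak_label(30.0)
peaks = [i for i, v in enumerate(label) if v == 1.0]
print(peaks)
```

Under these assumptions the target is identical for θ and θ + 90°, so a near-square object whose annotated angle flips by 90° yields the same label instead of a jump in the loss.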

Projecting Points to Axes: Oriented Object Detection via Point-Axis Representation
