PointOBB-v2: Towards Simpler, Faster, and Stronger Single Point Supervised Oriented Object Detection

Botao Ren, Xue Yang, Yi Yu, Junwei Luo, Zhidong Deng
2024-10-11
Abstract: Single point supervised oriented object detection has gained attention and made initial progress within the community. Diverse from those approaches relying on one-shot samples or powerful pretrained models (e.g. SAM), PointOBB has shown promise due to its prior-free feature. In this paper, we propose PointOBB-v2, a simpler, faster, and stronger method to generate pseudo rotated boxes from points without relying on any other prior. Specifically, we first generate a Class Probability Map (CPM) by training the network with non-uniform positive and negative sampling. We show that the CPM is able to learn the approximate object regions and their contours. Then, Principal Component Analysis (PCA) is applied to accurately estimate the orientation and the boundary of objects. By further incorporating a separation mechanism, we resolve the confusion caused by the overlapping on the CPM, enabling its operation in high-density scenarios. Extensive comparisons demonstrate that our method achieves a training speed 15.58x faster and an accuracy improvement of 11.60%/25.15%/21.19% on the DOTA-v1.0/v1.5/v2.0 datasets compared to the previous state-of-the-art, PointOBB. This significantly advances the cutting edge of single point supervised oriented detection in the modular track.
Computer Vision and Pattern Recognition, Artificial Intelligence
What problem does this paper attempt to address?
The paper addresses **rotated (oriented) object detection under single-point supervision**, with a focus on efficiency and accuracy for small, densely arranged objects. The authors point out three limitations of existing methods such as PointOBB:

1. **Slow speed**: Pseudo-label generation is very time-consuming, about 7-8 times slower than the subsequent detector training.
2. **High memory consumption**: Multiple view transformations require a large amount of GPU memory during training, and out-of-memory errors are likely in scenes with high object density.
3. **Poor flexibility**: The method depends on predefined prior knowledge, which limits generalization and makes it hard to adapt to different datasets.

To address these problems, the authors propose **PointOBB-v2**, a simpler, faster, and stronger method for generating pseudo rotated boxes from point annotations. It combines a new class probability map (CPM) generation method, principal component analysis (PCA), and a separation mechanism, which together improve the speed and accuracy of pseudo-label generation and reduce memory usage.

### Specific improvement points:

1. **Simplify the process**: Remove the traditional teacher-student structure, making the pipeline more concise and efficient.
2. **Generate pseudo-labels quickly**: Use non-uniform sampling and PCA to estimate the orientation and boundaries of objects quickly and accurately.
3. **Reduce memory consumption**: Avoid multiple view transformations and consistency constraints, so that the model can run on most GPUs without running out of memory.
4. **Handle dense objects**: Introduce a separation mechanism to resolve confusion between overlapping objects in high-density scenes.

### Experimental results:

PointOBB-v2 improves mAP over PointOBB on DOTA-v1.0, DOTA-v1.5, and DOTA-v2.0. For example, on DOTA-v1.0 with Rotated FCOS as the detector, mAP rises from 30.08% to 41.68%, a gain of 11.60 percentage points. Pseudo-label generation is also 15.58 times faster, dropping from 22.28 hours to 1.43 hours.

Overall, PointOBB-v2 improves both the efficiency and the accuracy of rotated object detection under single-point supervision, and is especially suited to small, densely arranged objects.
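
The PCA step mentioned above can be illustrated with a minimal sketch. It assumes a single object's probability map is already available as a 2-D array, and it is not the authors' implementation: the function names `pca_oriented_box` and `make_demo_map`, the fixed `threshold`, and the way the box extent is read from the projected pixels are all hypothetical choices made for illustration. The sketch fits a rotated box by taking the probability-weighted covariance of the high-probability pixel coordinates and using its eigenvectors as the box axes.

```python
import numpy as np


def pca_oriented_box(prob_map, threshold=0.5):
    """Fit a rotated box to the high-probability region of a 2-D map.

    Hypothetical helper, not the paper's exact procedure.
    Returns (cx, cy, w, h, angle); angle is in radians in [-pi/2, pi/2),
    measured in image coordinates (x to the right, y downward).
    """
    ys, xs = np.nonzero(prob_map >= threshold)
    if xs.size < 2:
        raise ValueError("need at least two pixels above threshold")

    weights = prob_map[ys, xs].astype(float)
    pts = np.stack([xs, ys], axis=1).astype(float)

    # Probability-weighted mean and covariance of the pixel coordinates.
    mean = np.average(pts, axis=0, weights=weights)
    centered = pts - mean
    cov = (centered * weights[:, None]).T @ centered / weights.sum()

    # eigh returns eigenvalues in ascending order, so the last
    # eigenvector is the direction of largest variance (the long side).
    _, eigvecs = np.linalg.eigh(cov)
    major, minor = eigvecs[:, 1], eigvecs[:, 0]

    # Box extent: range of the pixels projected onto each axis.
    proj_major = centered @ major
    proj_minor = centered @ minor
    w = proj_major.max() - proj_major.min()
    h = proj_minor.max() - proj_minor.min()

    # Box center: midpoint of the projected ranges, mapped back to pixels.
    center = (mean
              + major * (proj_major.max() + proj_major.min()) / 2
              + minor * (proj_minor.max() + proj_minor.min()) / 2)

    # Eigenvectors have arbitrary sign, so fold the angle into [-pi/2, pi/2).
    angle = np.arctan2(major[1], major[0])
    angle = (angle + np.pi / 2) % np.pi - np.pi / 2
    return center[0], center[1], w, h, angle


def make_demo_map(size=100, length=40.0, width=10.0, theta=np.pi / 6):
    """Synthetic stand-in for a probability map: one filled rotated rectangle."""
    ys, xs = np.mgrid[0:size, 0:size]
    dx, dy = xs - size / 2, ys - size / 2
    u = dx * np.cos(theta) + dy * np.sin(theta)
    v = -dx * np.sin(theta) + dy * np.cos(theta)
    return ((np.abs(u) <= length / 2) & (np.abs(v) <= width / 2)).astype(float)


cx, cy, w, h, angle = pca_oriented_box(make_demo_map())
```

On the synthetic rectangle, the recovered angle and side lengths should be close to the values used to draw it. A real probability map would be noisier and may contain overlapping objects, which is the case the separation mechanism is meant to handle; this sketch does not attempt that.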