Abstract: Existing pseudo-label generation methods for point weakly supervised object detection are inadequate for low-data-volume and dense object detection tasks. We treat the generation of weakly supervised pseudo labels as the model's sparse output and propose Sparse Generation as a solution for making pseudo labels sparse. The method employs three processing stages (Mapping, Mask, Regression): it constructs dense tensors from the relationship between the data and the detector model, optimizes three of its parameters, and obtains a sparse tensor, thereby indirectly obtaining higher-quality pseudo labels and addressing the model's density problem at low data volume. Additionally, we propose perspective-based matching, which provides more rational pseudo boxes for instances that the prediction missed. Compared with the SOTA method, the experimental results on four datasets (MS COCO-val, RSOD, SIMD, Bullet-Hole) demonstrate a significant advantage.
What problem does this paper attempt to address?
This paper addresses problems with pseudo-label generation in Point Weakly Supervised Object Detection (PWSOD) for low-data-volume and dense object detection tasks. Specifically:
1. **Deficiencies of existing methods**:
- Existing pseudo-label generation methods perform poorly in low-data-volume and dense object detection tasks.
- Because bounding boxes cannot be obtained directly from weakly supervised annotations, labels usually have to be generated by an additional network or by the detector itself, which degrades model performance, especially when data volume is low.
- The number of generated pseudo-labels far exceeds the actual number of instances in the image, especially in low-data-volume and dense-instance detection tasks.
2. **Local-focusing problem**:
- When an additional network is used to generate pseudo-labels, its outputs are often dense sets that are prone to local focusing, which shows up as overlapping bounding boxes and missed instances.
- Repeatedly optimizing these inaccurate subsets may leave the algorithm's potential not fully exploited.
3. **Lack of direct regression to pseudo-labels**:
- Some previous works use matching/filtering mechanisms to select or assign pseudo-labels, but these methods lack direct learning and regression of pseudo-labels and cannot fully utilize the global information in all prediction results.
To solve these problems, the paper proposes the Sparse Generation method, which aims to improve the quality of pseudo-labels in the following ways:
- **Sparsifying pseudo-labels**: Through three processing stages (mapping, masking, regression), the dense pseudo-labels are converted into sparse outputs, thereby improving the quality of pseudo-labels.
- **Reducing local focusing**: By constructing a dense tensor graph covering the entire instance, the local-focusing problem is reduced.
- **Direct regression**: By directly regressing the pseudo-labels, more reasonable predictions are generated using limited data.
In addition, the paper proposes a perspective-based matching method that provides more reasonable pseudo-bounding boxes for instances missed by the prediction. Experimental results show that Sparse Generation significantly outperforms existing state-of-the-art methods on multiple datasets.
### Formula summary
1. **Step function in the mapping stage**:
\[
S(x_i, y_i, l_i) =
\begin{cases}
0 & \text{if } d_i / l_i > 1 \\
0.1 & \text{if } 0.75 < d_i / l_i \leq 1 \\
0.3 & \text{if } 0.5 < d_i / l_i \leq 0.75 \\
0.6 & \text{if } 0.25 < d_i / l_i \leq 0.5 \\
0.8 & \text{if } 0 < d_i / l_i \leq 0.25 \\
1 & \text{if } d_i = 0
\end{cases}
\]
where \( d_i=\sqrt{(x_i - x_c)^2+(y_i - y_c)^2} \) and \((x_c, y_c)\) is the center position of the pseudo-bounding box.
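As a rough illustration, the step function can be written as a small Python function. This is a minimal sketch and not the authors' implementation: the name `step_score` and the explicit `xc`, `yc` arguments are assumptions made for the example, since the formula leaves the box center implicit.

```python
import math


def step_score(x, y, xc, yc, l):
    """Piecewise score S for a point (x, y).

    (xc, yc) is the pseudo-box center and l is the reference length l_i.
    Function and argument names are hypothetical, chosen for this sketch.
    """
    d = math.hypot(x - xc, y - yc)  # Euclidean distance d_i to the center
    if d == 0:
        return 1.0
    ratio = d / l
    if ratio <= 0.25:
        return 0.8
    if ratio <= 0.5:
        return 0.6
    if ratio <= 0.75:
        return 0.3
    if ratio <= 1.0:
        return 0.1
    return 0.0  # points farther than l from the center contribute nothing
```

For example, with center `(5, 5)` and `l = 4`, the center itself scores 1.0 and a point 4 pixels away (ratio exactly 1) scores 0.1; anything farther scores 0.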
2. **Definition of two-dimensional tensor**:
\[
I_T =
\begin{cases}
\{S(x_i, y_i, l_i)\mid l_i = w_i\}, & \text{if } w_i\leq h_i \\
\{S(x_i, y_i, l_i)\mid l_i = h_i\}, & \text{if } w_i > h_i
\end{cases}
\]
3. **Calculation of heat-distribution tensor**:
\[
S_T=\sum_{i = 0}^{n}I_T'
\]
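The tensor definition and the summation can be sketched together in plain Python. Several things here are assumptions, not statements from the paper: this summary does not define \(I_T'\), so the sketch assumes it is \(I_T\) evaluated on an image-sized grid; boxes are assumed to be given as `(xc, yc, w, h)` tuples; and the names `box_tensor` and `heat_tensor` are hypothetical.

```python
import math


def step_value(d, l):
    """Step function S expressed in terms of distance d and reference length l."""
    if d == 0:
        return 1.0
    ratio = d / l
    for upper, score in ((0.25, 0.8), (0.5, 0.6), (0.75, 0.3), (1.0, 0.1)):
        if ratio <= upper:
            return score
    return 0.0


def box_tensor(xc, yc, w, h, height, width):
    """2-D tensor for one pseudo box, evaluated on a height x width grid.

    l_i is the shorter side of the box: w if w <= h, otherwise h.
    Evaluating on the full image grid is an assumption of this sketch.
    """
    l = w if w <= h else h
    return [
        [step_value(math.hypot(x - xc, y - yc), l) for x in range(width)]
        for y in range(height)
    ]


def heat_tensor(boxes, height, width):
    """Element-wise sum of the per-box tensors (the heat distribution S_T)."""
    heat = [[0.0] * width for _ in range(height)]
    for xc, yc, w, h in boxes:
        tensor = box_tensor(xc, yc, w, h, height, width)
        for y in range(height):
            for x in range(width):
                heat[y][x] += tensor[y][x]
    return heat


# Two overlapping pseudo boxes that share the center (2, 2) on a 5x5 grid.
heat = heat_tensor([(2, 2, 4, 6), (2, 2, 4, 4)], height=5, width=5)
```

Under these assumptions, overlapping pseudo boxes reinforce each other near a shared center, so the summed grid peaks where many dense predictions agree.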
4. **Application of mask tensor**:
\[
A_{MT}=M