Multi-Label and Evolvable Dataset Preparation for Web-Based Object Detection

Shucheng Li, Jingzhou Zhu, Boyu Chang, Hao Wu, Fengyuan Xu, Sheng Zhong
DOI: https://doi.org/10.1145/3695465
IF: 4.157
2024-01-01
ACM Transactions on Knowledge Discovery from Data
Abstract: This paper focuses on the emerging field of web-based object detection, which has gained considerable attention because it can use large amounts of web data for training, eliminating the need for labor-intensive manual annotation. However, the noisy and ever-evolving nature of web data makes it difficult to prepare high-quality datasets for web-based object detection. To address these challenges, we propose a fully automatic dataset preparation method. The method incorporates a hierarchical clustering module that assigns multiple precise labels to each image; this module is based on our observation that web image data exhibits different distributions at varying granularities. Furthermore, an evolutionary relabeling module ensures that both the prepared dataset and the trained detection models adapt to the ever-evolving web data. Extensive experiments demonstrate that our method outperforms other web-based methods and achieves performance comparable to that achieved with manually labeled benchmark datasets.