Multi-dataset Detection with Transformers

Bo Ke, Ruizhi Qiao, Xing Sun
DOI: https://doi.org/10.1007/s11263-024-01985-0
IF: 13.369
2024-01-31
International Journal of Computer Vision
Abstract: Learning a unified model from multiple datasets is very challenging. In this paper, we propose a multi-dataset detector using the transformer (MDT). To enhance the effectiveness of the fusion of multiple datasets, we propose alternative learning to suppress the noisy data. To speed up the training of big data, we use scale shifting to save computational effort. Experiments on OpenImages, COCO, and Mapillary datasets show that our approach can significantly accelerate training while improving performance on multiple datasets. In the Robust Vision Challenge 2022, our solution won 1st place on the object detection track.
computer science, artificial intelligence