CDTD: A Large-Scale Cross-Domain Benchmark for Instance-Level Image-to-Image Translation and Domain Adaptive Object Detection.

Shen Zhiqiang,Huang Mingyang,Shi Jianping,Liu Zechun,Maheshwari Harsh,Zheng Yutong,Xue Xiangyang,Savvides Marios,Huang Thomas S.
DOI: https://doi.org/10.1007/s11263-020-01394-z
IF: 13.369
2020-01-01
International Journal of Computer Vision
Abstract:Cross-domain visual problems, such as image-to-image translation and domain adaptive object detection, have attracted increasing attentions in the last few years, and also become new rising and challenging directions for the computer vision community. Recently, despite enormous efforts of the field in data collection, there are still few datasets covering the instance-level image-to-image translation and domain adaptive object detection tasks simultaneously. In this work, we introduce a large-scale cross-domain benchmark CDTD (contains 155,529 high-resolution natural images across four different modalities with object bounding box annotations. A summary of the entire dataset is provided in the following sections. Dataset is available at: http://zhiqiangshen.com/projects/INIT/index.html .) for the new instance-level translation and object detection tasks. We provide comprehensive baseline results of the benchmark on both of these two tasks. Moreover, we proposed a novel instance-level image-to-image translation approach called INIT and a gradient detach method for the domain adaptive object detection to harvest and exert dataset’s function of the instance level annotations across different domains.
What problem does this paper attempt to address?