Instance-Dictionary Learning for Open-World Object Detection in Autonomous Driving Scenarios

Zeyu Ma, Ziqiang Zheng, Jiwei Wei, Yang Yang, Heng Tao Shen
DOI: https://doi.org/10.1109/tcsvt.2023.3322465
IF: 5.859
2023-01-01
IEEE Transactions on Circuits and Systems for Video Technology
Abstract: This paper addresses open-world object detection (OWOD) in autonomous driving scenarios, which aims to detect objects under domain-agnostic and category-agnostic settings simultaneously. Existing OWOD algorithms either detect pre-defined object categories under varying conditions (domain-agnostic) or perform zero-shot object detection (category-agnostic), but not both. The knowledge gap between seen and unseen object categories poses a challenge for models trained with supervision from seen categories only. The domain difference across scenarios poses a further challenge in aligning observations with different appearances. To address these two challenges simultaneously, we propose Instance Dictionary Learning (IDL) for more robust and accurate OWOD. We first design a pre-training procedure that builds mappings between region features and category semantic embeddings through instance contrastive learning. The joint vision-semantic space is formulated through a fine-grained instance-level “Dictionary”, which expresses region-category correspondences and helps link seen and unseen object categories. A domain discrimination component is then integrated into the subsequent training procedure to extract domain-invariant feature representations. The proposed IDL can detect unseen categories from unseen domains without any bounding box annotations, with no obvious performance drop on seen categories. Comprehensive experiments show that our method achieves new state-of-the-art OWOD performance over previous algorithms.
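The abstract does not give the form of the instance contrastive objective, so the following is only a minimal sketch of one common way to align region features with category semantic embeddings: a softmax cross-entropy over temperature-scaled cosine similarities. The function names, the temperature value, and the toy vectors are all illustrative assumptions, and the sketch assumes region features have already been projected to the same dimension as the category embeddings.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def region_category_contrastive_loss(region_feats, category_embs, labels,
                                     temperature=0.1):
    """Mean cross-entropy pulling each region toward its category embedding.

    Hypothetical helper, not the paper's loss: each region feature is scored
    against every category embedding, and the matching category is treated
    as the positive while all other categories act as negatives.
    """
    total = 0.0
    for feat, label in zip(region_feats, labels):
        logits = [cosine(feat, emb) / temperature for emb in category_embs]
        # Numerically stable log-sum-exp over all categories.
        peak = max(logits)
        log_denom = peak + math.log(sum(math.exp(l - peak) for l in logits))
        total += log_denom - logits[label]
    return total / len(region_feats)

def nearest_category(feat, category_embs):
    """Index of the category embedding most similar to a region feature."""
    scores = [cosine(feat, emb) for emb in category_embs]
    return scores.index(max(scores))

# Toy data (assumed): two categories in a 2-D joint space, two regions.
category_embs = [[1.0, 0.0], [0.0, 1.0]]
region_feats = [[0.9, 0.1], [0.2, 0.8]]

aligned = region_category_contrastive_loss(region_feats, category_embs, [0, 1])
swapped = region_category_contrastive_loss(region_feats, category_embs, [1, 0])
predicted = [nearest_category(f, category_embs) for f in region_feats]
```

In this toy setup the loss is lower when the labels match the nearest embeddings than when they are swapped. Because `nearest_category` needs only an embedding and no box annotations, the same lookup can in principle be applied to a category that was never seen during training, which is the role the abstract assigns to the joint vision-semantic space.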
engineering, electrical & electronic