Cross-domain Constrained Network for Zero-shot Object Detection

Wen Lv, Hongbo Shi, Shuai Tan, Bing Song, Yang Tao
DOI: https://doi.org/10.21203/rs.3.rs-2208626/v1
2022-01-01
Abstract: Zero-shot object detection (ZSD) aims to recognize and locate unseen objects without training samples. Despite rapid progress in ZSD, most existing algorithms learn a mapping between visual features and the semantic space from seen classes and then apply it to unseen classes, without considering the possible connections of unseen classes during inference. In addition, because different domains have heterogeneous characteristics (category attributes and distribution variations), relying solely on the network's mapping between visual and semantic data leads to inference bias. To address these issues, we propose a novel cross-domain constrained network for ZSD, named constraint-ZSD. Specifically, we build a target association strategy that associates seen instances in order to learn unseen classes, which guarantees that at least one unseen object is associated with a similar seen object during training. Then, we model a cross-domain constrained mechanism that constructs a relational knowledge graph and explicitly constrains the consistency of category correlations across the visual and semantic domains. As a result, the detector is guided to learn object-related and domain-independent feature representations. Finally, we present extensive experiments on two standard ZSD benchmarks, PASCAL VOC and MSCOCO. Our method outperforms previous state-of-the-art techniques on both the ZSD and generalized ZSD (GZSD) tasks.
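The abstract does not give the exact form of the cross-domain constraint, so the following is only a minimal sketch of one plausible reading: build a class-by-class correlation matrix in each domain and penalize their disagreement. Everything here is an assumption made for illustration, not the authors' implementation: the use of cosine similarity, the mean-squared penalty, the per-class visual prototypes, and the names `cosine_similarity_matrix` and `correlation_consistency_loss`.

```python
import numpy as np


def cosine_similarity_matrix(x):
    """Pairwise cosine similarity between the rows of x (one row per class)."""
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    unit = x / np.clip(norms, 1e-12, None)  # guard against zero-length rows
    return unit @ unit.T


def correlation_consistency_loss(visual_prototypes, semantic_embeddings):
    """Mean squared gap between visual and semantic class-correlation matrices.

    Hypothetical loss: the two inputs may have different feature dimensions,
    but must have the same number of rows (classes).
    """
    vis = cosine_similarity_matrix(visual_prototypes)
    sem = cosine_similarity_matrix(semantic_embeddings)
    return float(np.mean((vis - sem) ** 2))


# Toy data: 4 classes with 8-dimensional semantic embeddings (assumed sizes).
rng = np.random.default_rng(0)
semantic = rng.normal(size=(4, 8))

# An orthogonal transform preserves cosine similarities, so these "visual"
# prototypes share the semantic class correlations exactly.
rotation, _ = np.linalg.qr(rng.normal(size=(8, 8)))
aligned_visual = semantic @ rotation

# Unrelated prototypes have no reason to share the semantic correlations.
random_visual = rng.normal(size=(4, 8))

loss_aligned = correlation_consistency_loss(aligned_visual, semantic)
loss_random = correlation_consistency_loss(random_visual, semantic)
print(loss_aligned, loss_random)
```

In this toy setup the aligned prototypes give a loss near zero, while the unrelated ones give a larger value. A term of this kind, added to a detector's training objective, would push the visual class relations toward the semantic ones, which is the behavior the abstract attributes to the constraint.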