HOD: Human-Object Decoupling Network for HOI Detection

Hantao Zhang, Shouhong Wan, Weidong Guo, Peiquan Jin
DOI: https://doi.org/10.1109/ICME55011.2023.00379
2023-01-01
Abstract: Single-stage Human-Object Interaction (HOI) detection methods have attracted considerable attention due to their high efficiency. Existing methods tend to detect humans and objects with a single decoder without considering the differences between them, which places heavy pressure on that decoder and degrades detection performance. This paper aims to decouple the detection decoders for humans and objects. In particular, we propose a novel human-object decoupling network (HOD) that divides the decoder's work into three tasks: human detection, object detection, and action classification. The network uses a random erasure training strategy to improve the model's generalization ability and introduces pose features to handle the long-tailed problem. In addition, we design a pose fusion branch to alleviate the semantic gap between pose and HOI datasets. The experimental results suggest that our method achieves consistent improvements over the state of the art across different datasets while using only image information. On HICO-Det in particular, our method outperforms existing methods by a large margin, with a relative mAP gain of 7.8%. Our source code will be publicly available upon acceptance.
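The abstract names random erasure as a training strategy but does not describe its details. As a rough illustration only, the sketch below shows generic random erasing on a grayscale image stored as a list of rows: with some probability, one randomly placed rectangle is overwritten with noise. The function name `random_erase` and all of its parameters and default values are assumptions made for this example and are not taken from the paper.

```python
import math
import random


def random_erase(image, rng, erase_prob=0.5, area_range=(0.02, 0.2),
                 aspect_range=(0.3, 3.3), max_tries=10):
    """Return a copy of `image` with at most one random rectangle replaced by noise.

    `image` is a list of equal-length rows of floats (a grayscale image).
    All defaults are illustrative assumptions, not values from the paper.
    """
    out = [row[:] for row in image]  # never modify the caller's image
    if rng.random() >= erase_prob:
        return out  # skip erasing for this sample

    h, w = len(image), len(image[0])
    for _ in range(max_tries):
        # Sample the rectangle's area as a fraction of the image area.
        area = rng.uniform(*area_range) * h * w
        # Sample the aspect ratio (height / width) log-uniformly.
        aspect = math.exp(rng.uniform(math.log(aspect_range[0]),
                                      math.log(aspect_range[1])))
        eh = round(math.sqrt(area * aspect))
        ew = round(math.sqrt(area / aspect))
        if 0 < eh < h and 0 < ew < w:
            top = rng.randint(0, h - eh)   # randint is inclusive on both ends
            left = rng.randint(0, w - ew)
            for y in range(top, top + eh):
                for x in range(left, left + ew):
                    out[y][x] = rng.random()  # fill with uniform noise
            return out
    return out  # no valid rectangle found; return the unmodified copy


if __name__ == "__main__":
    img = [[0.0] * 64 for _ in range(64)]
    erased = random_erase(img, random.Random(0), erase_prob=1.0)
    changed = sum(1 for row in erased for v in row if v != 0.0)
    print(f"pixels overwritten: {changed}")
```

In a detection setting, such an augmentation would be applied to training images only, so that the model cannot rely on any single image region; how HOD chooses the erased regions is not stated in the abstract.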