Focus and Adjust: Progressive Refinement Network for Human Object Interaction Detection

Baixiang Yang, Wei Gao, Ge Li
DOI: https://doi.org/10.1109/icpr56361.2022.9956111
2022-01-01
Abstract: Many recent works model Human Object Interaction (HOI) detection as a set prediction problem using transformer architectures, and demonstrate promising performance. However, previous transformer-based detectors struggle with the variable scale of the instances involved in HOI. Moreover, the naive transformer gives each query a global receptive field in which to search for contextual information, which also introduces potentially redundant information for the corresponding human-object pair. In this paper, we propose PR-Net, a Progressive Refinement Network equipped with two specially designed refinement modules to tackle these problems, providing a coarse-to-fine framework for HOI detection. In addition, the refinement modules are organized in a localization-interaction mutually guided manner to exploit the co-optimization of instance localization and interaction classification and thereby improve HOI detection performance. Our proposed method achieves competitive performance on two HOI detection benchmarks, and extensive experiments demonstrate its effectiveness.