Dual Selection Network for Video Object Detection

Tianxiang Hou, Qiang Qi, Yang Lu, Kaiwen Du, Hanzi Wang
DOI: https://doi.org/10.1109/icme52920.2022.9859947
2022-01-01
Abstract: Off-the-shelf video object detection methods usually enhance the degraded proposal features of target frames by aggregating proposal features from support frames. However, the proposals generated by the region proposal network may not be accurate, resulting in inaccurate proposal features and limited performance. To mitigate this, we propose a novel dual selection network (DSNet) for video object detection, which contains two successive stages: selecting proposals that fit objects more closely, and selecting proposal features that are more conducive to feature aggregation. Correspondingly, the proposal selection module (PSM) selects better proposals by exploiting their boundary information, and the selective aggregation module (SAM) selects better proposal features for aggregation. Consequently, DSNet can generate more robust proposal features through the dual selection mechanism implemented by PSM and SAM. Extensive experiments show that DSNet obtains 83.7% mAP and outperforms several state-of-the-art methods.
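
The abstract does not give the internals of PSM or SAM, so the following is only a minimal sketch of the general idea it describes: enhancing target-frame proposal features by aggregating a selected subset of support-frame proposal features. The function name `select_and_aggregate`, the cosine-similarity scoring, the top-k selection rule, and the residual addition are all illustrative assumptions, not the authors' method.

```python
import numpy as np

def select_and_aggregate(target, support, k=2, eps=1e-8):
    """Hypothetical illustration, not DSNet itself.

    target:  (Nt, D) proposal features from the target frame
    support: (Ns, D) proposal features from support frames
    Returns enhanced target features of shape (Nt, D).
    """
    # Cosine similarity between every target and support proposal feature.
    t = target / (np.linalg.norm(target, axis=1, keepdims=True) + eps)
    s = support / (np.linalg.norm(support, axis=1, keepdims=True) + eps)
    sim = t @ s.T  # (Nt, Ns)

    # Assumed selection rule: keep the k most similar support features
    # for each target proposal.
    k = min(k, support.shape[0])
    idx = np.argsort(-sim, axis=1)[:, :k]             # (Nt, k)
    top_sim = np.take_along_axis(sim, idx, axis=1)    # (Nt, k)

    # Softmax over the selected similarities gives aggregation weights.
    w = np.exp(top_sim - top_sim.max(axis=1, keepdims=True))
    w = w / w.sum(axis=1, keepdims=True)

    # Weighted sum of the selected support features, added to the target
    # features as a residual (an assumption made for this sketch).
    aggregated = (w[:, :, None] * support[idx]).sum(axis=1)  # (Nt, D)
    return target + aggregated
```

In this sketch, restricting aggregation to the top-k support features stands in for the notion of choosing features "more conducive to feature aggregation"; the actual selection criterion used by SAM may differ.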