HA-DQS-Net: dynamic query design based on transformer with hollow attention

Hongyi Wang, Di Yan, Yunpeng Li, Limei Song
DOI: https://doi.org/10.1117/1.jei.33.1.013033
IF: 0.829
2024-02-05
Journal of Electronic Imaging
Abstract: A common problem in object detection is that image features are not fully expressed. Another is that the static query selection used in detection transformer (DETR)-like models adapts poorly to different datasets, because the number of selected object queries is fixed. To address these problems, we propose hollow attention (HA) and dynamic query selection (DQS) modules and combine them into a network, HA-DQS-Net. HA integrates specially designed masks into self-attention to better combine channel and spatial directional feature information, thereby learning more complex and comprehensive target features. DQS improves on the static query selection of current DETR-like models by dynamically choosing the number of object queries according to the actual number of targets in the image, which improves model accuracy. HA-DQS-Net, which combines the advantages of HA and DQS, achieves competitive performance in object detection. The effectiveness of the approach is validated on PASCAL VOC and a self-built smoking dataset. Notably, all AP metrics improve when HA is applied to different DETR-like models, which demonstrates the generality of the HA module.
engineering, electrical & electronic; optics; imaging science & photographic technology
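
The contrast the abstract draws between static and dynamic query selection can be illustrated with a minimal sketch. It is not the authors' DQS module: the abstract does not say how the number of targets is estimated, so the sketch assumes each candidate query carries a confidence score and uses the count of scores above a threshold as a stand-in for the target count. The function names, the `threshold` value, and the `min_queries`/`max_queries` bounds are all hypothetical.

```python
def rank_by_score(scores):
    """Return candidate indices ordered from highest to lowest score."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


def static_query_selection(scores, k):
    """Static selection: always keep the top-k candidates, whatever the image contains."""
    return rank_by_score(scores)[:k]


def dynamic_query_selection(scores, threshold=0.5, min_queries=1, max_queries=300):
    """Dynamic selection (illustrative assumption, not the paper's rule).

    The number of kept queries follows the number of confident candidates,
    used here as a rough proxy for the number of targets in the image.
    """
    n_confident = sum(1 for s in scores if s >= threshold)
    # Clamp to the assumed bounds, and never ask for more than is available.
    k = max(min_queries, min(max_queries, n_confident))
    k = min(k, len(scores))
    return rank_by_score(scores)[:k]


if __name__ == "__main__":
    # Hypothetical scores for five candidate queries in one image.
    scores = [0.9, 0.1, 0.7, 0.2, 0.05]
    print(static_query_selection(scores, k=4))   # keeps four queries regardless
    print(dynamic_query_selection(scores))       # keeps only the confident ones
```

With the hypothetical scores above, the static variant keeps four queries even though only two are confident, while the dynamic variant keeps two. That difference is the adaptation to image content that the abstract attributes to DQS.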