A Quantum-Based Attention Mechanism in Scene Text Detection

Hao Wu, Jun Zhou, Qiong Zhang, Yang Lei, Kun Yu, Wenbo An, Juntao Zhang
DOI: https://doi.org/10.1007/978-981-99-8543-2_1
2024-01-01
Abstract: Attention mechanisms have benefited many visual tasks, e.g. image classification, object detection, and semantic segmentation. However, few attention modules have been proposed specifically for scene text detection. We propose an attention mechanism based on Quantum-State-based Mapping (QSM) that enhances channel and spatial attention, introduces higher-order representations, and mixes contextual information. Our approach includes two attention modules: the Quantum-based Convolutional Attention Module (QCAM), a plug-and-play module applicable to pre-trained text detection models; and the Adaptive Channel Information Transfer Module (ACTM), which replaces the feature pyramids and complex networks of DBNet++ with a 35.9% reduction in FLOPs. Among CNN-based methods, our QCAM achieves state-of-the-art performance on three benchmarks. Remarkably, compared to Transformer-based methods such as FSG, our QCAM remains competitive in F-measure on all benchmarks. Notably, QCAM has 29.5% fewer parameters than FSG, resulting in a balance between detection accuracy and efficiency. ACTM significantly improves F-measure over DBNet++ on three benchmarks, providing an alternative to feature pyramids in scene text detection. The code, models and training logs are available at https://github.com/yws-wxs/QCAM .