A Transform module to enhance lightweight attention by expanding receptive field

Cheng Chen, Bo Li
DOI: https://doi.org/10.1016/j.eswa.2024.123359
IF: 8.5
2024-02-14
Expert Systems with Applications
Abstract: Long-range dependence modeling has been demonstrated to be an effective technique for boosting the performance of channelwise attention. As a state-of-the-art attention mechanism and a lightweight model, coordinate attention (CA) captures long-range dependencies in its context modeling phase. However, its subsequent transformation phase can only capture interchannel dependencies. This leads to a linear receptive field in overall dependency modeling scenarios. To further expand the receptive field of CA, a novel attention mechanism called patch-enhanced attention (PEA) is proposed in this paper. To redesign the transformation phase, a group of unfolding and folding operations is embedded into the thin waist of the hourglass structure. In this way, the receptive field range for coordinate orientations is expanded from a line to a patch. Extensive experiments conducted on the ImageNet and Pascal VOC benchmarks validate the effectiveness of the proposed PEA mechanism. Compared with CA, PEA achieves state-of-the-art performance with fewer parameters.
Computer Science, Artificial Intelligence; Engineering, Electrical & Electronic; Operations Research & Management Science
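
The "line versus patch" idea in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the abstract does not give the layer layout, window size, or normalization, so the function names (line_transform, patch_transform, affected_rows), the window size k = 3, the zero padding, and the random weights are all assumptions made for illustration. The sketch pools a feature map along its width, passes the result through a channel-only hourglass (CA-style) or through an hourglass with an unfold/mix/fold step at the waist (an assumed PEA-style variant), and then checks which rows of the attention map react when a single input pixel is perturbed.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))


def pool_rows(x):
    """Average a (C, H, W) feature map over its width, giving (C, H)."""
    return x.mean(axis=2)


def line_transform(z, w_down, w_up):
    """Channel-only hourglass: each position along the line is handled independently."""
    waist = np.maximum(w_down @ z, 0.0)   # (C_mid, L), the thin waist
    return sigmoid(w_up @ waist)          # (C, L) attention weights


def patch_transform(z, w_down, w_patch, w_up):
    """Assumed patch variant: unfold, mix inside each window, fold, all at the waist."""
    waist = np.maximum(w_down @ z, 0.0)                # (C_mid, L)
    k = w_patch.shape[0]                               # window size, assumed odd
    pad = k // 2
    length = waist.shape[1]
    padded = np.pad(waist, ((0, 0), (pad, pad)))       # zero padding at both ends
    windows = sliding_window_view(padded, k, axis=1)   # unfold: (C_mid, L, k)
    mixed = windows @ w_patch                          # mix the k neighbours
    folded = np.zeros_like(padded)
    for offset in range(k):                            # fold: overlap-add the windows
        folded[:, offset:offset + length] += mixed[:, :, offset]
    return sigmoid(w_up @ folded[:, pad:pad + length])


def affected_rows(transform, x, row, col):
    """Rows whose attention changes when one pixel of x is perturbed."""
    bumped = x.copy()
    bumped[:, row, col] += 1.0
    delta = np.abs(transform(pool_rows(bumped)) - transform(pool_rows(x)))
    return np.flatnonzero(delta.max(axis=0) > 1e-12).tolist()


rng = np.random.default_rng(0)
C, C_mid, H, W, k = 4, 2, 9, 8, 3              # illustrative sizes only
x = rng.random((C, H, W))
w_down = rng.random((C_mid, C))                # positive, so the ReLU stays active
w_patch = 0.5 * rng.standard_normal((k, k))
w_up = 0.5 * rng.standard_normal((C, C_mid))

line_rows = affected_rows(lambda z: line_transform(z, w_down, w_up), x, 4, 3)
patch_rows = affected_rows(
    lambda z: patch_transform(z, w_down, w_patch, w_up), x, 4, 3
)
print("line :", line_rows)
print("patch:", patch_rows)
```

With the channel-only transform, only the perturbed row responds, which corresponds to the linear receptive field the abstract attributes to CA. With the assumed unfold/fold step, neighbouring rows respond as well (at most two rows on either side for k = 3), so the height branch effectively sees a band of rows rather than a single line.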