Reparameterized Attention for Convolutional Neural Networks

Yiming Wu, Ruixiang Li, Yunlong Yu, Xi Li
DOI: https://doi.org/10.1016/j.patrec.2022.10.022
IF: 4.757
2022-01-01
Pattern Recognition Letters
Abstract: The attention mechanism has been widely explored for neural networks because it can effectively model the interdependencies among channels, spatial positions, and frames. A neural network with attention modules has uncertainty in its parameters, but training the model deterministically hardly captures that uncertainty. Modeling the uncertainty of the attention module's parameters can make it easier to capture representative patterns flexibly, thus promoting the generalization of the model. In this work, we propose a novel reparameterized attention strategy that models the uncertainty of the parameters in the attention module and performs uncertainty-aware optimization. Instead of learning deterministic parameters for the attention modules, our strategy learns variational posterior distributions. The experimental results show that our strategy consistently improves the accuracy of different models and reduces the generalization gap without extra computation. © 2022 Published by Elsevier B.V.
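The idea of learning a variational posterior over attention parameters instead of point estimates can be illustrated with a minimal sketch. It is not the authors' implementation: the squeeze-and-excitation style channel gate, the Gaussian posterior parameterized by `mu` and `rho`, the softplus transform, and the use of the posterior mean at inference are all assumptions made for illustration, and every function name here is hypothetical.

```python
import math
import random


def softplus(x):
    """Numerically stable log(1 + exp(x)); keeps the standard deviation positive."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def sample_weights(mu, rho, rng):
    """Reparameterized draw: w = mu + softplus(rho) * eps, with eps ~ N(0, 1).

    The randomness lives in eps, so w stays a deterministic function of
    (mu, rho) and gradients could flow to both in an autodiff framework.
    """
    return [
        [m + softplus(r) * rng.gauss(0.0, 1.0) for m, r in zip(mu_row, rho_row)]
        for mu_row, rho_row in zip(mu, rho)
    ]


def channel_attention(pooled, mu, rho, rng=None, training=True):
    """Gate per-channel descriptors with weights drawn from N(mu, softplus(rho)^2).

    Assumption: at inference the posterior mean is used, so the cost matches
    a deterministic attention layer.
    """
    rng = rng if rng is not None else random
    weights = sample_weights(mu, rho, rng) if training else mu
    gates = [sigmoid(sum(w * p for w, p in zip(row, pooled))) for row in weights]
    return [g * p for g, p in zip(gates, pooled)]


# Hypothetical toy inputs: three globally pooled channel descriptors.
pooled = [0.5, -1.0, 2.0]
mu = [[0.1, 0.0, 0.2], [0.0, 0.3, 0.0], [0.2, 0.0, 0.1]]
rho = [[-3.0] * 3 for _ in range(3)]

train_out = channel_attention(pooled, mu, rho, random.Random(0), training=True)
eval_out = channel_attention(pooled, mu, rho, training=False)
```

In a full training setup the loss would typically also include a KL term between the posterior and a prior, which this sketch omits.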