Gradient Preserving Batch Normalization for Test-Time Adaptation

Tao Yang, Yuwang Wang, Yan Lu, Nanning Zheng
DOI: https://doi.org/10.2139/ssrn.4627328
2023-01-01
Abstract: Deep neural networks often suffer from a data distribution shift between training and testing, and batch statistics are observed to reflect this shift. In this paper, aiming to alleviate distribution shift at test time, we revisit test-time batch normalization (BN) and reveal an important insight that benefits test-time adaptation: preserving the same gradient backpropagation form as in training. Based on this insight, we propose a novel test-time BN layer, dubbed GpreBN, which is optimized during testing by minimizing an entropy loss. We verify the effectiveness of our method in two typical settings with distribution shift, i.e., domain generalization and robustness tasks. GpreBN significantly improves test-time performance and achieves state-of-the-art results.
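
As a rough illustration of the entropy objective mentioned in the abstract, the sketch below computes the mean Shannon entropy of softmax predictions over a batch of logits. It is not the authors' GpreBN layer, whose formulation the abstract does not give; the function names `softmax` and `mean_entropy` are hypothetical and chosen only for this example. In test-time adaptation, a quantity like this would be minimized with an automatic-differentiation framework rather than computed in plain Python.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of class logits."""
    m = max(logits)  # subtract the max to avoid overflow in exp
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def mean_entropy(batch_logits):
    """Mean Shannon entropy (in nats) of softmax predictions over a batch.

    Hypothetical helper: lower values mean more confident predictions.
    """
    total = 0.0
    for logits in batch_logits:
        probs = softmax(logits)
        # Skip zero probabilities, since p * log(p) tends to 0 as p -> 0.
        total += -sum(p * math.log(p) for p in probs if p > 0.0)
    return total / len(batch_logits)
```

A uniform prediction over K classes gives the maximum entropy, log K, while a confident prediction gives a value close to zero, which is why minimizing this loss pushes the model toward confident outputs on the test batch.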