MixTBN: A Fully Test-Time Adaptation Method for Visual Reinforcement Learning on Robotic Manipulation

Zi’ang Liu, Wei Li
DOI: https://doi.org/10.1109/iccasit58768.2023.10351689
2023-01-01
Abstract: Learning adaptive policies that can transfer to unseen environments remains challenging in visual reinforcement learning (RL). Existing methods commonly learn a robust policy via data augmentation and domain randomization for better generalization. Limited by the unobservability of the target environment, these methods cannot use reward signals to fine-tune the model and adaptively transfer to new scenarios. In this work, we first investigate how a visual RL agent would benefit from test-time adaptation. Surprisingly, we find that optimizing the Batch Normalization layers can significantly improve the generalization of visual RL. Hence, we propose a lightweight test-time adaptation algorithm, Mix Test-time Batch Normalization (MixTBN), which adaptively transfers the learnt policy to unseen environments without any additional parameters. By solely mixing the statistics of the Batch Normalization layers, our method achieves state-of-the-art performance on two robotic manipulation tasks. Extensive ablation experiments demonstrate the effectiveness of each component of our method.
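The abstract does not state the exact rule MixTBN uses to mix Batch Normalization statistics. As a rough illustration only, the sketch below assumes a convex combination of the statistics stored at training time with the statistics of the current test batch. The function names and the mixing weight alpha are hypothetical and do not come from the paper.

```python
import math


def batch_stats(batch):
    """Per-feature mean and biased variance of a batch (list of feature vectors)."""
    n = len(batch)
    dims = len(batch[0])
    mean = [sum(x[d] for x in batch) / n for d in range(dims)]
    var = [sum((x[d] - mean[d]) ** 2 for x in batch) / n for d in range(dims)]
    return mean, var


def mix_stats(source_mean, source_var, test_mean, test_var, alpha=0.5):
    """Blend training-time (source) statistics with test-batch statistics.

    alpha is an assumed mixing weight: alpha=1.0 keeps the source statistics,
    alpha=0.0 uses only the test-batch statistics.
    """
    mean = [alpha * s + (1 - alpha) * t for s, t in zip(source_mean, test_mean)]
    var = [alpha * s + (1 - alpha) * t for s, t in zip(source_var, test_var)]
    return mean, var


def normalize(batch, mean, var, eps=1e-5):
    """Standard batch-norm normalization (without the learned scale and shift)."""
    return [
        [(x[d] - mean[d]) / math.sqrt(var[d] + eps) for d in range(len(x))]
        for x in batch
    ]


# Usage: statistics stored during training vs. a batch from a shifted environment.
source_mean, source_var = [0.0], [1.0]
test_batch = [[2.0], [4.0]]
test_mean, test_var = batch_stats(test_batch)
mixed_mean, mixed_var = mix_stats(source_mean, source_var, test_mean, test_var, alpha=0.5)
normalized = normalize(test_batch, mixed_mean, mixed_var)
```

Because only the normalization statistics change in this sketch, no new parameters are introduced and no reward signal is needed, which matches the lightweight, reward-free setting the abstract describes.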