Q-Transformer: Scalable Offline Reinforcement Learning via Autoregressive Q-Functions
Yevgen Chebotar,Quan Vuong,Alex Irpan,Karol Hausman,Fei Xia,Yao Lu,Aviral Kumar,Tianhe Yu,Alexander Herzog,Karl Pertsch,Keerthana Gopalakrishnan,Julian Ibarz,Ofir Nachum,Sumedh Sontakke,Grecia Salazar,Huong T Tran,Jodilyn Peralta,Clayton Tan,Deeksha Manjunath,Jaspiar Singht,Brianna Zitkovich,Tomas Jackson,Kanishka Rao,Chelsea Finn,Sergey Levine
2023-10-17
Abstract: In this work, we present a scalable reinforcement learning method for training multi-task policies from large offline datasets that can leverage both human demonstrations and autonomously collected data. Our method uses a Transformer to provide a scalable representation for Q-functions trained via offline temporal difference backups. We therefore refer to the method as Q-Transformer. By discretizing each action dimension and representing the Q-value of each action dimension as separate tokens, we can apply effective high-capacity sequence modeling techniques for Q-learning. We present several design decisions that enable good performance with offline RL training, and show that Q-Transformer outperforms prior offline RL algorithms and imitation learning techniques on a large, diverse real-world robotic manipulation task suite. The project's website and videos can be found at https://qtransformer.github.io
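The per-dimension discretization mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the bin count `NUM_BINS`, the helper names, and especially `toy_q_values` (a hand-written stand-in for the Transformer) are assumptions made only for illustration. The sketch shows how a continuous action might be mapped to one bin index per dimension, and how an action could then be chosen one dimension at a time, with each dimension's Q-values conditioned on the bins already chosen.

```python
import numpy as np

NUM_BINS = 8  # assumed number of bins per action dimension (illustrative only)


def discretize(action, low, high, num_bins=NUM_BINS):
    """Map each continuous action dimension to an integer bin index."""
    action = np.clip(action, low, high)
    scaled = (action - low) / (high - low)  # rescale to [0, 1]
    # The upper bound maps to num_bins, so cap it at the last valid bin.
    return np.minimum((scaled * num_bins).astype(int), num_bins - 1)


def undiscretize(bins, low, high, num_bins=NUM_BINS):
    """Map bin indices back to the continuous value at each bin's center."""
    return low + (bins + 0.5) / num_bins * (high - low)


def toy_q_values(state, prev_bins, num_bins=NUM_BINS):
    """Hypothetical stand-in for the Transformer Q-function.

    Returns one Q-value per bin for the next action dimension, given the
    state and the bins already chosen for earlier dimensions.
    """
    dim = len(prev_bins)
    target = (int(state.sum()) + dim + sum(prev_bins)) % num_bins
    return -np.abs(np.arange(num_bins) - target).astype(float)


def greedy_action_bins(state, action_dim, q_fn=toy_q_values):
    """Pick bins one dimension at a time, maximizing each dimension's Q-values."""
    bins = []
    for _ in range(action_dim):
        q = q_fn(state, bins)  # one Q-value per bin for the current dimension
        bins.append(int(np.argmax(q)))
    return np.array(bins)


if __name__ == "__main__":
    state = np.array([1.0, 2.0])
    bins = greedy_action_bins(state, action_dim=3)
    print(bins, undiscretize(bins, low=-1.0, high=1.0))
```

Treating each dimension as its own token keeps the maximization cost linear in the number of action dimensions (here 3 × 8 evaluations) rather than exponential in it (8³ joint actions), which is one plausible reading of why the abstract highlights this representation.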
Robotics,Artificial Intelligence,Machine Learning