SNeRL: Semantic-aware Neural Radiance Fields for Reinforcement Learning

Dongseok Shim, Seungjae Lee, H. Jin Kim
DOI: https://doi.org/10.48550/arXiv.2301.11520
2023-06-01
Abstract: As previous representations for reinforcement learning cannot effectively incorporate a human-intuitive understanding of the 3D environment, they usually suffer from sub-optimal performances. In this paper, we present Semantic-aware Neural Radiance Fields for Reinforcement Learning (SNeRL), which jointly optimizes semantic-aware neural radiance fields (NeRF) with a convolutional encoder to learn 3D-aware neural implicit representation from multi-view images. We introduce 3D semantic and distilled feature fields in parallel to the RGB radiance fields in NeRF to learn semantic and object-centric representation for reinforcement learning. SNeRL outperforms not only previous pixel-based representations but also recent 3D-aware representations both in model-free and model-based reinforcement learning.
Machine Learning, Artificial Intelligence, Computer Vision and Pattern Recognition, Robotics
What problem does this paper attempt to address?
This paper addresses a limitation of existing representations for reinforcement learning (RL): they do not effectively incorporate a human-intuitive understanding of the 3D environment, which leads to sub-optimal performance. Specifically, traditional image-based RL methods usually learn visual representations from single-view observations only, so they lack 3D structural information and struggle to obtain object-related semantic representations.

To solve these problems, the paper proposes Semantic-aware Neural Radiance Fields for Reinforcement Learning (SNeRL), which learns 3D-aware neural implicit representations from multi-view images by jointly optimizing a convolutional encoder and semantic-aware neural radiance fields (NeRF). SNeRL introduces a 3D semantic field and a distilled feature field in parallel with the RGB radiance field to learn semantic and object-centric representations suited to RL tasks.

The main contributions of SNeRL are:

1. **Proposing a new framework**: SNeRL uses NeRF together with semantic and distilled feature fields to learn 3D-aware semantic representations for RL.
2. **Verifying effectiveness**: SNeRL performs well with both model-free and model-based methods and is, per the paper, the first work to use semantic-aware representations in RL downstream tasks without object masks.
3. **Outperforming existing methods**: SNeRL outperforms previous single-view and multi-view image-based RL algorithms in four different 3D environments, especially in the Meta-world environment.

With these improvements, SNeRL learns more efficiently in complex control tasks and captures the semantic information in 3D environments more effectively.
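To make the idea of fields running "in parallel" concrete, the sketch below shows one way the same per-ray compositing weights could be shared across an RGB field, a semantic field, and a feature field. It uses the standard NeRF volume-rendering weights; everything else is an assumption for illustration and is not taken from the paper's implementation: the function names, the dictionary keys (`sigma`, `rgb`, `semantic`, `feature`), and the toy sample values are all hypothetical.

```python
import math

def render_weights(sigmas, deltas):
    """Standard NeRF compositing weights: w_i = T_i * (1 - exp(-sigma_i * delta_i)),
    where T_i is the transmittance accumulated before sample i."""
    weights, transmittance = [], 1.0
    for sigma, delta in zip(sigmas, deltas):
        alpha = 1.0 - math.exp(-sigma * delta)
        weights.append(transmittance * alpha)
        transmittance *= 1.0 - alpha
    return weights

def composite(weights, values):
    """Weighted sum of per-sample vectors along one ray."""
    dim = len(values[0])
    return [sum(w * v[k] for w, v in zip(weights, values)) for k in range(dim)]

def render_ray(samples, deltas):
    """Render all three heads with one shared set of weights.

    `samples` is a list of dicts with hypothetical keys:
    'sigma' (density), 'rgb', 'semantic' (per-class scores), 'feature'.
    """
    weights = render_weights([s["sigma"] for s in samples], deltas)
    return {
        name: composite(weights, [s[name] for s in samples])
        for name in ("rgb", "semantic", "feature")
    }

# Toy ray: an empty sample followed by a nearly opaque one (illustrative values).
samples = [
    {"sigma": 0.0, "rgb": [1.0, 0.0, 0.0], "semantic": [1.0, 0.0], "feature": [0.5, 0.5, 0.5]},
    {"sigma": 1e3, "rgb": [0.0, 0.0, 1.0], "semantic": [0.0, 1.0], "feature": [1.0, 2.0, 3.0]},
]
rendered = render_ray(samples, deltas=[0.1, 0.1])
print(rendered)
```

Because the density is shared, the rendered semantic and feature values follow the same geometry as the rendered color: in the toy ray, all three outputs are dominated by the opaque second sample. In a full system, the per-sample values would come from a network conditioned on the encoder's latent, and each rendered output would presumably be supervised by its own loss. Those parts are omitted here.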