Distributional Reinforcement Learning based Integrated Decision Making and Control for Autonomous Surface Vehicles

Xi Lin, Paul Szenher, Yewei Huang, Brendan Englot
2024-12-13
Abstract: With the growing demands for Autonomous Surface Vehicles (ASVs) in recent years, the number of ASVs being deployed for various maritime missions is expected to increase rapidly in the near future. However, it is still challenging for ASVs to perform sensor-based autonomous navigation in obstacle-filled and congested waterways, where perception errors, closely gathered vehicles and limited maneuvering space near buoys may cause difficulties in following the Convention on the International Regulations for Preventing Collisions at Sea (COLREGs). To address these issues, we propose a novel Distributional Reinforcement Learning based navigation system that can work with onboard LiDAR and odometry sensors to generate arbitrary thrust commands in continuous action space. Comprehensive evaluations of the proposed system in high-fidelity Gazebo simulations show its ability to decide whether to follow COLREGs or take other beneficial actions based on the scenarios encountered, offering superior performance in navigation safety and efficiency compared to systems using state-of-the-art Distributional RL, non-Distributional RL and classical methods.
Robotics
What problem does this paper attempt to address?
This paper addresses the challenges that Autonomous Surface Vehicles (ASVs) face when performing sensor-based autonomous navigation in complex, congested waterways. Specific problems include:

1. **Perception Error**: Data from LiDAR and other sensors may be noisy or inaccurate, which introduces deviations into the vehicle's understanding of its surroundings.
2. **Dense Vessel Aggregation**: In busy waters, multiple ASVs operate close together, which increases collision risk.
3. **Limited Operating Space**: Maneuvering space is very limited, especially near static obstacles such as buoys.
4. **Compliance with the International Regulations for Preventing Collisions at Sea (COLREGs)**: In multi-vessel encounters, ASVs must behave in accordance with these rules while still responding flexibly to unexpected situations.

To solve these problems, the authors propose a new navigation system based on Distributional Reinforcement Learning (Distributional RL). The system works with onboard LiDAR and odometry sensors to generate arbitrary thrust commands in a continuous action space. Evaluations in high-fidelity Gazebo simulations show that the system can decide, depending on the scenario encountered, whether to follow COLREGs or take other beneficial actions. In navigation safety and efficiency it outperforms systems that use state-of-the-art Distributional RL, non-Distributional RL, and classical methods.

### Main Contributions

1. **Proposed an ASV decision-making and control scheme based on AC-IQN**: The scheme performs continuous control under wind and wave disturbances and is suited to dense multi-vehicle environments.
2. **Designed a new reward function**: The reward function encourages COLREGs-compliant behavior but does not penalize other collision-avoidance behaviors that contribute to navigation safety and efficiency.
3. **Extensive Simulation Evaluation**: A large number of experiments were carried out in high-fidelity Gazebo simulations to verify the system's performance.

### Key Technical Points

- **Distributional Reinforcement Learning (Distributional RL)**: Improves the robustness of the policy by learning the distribution of cumulative rewards rather than only its expected value.
- **Actor-Critic Structure**: Combined with an Implicit Quantile Network (IQN) to achieve efficient control in a continuous action space.
- **Perception Processing Module**: Extracts object information from LiDAR point-cloud data and estimates each object's position and velocity.

Together, these components allow the system to achieve more intelligent and reliable autonomous navigation in complex marine environments.
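To make the Distributional RL and IQN points above more concrete, the sketch below shows the quantile Huber loss that IQN-style critics are commonly trained with. This summary does not give the paper's exact AC-IQN loss, so this is an assumption based on the standard IQN formulation. The function names (`huber`, `quantile_huber_loss`, `mean_return`) and the threshold `kappa` are hypothetical and chosen only for illustration.

```python
# Minimal sketch of the standard quantile Huber loss used by IQN-style critics.
# Assumption: the paper's AC-IQN critic uses a loss of this general form;
# all names here are illustrative, not taken from the authors' code.

def huber(u, kappa=1.0):
    """Huber loss: quadratic for small errors, linear for large ones."""
    a = abs(u)
    return 0.5 * u * u if a <= kappa else kappa * (a - 0.5 * kappa)

def quantile_huber_loss(pred_quantiles, taus, target_samples, kappa=1.0):
    """Sum over predicted quantiles of the mean loss against target samples.

    pred_quantiles[i] is the critic's return estimate at quantile fraction taus[i];
    target_samples are samples of the target return distribution.
    """
    total = 0.0
    for theta, tau in zip(pred_quantiles, taus):
        inner = 0.0
        for z in target_samples:
            u = z - theta  # temporal-difference error for this pair
            # Asymmetric weight: tau when under-estimating, (1 - tau) otherwise.
            weight = abs(tau - (1.0 if u < 0 else 0.0))
            inner += weight * huber(u, kappa) / kappa
        total += inner / len(target_samples)
    return total

def mean_return(pred_quantiles):
    """Expected return approximated as the average of the quantile estimates."""
    return sum(pred_quantiles) / len(pred_quantiles)

# Under-estimating a high quantile costs more than over-estimating it.
low = quantile_huber_loss([0.0], [0.9], [1.0])   # prediction below the target
high = quantile_huber_loss([2.0], [0.9], [1.0])  # prediction above the target
print(low > high)
```

The asymmetric weight is what pushes each output toward its own quantile of the return distribution. In an actor-critic setup, an actor could then be updated against a statistic of those quantiles, such as the mean computed by `mean_return`.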