Abstract: Optical sensors and learning algorithms for autonomous vehicles have dramatically advanced in the past few years. Nonetheless, the reliability of today's autonomous vehicles is hindered by the limited line-of-sight sensing capability and the brittleness of data-driven methods in handling extreme situations. With recent developments in telecommunication technologies, cooperative perception with vehicle-to-vehicle communications has become a promising paradigm to enhance autonomous driving in dangerous or emergency situations. We introduce COOPERNAUT, an end-to-end learning model that uses cross-vehicle perception for vision-based cooperative driving. Our model encodes LiDAR information into compact point-based representations that can be transmitted as messages between vehicles via realistic wireless channels. To evaluate our model, we develop AutoCastSim, a network-augmented driving simulation framework with example accident-prone scenarios. Our experiments on AutoCastSim suggest that our cooperative perception driving models lead to a 40% improvement in average success rate over egocentric driving models in these challenging driving situations and a 5 times smaller bandwidth requirement than prior work V2VNet. COOPERNAUT and AUTOCASTSIM are available at https://ut-austin-rpl.github.io/Coopernaut/.

What problem does this paper attempt to address?

The paper addresses the limited robustness of learning-based driving policies in extreme and rare situations, particularly when a vehicle's optical sensors (such as stereo cameras and LiDAR) are constrained by line-of-sight occlusion or degraded by adverse weather. Existing methods typically rely on the perception of a single vehicle, which can lead to decision errors in complex traffic environments.

To address this, the paper introduces **COOPERNAUT**, an end-to-end cooperative perception driving model in which vehicles share perception information through vehicle-to-vehicle communication. COOPERNAUT uses a Point Transformer to encode LiDAR data into a compact point-based representation and transmits it between vehicles over realistic wireless channels, so the ego vehicle can make better-informed driving decisions when its own line of sight is limited.

The main contributions of the paper are:

1. **COOPERNAUT Model**: An end-to-end cooperative perception driving model that improves driving decisions by sharing compact perception information through inter-vehicle communication.
2. **AUTOCASTSIM Simulation Framework**: A network-augmented autonomous driving simulation framework used to evaluate COOPERNAUT and baseline models in accident-prone scenarios.
3. **Experimental Results**: Experiments on AUTOCASTSIM suggest that COOPERNAUT improves the average success rate in challenging driving scenarios by 40% over egocentric models that rely solely on their own perception, and requires 5 times less bandwidth than the prior V2VNet method.

Together, these contributions demonstrate the potential of cooperative perception to improve the safety of autonomous driving.
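The idea of sending a compact point-based message instead of a raw LiDAR scan can be illustrated with a minimal sketch. This is not the paper's implementation: random subsampling stands in for the learned Point Transformer encoder, and the point count `k=128`, the float32 wire format, the synthetic scan, and all function names are assumptions chosen for illustration only.

```python
import math
import random
import struct

BYTES_PER_POINT = 12  # three float32 values (x, y, z); an assumed wire format


def downsample(points, k, seed=0):
    """Keep at most k points, chosen uniformly at random.

    Hypothetical stand-in for a learned encoder that selects keypoints.
    """
    if len(points) <= k:
        return list(points)
    return random.Random(seed).sample(points, k)


def pack_message(points):
    """Serialize (x, y, z) points as little-endian float32 triples."""
    return b"".join(struct.pack("<3f", *p) for p in points)


def unpack_message(payload):
    """Inverse of pack_message: recover the list of (x, y, z) tuples."""
    return [
        struct.unpack_from("<3f", payload, offset)
        for offset in range(0, len(payload), BYTES_PER_POINT)
    ]


def to_ego_frame(points, dx, dy, yaw):
    """Rotate points by yaw (radians) about z, then translate by (dx, dy).

    Maps points from the sender's frame into the receiving (ego) frame.
    """
    c, s = math.cos(yaw), math.sin(yaw)
    return [(c * x - s * y + dx, s * x + c * y + dy, z) for x, y, z in points]


# Synthetic sender scan (assumption): 10,000 random points around the vehicle.
rng = random.Random(42)
scan = [
    (rng.uniform(-50, 50), rng.uniform(-50, 50), rng.uniform(-2, 2))
    for _ in range(10_000)
]

keypoints = downsample(scan, k=128)
payload = pack_message(keypoints)

# Receiver side: decode and align with the ego frame using an assumed relative pose.
received = to_ego_frame(unpack_message(payload), dx=10.0, dy=0.0, yaw=math.pi / 2)

raw_bytes = len(scan) * BYTES_PER_POINT
print(f"raw scan: {raw_bytes} B, message: {len(payload)} B")
```

The sketch only shows why a fixed, small number of points bounds the message size regardless of the raw scan size. In the paper the transmitted representation is produced by a learned encoder, and the bandwidth comparison is against V2VNet rather than against raw point clouds.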

COOPERNAUT: End-to-End Driving with Cooperative Perception for Networked Vehicles