Multi-modal Perception Dataset of In-water Objects for Autonomous Surface Vehicles

Mingi Jeong, Arihant Chadda, Ziang Ren, Luyang Zhao, Haowen Liu, Monika Roznere, Aiwei Zhang, Yitao Jiang, Sabriel Achong, Samuel Lensgraf, Alberto Quattrini Li
2024-04-29
Abstract: This paper introduces the first publicly accessible multi-modal perception dataset for autonomous maritime navigation, focusing on in-water obstacles within the aquatic environment to enhance situational awareness for Autonomous Surface Vehicles (ASVs). This dataset, consisting of diverse objects encountered under varying environmental conditions, aims to bridge the research gap in marine robotics by providing a multi-modal, annotated, and ego-centric perception dataset for object detection and classification. We also show the applicability of the proposed dataset's framework using deep learning-based open-source perception algorithms that have shown success. We expect that our dataset will contribute to the development of the marine autonomy pipeline and marine (field) robotics. Please note this is a work-in-progress paper about our on-going research that we plan to release in full via future publication.
Robotics, Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
This paper introduces an upcoming multi-modal perception dataset for Autonomous Surface Vehicles (ASVs), aimed at recognizing in-water obstacles and improving situational awareness in aquatic environments. A major limitation in marine robotics research today is the lack of suitable multi-modal perception data. Compared with datasets for land-based autonomous vehicles, datasets for aquatic environments are scarce, and those that exist often contain a single modality (e.g., images) or lack object labels that carry across modalities. The paper argues that this shortage of multi-modal, precisely annotated data hinders the development of crucial ASV capabilities such as perception and obstacle avoidance, which rely on supervised deep learning methods.
To address this gap, the paper sets out to create the first multi-modal, annotated, ego-centric perception dataset for ASV object detection and classification. The dataset draws on voyage data collected at multiple locations (United States, Barbados, South Korea) from 2021 to 2024, covering seawater and freshwater environments, a range of weather conditions, and different encounter scenarios. It has three annotated object classes (boats, buoys, and others) and provides time-synchronized lidar point clouds and RGB images for algorithm training and evaluation.
The paper also describes the sensor configuration used for data collection, the data acquisition process, the annotation procedure, and the characteristics of the dataset, including quantifiable metrics of difficulty and complexity.
Finally, the paper demonstrates the dataset's applicability to object detection and classification by running established open-source deep learning perception models, namely YOLOv5 and PointPillars, on it. The authors plan to expand the dataset to cover more weather conditions and sensor types to improve robustness.
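The summary mentions time-synchronized lidar point clouds and RGB images but does not say how the pairing is done. As an illustration only, the sketch below shows one common way to pair two sensor streams: match each lidar sweep to the nearest image timestamp and drop pairs that are too far apart. The function name `pair_by_timestamp`, the `max_gap` tolerance of 0.05 s, and the sample timestamps are all assumptions made for this example, not details from the paper.

```python
from bisect import bisect_left


def pair_by_timestamp(lidar_stamps, image_stamps, max_gap=0.05):
    """Pair each lidar timestamp with the nearest image timestamp.

    Both inputs are lists of times in seconds, sorted in ascending order.
    Returns a list of (lidar_index, image_index) tuples. Lidar sweeps with
    no image within `max_gap` seconds are skipped.
    """
    pairs = []
    for i, t in enumerate(lidar_stamps):
        # Position of the first image timestamp that is >= t.
        j = bisect_left(image_stamps, t)
        # The nearest image is either just before or just after t.
        candidates = [k for k in (j - 1, j) if 0 <= k < len(image_stamps)]
        if not candidates:
            continue
        best = min(candidates, key=lambda k: abs(image_stamps[k] - t))
        if abs(image_stamps[best] - t) <= max_gap:
            pairs.append((i, best))
    return pairs


if __name__ == "__main__":
    # Hypothetical timestamps (seconds) for illustration only.
    lidar = [0.00, 0.10, 0.20, 0.50]
    images = [0.01, 0.12, 0.19, 0.33]
    print(pair_by_timestamp(lidar, images))
```

With the sample values above, the last lidar sweep is left unpaired because the closest image is more than 0.05 s away. A real pipeline would more likely rely on hardware triggering or the synchronization tools of its middleware, so this is a sketch of the idea, not of the authors' method.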