Abstract: When driving, humans usually rely on multiple senses to gather information and make decisions. Analogously, achieving embodied intelligence in autonomous driving requires integrating multidimensional sensory information to facilitate interaction with the environment. However, current multi-modal fusion sensing schemes often neglect these additional sensory inputs, hindering the realization of fully autonomous driving. This paper considers multi-sensory information and proposes a multi-modal interactive perception dataset named MIPD, which enables extending the current autonomous driving algorithm framework and supports research on embodied intelligent driving. In addition to conventional camera, lidar, and 4D radar data, our dataset incorporates further sensor inputs, including sound, light intensity, vibration intensity, and vehicle speed, to make the dataset more comprehensive. MIPD comprises 126 consecutive sequences, many exceeding twenty seconds, and features over 8,500 meticulously synchronized and annotated frames. It also encompasses many challenging scenarios, covering various road and lighting conditions. The dataset has undergone thorough experimental validation, producing valuable insights for the exploration of next-generation autonomous driving frameworks.

What problem does this paper attempt to address?

The problem this paper attempts to solve is the shortfall of current multi-modal fusion perception schemes in achieving fully autonomous driving. Specifically, existing multi-modal datasets often overlook additional sensory inputs such as sound, light intensity, vibration intensity, and vehicle speed, which limits how well autonomous driving systems can understand and adapt to complex environments. The paper therefore proposes a new multi-modal interactive perception dataset, MIPD, which aims to enhance the perception ability of autonomous driving systems by integrating data from multiple sensors.

### Main problems in the paper:

1. **Limitations of multi-modal datasets**: Existing datasets (such as KITTI, nuScenes, Waymo, and Argoverse) provide rich visual data but are deficient in multi-dimensional perception, especially when dealing with complex environmental changes such as lighting and road surface conditions.
2. **Incompleteness of environmental perception**: Existing autonomous driving systems rely mainly on traditional sensors such as cameras, lidars, and radars when perceiving the environment, and give little consideration to multi-sensory information such as sound, light intensity, and vibration.
3. **Adaptability to dynamic environments**: Autonomous driving systems need to make rapid and accurate decisions in dynamically changing environments, and existing datasets and algorithms perform poorly in this regard.

### Solutions:

1. **Construct a multi-modal interactive perception dataset**: The paper proposes a new multi-modal dataset, MIPD. It contains not only traditional camera, lidar, and 4D radar data, but also data from additional sensors: sound, light intensity, vibration intensity, and vehicle speed.
2. **Enrich the content of the dataset**: The dataset contains 126 consecutive sequences, many exceeding 20 seconds, totaling more than 8,500 carefully synchronized and annotated frames. It covers challenging scenarios under various road and lighting conditions.
3. **Experimental verification**: The effectiveness of the collected dataset is verified through experiments with multiple single-modal and multi-modal models.

### Main contributions:

1. **Multi-modal dataset**: A new multi-modal dataset is proposed that integrates camera, lidar point cloud, 4D radar, sound, vibration, light intensity, and vehicle speed data to enhance perception tasks.
2. **Dataset content**: The dataset contains 126 consecutive sequences, many exceeding 20 seconds, with a total of more than 8,500 synchronized and annotated frames, covering challenging scenarios under various road, weather, and lighting conditions.
3. **Experimental verification**: Experiments with multiple single-modal and multi-modal models verify the effectiveness of the dataset.

Through these contributions, the paper aims to advance autonomous driving technology, especially research on multi-modal interactive perception and environmental adaptability.
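
The summary above describes frames in which camera, lidar, and 4D radar data are synchronized with scalar streams such as light intensity and vehicle speed. As a rough illustration of what such synchronization can involve, the sketch below attaches to each frame the scalar reading nearest in time. This is not the MIPD toolchain or file format: the `ScalarReading` type, the stream names, the units, and the sample values are all hypothetical, and nearest-timestamp matching is only one common alignment strategy.

```python
from bisect import bisect_left
from dataclasses import dataclass


@dataclass
class ScalarReading:
    """One sample from a scalar sensor (hypothetical record layout)."""
    timestamp: float  # seconds since the start of the sequence
    value: float


def nearest_reading(readings, t):
    """Return the reading closest in time to t.

    Assumes `readings` is non-empty and sorted by timestamp.
    """
    times = [r.timestamp for r in readings]
    i = bisect_left(times, t)
    # Only the neighbours on either side of the insertion point can be nearest.
    candidates = readings[max(i - 1, 0): i + 1]
    return min(candidates, key=lambda r: abs(r.timestamp - t))


def build_frame(frame_time, streams):
    """Attach the nearest value of every scalar stream to one frame timestamp."""
    frame = {"timestamp": frame_time}
    for name, readings in streams.items():
        frame[name] = nearest_reading(readings, frame_time).value
    return frame


# Made-up sample data: frame timestamps at 10 Hz and two scalar streams.
frame_times = [0.0, 0.1, 0.2]
streams = {
    "light_intensity": [  # hypothetical unit: lux
        ScalarReading(0.02, 310.0),
        ScalarReading(0.11, 305.0),
        ScalarReading(0.19, 120.0),
    ],
    "vehicle_speed": [  # hypothetical unit: m/s
        ScalarReading(0.00, 8.2),
        ScalarReading(0.04, 8.4),
        ScalarReading(0.13, 8.9),
        ScalarReading(0.25, 9.1),
    ],
}

frames = [build_frame(t, streams) for t in frame_times]
for frame in frames:
    print(frame)
```

In practice one would typically also reject matches whose time gap exceeds a tolerance, so that a sensor dropout does not silently attach a stale reading to a frame.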

MIPD: A Multi-sensory Interactive Perception Dataset for Embodied Intelligent Driving

OpenMPD: An Open Multimodal Perception Dataset for Autonomous Driving

AIDE: A Vision-Driven Multi-View, Multi-Modal, Multi-Tasking Dataset for Assistive Driving Perception

aiMotive Dataset: A Multimodal Dataset for Robust Autonomous Driving with Long-Range Perception

The Multimodal Driver Monitoring Database: A Naturalistic Corpus to Study Driver Attention

AmodalSynthDrive: A Synthetic Amodal Perception Dataset for Autonomous Driving

A Driver Activity Dataset with Multiple RGB-D Cameras and Mmwave Radars

VTD: Visual and Tactile Database for Driver State and Behavior Perception

IPS300+: a Challenging Multimodal Dataset for Intersection Perception System

DMD: A Large-Scale Multi-Modal Driver Monitoring Dataset for Attention and Alertness Analysis

Zenseact Open Dataset: A large-scale and diverse multimodal dataset for autonomous driving

DOLPHINS: Dataset for Collaborative Perception enabled Harmonious and Interconnected Self-driving

End-to-End Multimodal Sensor Dataset Collection Framework for Autonomous Vehicles

MCMSys: Multimodal Data Closed-Loop Management System for Autonomous Driving

M2P2: A Multi-Modal Passive Perception Dataset for Off-Road Mobility in Extreme Low-Light Conditions

Dual Radar: A Multi-modal Dataset with Dual 4D Radar for Autonomous Driving

Ithaca365: Dataset and Driving Perception under Repeated and Challenging Weather Conditions

IAMCV Multi-Scenario Vehicle Interaction Dataset