Mobile Robot Oriented Large-Scale Indoor Dataset for Dynamic Scene Understanding

Yifan Tang, Cong Tai, Fangxing Chen, Wanting Zhang, Tao Zhang, Xueping Liu, Yongjin Liu, Long Zeng

2024-07-01

Abstract: Most existing robotic datasets capture static scene data and are thus limited in evaluating robots' dynamic performance. To address this, we present a mobile-robot-oriented large-scale indoor dataset, denoted the THUD (Tsinghua University Dynamic) robotic dataset, for training and evaluating dynamic scene understanding algorithms. Specifically, we first detail the construction of the THUD dataset, including its organization, acquisition, and annotation methods. It comprises both real-world and synthetic data, collected with a real robot platform and a physical simulation platform, respectively. The current dataset includes 13 large-scale dynamic scenarios, 90K image frames, 20M 2D/3D bounding boxes of static and dynamic objects, camera poses, and IMU data, and it is still being expanded. We then evaluate mainstream indoor scene understanding tasks, e.g., 3D object detection, semantic segmentation, and robot relocalization, on the THUD dataset. These experiments reveal serious challenges for some robot scene understanding tasks in dynamic scenes. By sharing this dataset, we aim to foster rapid development and iteration of new mobile robot algorithms for the environments in which robots actually work, i.e., complex, crowded dynamic scenes.

Robotics

What problem does this paper attempt to address?

The paper addresses the fact that most existing mobile robot datasets capture data from static scenes, which limits their usefulness for evaluating the dynamic performance of robots. To overcome this limitation, the authors have constructed a large-scale indoor dataset for mobile robots (THUD) to train and evaluate algorithms for understanding dynamic scenes. The main contributions of the paper are:

1. **Dataset Construction**: A detailed introduction to the organization, collection, and annotation methods of the THUD dataset. The dataset includes both real-world and synthetic data, collected with a real robot platform and a physical simulation platform, respectively. The current version includes 13 large-scale dynamic scenes, 90K image frames, 20M 2D/3D bounding boxes of static and dynamic objects, camera poses, and IMU data.
2. **Application Scenarios**: Evaluation of mainstream indoor scene understanding tasks (such as 3D object detection, semantic segmentation, and robot relocalization) on the THUD dataset. The experimental results reveal significant challenges for some robot scene understanding tasks in dynamic scenes.
3. **Dataset Expansion**: The dataset is continuously expanding to support more static and dynamic indoor mobile robot tasks, such as robot navigation in complex crowded dynamic scenes, target tracking, and trajectory prediction.

By sharing this dataset, the authors hope to promote and accelerate the development of new mobile robot algorithms suited to the dynamic environments in which robots actually operate.

THUD++: Large-Scale Dynamic Indoor Scene Dataset and Benchmark for Mobile Robots

TO-Scene: A Large-scale Dataset for Understanding 3D Tabletop Scenes

HabitatDyn Dataset: Dynamic Object Detection to Kinematics Estimation

CID-SIMS: Complex indoor dataset with semantic information and multi-sensor data from a ground wheeled robot viewpoint

HSC4D: Human-centered 4D Scene Capture in Large-scale Indoor-outdoor Space Using Wearable IMUs and LiDAR

A construction method of a large-scale physical rendering 3D semantic segmentation dataset

The Robotic Vision Scene Understanding Challenge

Occ3D: A Large-Scale 3D Occupancy Prediction Benchmark for Autonomous Driving

RoboSense: Large-scale Dataset and Benchmark for Egocentric Robot Perception and Navigation in Crowded and Unstructured Environments

DIDLM: A Comprehensive Multi-Sensor Dataset with Infrared Cameras, Depth Cameras, LiDAR, and 4D Millimeter-Wave Radar in Challenging Scenarios for 3D Mapping

Digital Twin Tracking Dataset (DTTD): A New RGB+Depth 3D Dataset for Longer-Range Object Tracking Applications

A Real 3D Embodied Dataset for Robotic Active Visual Learning

THÖR-MAGNI: A Large-scale Indoor Motion Capture Recording of Human Movement and Robot Interaction

KITTI-360: A Novel Dataset and Benchmarks for Urban Scene Understanding in 2D and 3D

Indoor Obstacle Discovery on Reflective Ground via Monocular Camera

OCTScenes: A Versatile Real-World Dataset of Tabletop Scenes for Object-Centric Learning

MCD: Diverse Large-Scale Multi-Campus Dataset for Robot Perception

What can I do around here? Deep functional scene understanding for cognitive robots

Human-centric Scene Understanding for 3D Large-scale Scenarios