Abstract: Interacting with mobile objects in unfamiliar environments requires simultaneous localization, mapping, and tracking of the 3D poses of multiple objects. This paper proposes GSLAMOT, a framework based on a Tracklet Graph and a Query Graph, to address this challenge. GSLAMOT takes camera and LiDAR multimodal information as input and divides the representation of the dynamic scene into a semantic map representing the static environment, a trajectory of the ego-agent, and an online-maintained Tracklet Graph (TG) for tracking and predicting the 3D poses of the detected mobile objects. A Query Graph (QG) is constructed in each frame from object detections to query and update the TG. For accurate object association, a Multi-criteria Star Graph Association (MSGA) method is proposed to find matched objects between the detections in the QG and the predicted tracklets in the TG. An Object-centric Graph Optimization (OGO) method is then proposed to simultaneously optimize the TG, the semantic map, and the agent trajectory. It triangulates the detected objects into the map to enrich the map's semantic information. To keep the system efficient, the three tightly coupled tasks are handled in parallel. Experiments are conducted on KITTI, Waymo, and an emulated Traffic Congestion dataset that highlights challenging scenarios. The results show that GSLAMOT tracks objects accurately in crowded scenes while also performing accurate SLAM in challenging scenarios, outperforming state-of-the-art methods. The code and dataset are available at https://gslamot.github.io.

What problem does this paper attempt to address?

This paper aims to solve the problem of simultaneously performing self-localization, mapping, and 3D pose tracking of multiple dynamic objects (SLAM and 3D MOT) in an unfamiliar environment. Specifically, it proposes a framework named GSLAMOT to deal with the following challenges:

1. **Concurrent execution of SLAM and 3D MOT**:
   - SLAM relies on object detection to eliminate the influence of dynamic objects on accurate tracking and static map construction.
   - 3D MOT depends on an accurate ego-vehicle pose to triangulate the 3D poses of moving objects.
2. **Concurrent movement of the ego-vehicle and dynamic objects**:
   - The simultaneous movement of the ego-vehicle and surrounding objects increases the difficulty of localization and tracking.
3. **Errors caused by object detection algorithms and sensor noise**:
   - Errors in object detection algorithms and sensor noise further degrade the localization accuracy of surrounding objects.
4. **Matching difficulties introduced by occlusion and high-speed movement**:
   - Occlusion and high-speed movement make object matching more complex, resulting in localization and tracking errors.

To solve these problems, GSLAMOT proposes the following innovations:

- **Tracklet Graph (TG) and Query Graph (QG)**: The TG represents and updates the trajectories of detected moving objects, and the QG represents the objects newly detected in each frame.
- **Multi-criteria Star Graph Association (MSGA)**: A new multi-criteria star graph association method that robustly matches the QG against the TG, addressing the matching challenges of dynamic, crowded, and noisy environments.
- **Object-centric Graph Optimization (OGO)**: An object-centric graph optimization method that simultaneously optimizes the TG, the semantic map, and the ego-vehicle trajectory.
- **Parallel thread implementation**: To achieve real-time performance, the system uses multi-threaded parallel computing, so that localization, 3D MOT, and semantic mapping run efficiently and concurrently.

Through these methods, GSLAMOT achieves accurate self-localization, mapping, and multi-target tracking in complex scenes with a large number of dynamic objects, and it outperforms existing OOSLAM systems.
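The predict-then-associate idea behind the TG and QG can be illustrated with a small sketch. It is not the paper's MSGA: the summary does not give the MSGA criteria or the star-graph construction, so this example substitutes a plain greedy matcher over a two-term cost (center distance plus relative size difference). The names `Box3D`, `Tracklet`, `cost`, `associate`, the constant-velocity prediction, the weights, and the `max_cost` gate are all assumptions made for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class Box3D:
    """Hypothetical 3D detection: center (x, y, z) in metres and size (l, w, h)."""
    center: tuple
    size: tuple


@dataclass
class Tracklet:
    """Hypothetical tracklet: last observed box plus a per-frame velocity."""
    track_id: int
    box: Box3D
    velocity: tuple = (0.0, 0.0, 0.0)

    def predict(self) -> Box3D:
        # Constant-velocity prediction for the next frame (an assumption).
        center = tuple(p + v for p, v in zip(self.box.center, self.velocity))
        return Box3D(center, self.box.size)


def cost(pred: Box3D, det: Box3D, w_dist: float = 1.0, w_size: float = 1.0) -> float:
    # Criterion 1: Euclidean distance between box centers.
    dist = math.dist(pred.center, det.center)
    # Criterion 2: relative size difference, summed over the three dimensions.
    size = sum(abs(a - b) / max(a, b) for a, b in zip(pred.size, det.size))
    return w_dist * dist + w_size * size


def associate(tracklets, detections, max_cost: float = 3.0):
    """Greedy one-to-one matching by ascending cost (a stand-in, not MSGA)."""
    pairs = sorted(
        (cost(t.predict(), d), ti, di)
        for ti, t in enumerate(tracklets)
        for di, d in enumerate(detections)
    )
    matches, used_t, used_d = {}, set(), set()
    for c, ti, di in pairs:
        if c > max_cost:
            break  # every remaining pair is at least as costly
        if ti in used_t or di in used_d:
            continue
        matches[tracklets[ti].track_id] = di
        used_t.add(ti)
        used_d.add(di)
    unmatched = [di for di in range(len(detections)) if di not in used_d]
    return matches, unmatched


# Two existing tracklets and three detections in the new frame.
tracks = [
    Tracklet(1, Box3D((0.0, 0.0, 0.0), (4.0, 2.0, 1.5)), (1.0, 0.0, 0.0)),
    Tracklet(2, Box3D((10.0, 5.0, 0.0), (4.0, 2.0, 1.5))),
]
dets = [
    Box3D((10.2, 5.1, 0.0), (4.0, 2.0, 1.5)),
    Box3D((1.1, 0.0, 0.0), (4.1, 2.0, 1.5)),
    Box3D((30.0, 0.0, 0.0), (4.0, 2.0, 1.5)),
]
matches, new_objects = associate(tracks, dets)
print(matches, new_objects)
```

Here `matches` maps each track id to the index of its matched detection, and `new_objects` lists detections that matched nothing and would start new tracklets. Greedy matching is easy to read but can be suboptimal in crowded scenes, which is the kind of case a more robust association method such as MSGA is meant to handle.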

GSLAMOT: A Tracklet and Query Graph-based Simultaneous Locating, Mapping, and Multiple Object Tracking System

Exploit the Connectivity: Multi-Object Tracking with TrackletNet

LIMOT: A Tightly-Coupled System for LiDAR-Inertial Odometry and Multi-Object Tracking

DMOT-SLAM: Visual SLAM in Dynamic Environments with Moving Object Tracking

LiDAR SLAMMOT based on Confidence-guided Data Association

Visual SLAMMOT Considering Multiple Motion Models

CAMO-MOT: Combined Appearance-Motion Optimization for 3D Multi-Object Tracking With Camera-LiDAR Fusion

OTE-SLAM: An Object Tracking Enhanced Visual SLAM System for Dynamic Environments

Semantic geometric fusion multi-object tracking and lidar odometry in dynamic environment

Multi-Granularity Language-Guided Multi-Object Tracking

DOT-SLAM: A Stereo Visual Simultaneous Localization and Mapping (SLAM) System with Dynamic Object Tracking Based on Graph Optimization

CTO-SLAM: Contour Tracking for Object-Level Robust 4D SLAM

A Visual SLAM With Tightly-Coupled Integration of Multi-Object Tracking for Production Workshop

DL-SLOT: Dynamic LiDAR SLAM and object tracking based on collaborative graph optimization

Towards Real-Time Multi-Object Tracking

OMS-SLAM: Dynamic Scene Visual SLAM Based on Object Detection with Multiple Geometric Feature Constraints and Statistical Threshold Segmentation

A semantic visual SLAM towards object selection and tracking optimization

Spatial-Semantic and Temporal Attention Mechanism-Based Online Multi-Object Tracking

TRLO: An Efficient LiDAR Odometry with 3D Dynamic Object Tracking and Removal

LaMOT: Language-Guided Multi-Object Tracking