Visual SLAM Method Based on Motion Segmentation

Qingzi Chen, Lingyun Zhu, Jirui Liu
DOI: https://doi.org/10.1109/iccea62105.2024.10604177
2024-01-01
Abstract: Traditional simultaneous localization and mapping (SLAM) algorithms are prone to interference from dynamic objects in real-world scenarios, which degrades their robustness and localization accuracy. This paper proposes a visual SLAM method based on motion segmentation, built on the ORB-SLAM3 framework. First, the motion segmentation method Rigidmask, which combines geometric constraints, optical flow estimation, and depth estimation, is introduced to detect potential dynamic objects and generate dynamic object mask images. Meanwhile, YOLO-World is employed for instance segmentation to obtain object mask images. The two types of mask images are then matched against each other to improve motion segmentation accuracy. Finally, the dynamic feature points are eliminated, and the remaining feature points are used for feature matching and pose estimation. Experimental results on the TUM dataset show that, compared with ORB-SLAM3, the absolute trajectory error (ATE) of this method is reduced by more than 89% on highly dynamic sequences. The method also achieves better localization accuracy than several current mainstream visual SLAM algorithms for dynamic scenes.
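The mask matching and feature elimination steps can be illustrated with a minimal sketch. The abstract does not state the matching criterion, so the overlap-ratio rule, the 0.5 threshold, and the function names `fuse_masks` and `filter_static_keypoints` are assumptions made for illustration, not the authors' implementation. The inputs stand in for a binary motion mask (as Rigidmask might produce) and per-object instance masks (as the YOLO-World stage might produce).

```python
import numpy as np


def fuse_masks(motion_mask, instance_masks, overlap_thresh=0.5):
    """Mark an instance as dynamic when enough of its area is covered by the motion mask.

    Assumed rule (not from the paper): overlap = |instance AND motion| / |instance|.
    Returns a boolean mask that is True on pixels of instances judged dynamic.
    """
    motion = np.asarray(motion_mask).astype(bool)
    dynamic = np.zeros(motion.shape, dtype=bool)
    for inst in instance_masks:
        inst = np.asarray(inst).astype(bool)
        area = inst.sum()
        if area == 0:
            continue  # skip empty instance masks
        overlap = np.logical_and(inst, motion).sum() / area
        if overlap >= overlap_thresh:
            dynamic |= inst  # take the full instance mask, not just the moving part
    return dynamic


def filter_static_keypoints(keypoints, dynamic_mask):
    """Keep keypoints, given as (x, y) pixel coordinates, that fall outside the dynamic mask."""
    kps = np.asarray(keypoints).reshape(-1, 2)
    h, w = dynamic_mask.shape
    xs = np.clip(np.round(kps[:, 0]).astype(int), 0, w - 1)
    ys = np.clip(np.round(kps[:, 1]).astype(int), 0, h - 1)
    keep = ~dynamic_mask[ys, xs]
    return kps[keep]


# Toy 6x6 frame: motion detected in the top-left 3x3 block.
motion = np.zeros((6, 6), dtype=bool)
motion[0:3, 0:3] = True

inst_a = np.zeros((6, 6), dtype=bool)
inst_a[0:3, 0:4] = True   # 9 of 12 pixels overlap the motion mask
inst_b = np.zeros((6, 6), dtype=bool)
inst_b[4:6, 4:6] = True   # no overlap with the motion mask

dynamic = fuse_masks(motion, [inst_a, inst_b])
keypoints = np.array([[1, 1], [3, 1], [5, 5], [0, 4]])  # (x, y)
static_kps = filter_static_keypoints(keypoints, dynamic)
```

In this sketch the whole instance mask is removed once it is judged dynamic, so a keypoint on a non-moving part of a moving object is also discarded. Only the surviving keypoints would be passed on to matching and pose estimation.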