Research on Indoor 3D Reconstruction Technology Based on Semantic Visual Simultaneous Localization and Mapping

Yu Liang, Cao Lijia, Fu Changyou
DOI: https://doi.org/10.31763/ijrcs.v4i1.1266
2024-02-12
International Journal of Robotics and Control Systems
Abstract: Traditional visual simultaneous localization and mapping (SLAM) systems assume a static environment and therefore struggle to achieve real-time indoor 3D reconstruction in complex dynamic scenes. To address this challenge, this paper proposes a real-time indoor 3D reconstruction algorithm based on semantic visual SLAM. Object detection supplies 2D semantic information that serves as a prior for geometric methods; fusing the two effectively suppresses dynamic features while reducing reliance on deep learning methods, which preserves the algorithm's real-time performance. Experimental results on dynamic scenes show that, compared to the ORB-SLAM2 system, our algorithm keeps real-time performance nearly unchanged while achieving average performance improvements of approximately 97.56% on the TUM RGB-D dataset and 97.31% on the Bonn dataset. Moreover, our algorithm can reconstruct global indoor Octo-map and semantic metric maps, which are more intuitive than sparse point cloud maps, effectively enhancing the scene perception capability of mobile robots and laying the foundation for performing advanced tasks. Furthermore, our algorithm demonstrates a 3.5 to 10.5 times improvement in real-time performance compared to other mainstream semantic SLAM systems. Experimental results on the NVIDIA Jetson AGX Xavier confirm that our algorithm can run in real time on low-power platforms such as mobile robots or drones. The drawbacks of our algorithm are lower reconstruction accuracy in low-texture and large-scale scenes and ineffective suppression of dynamic features in low-dynamic scenes. Future work will consider replacing and improving the deep learning methods and integrating an IMU and other sensors to enhance system usability.
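The fusion of detection priors with a geometric check described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `Box` type, the `DYNAMIC_LABELS` set, the `filter_static_keypoints` function, the residual threshold, and the assumption that a per-keypoint geometric residual (for example, an epipolar distance in pixels) has already been computed are all hypothetical choices made for illustration. The idea shown is that a keypoint is discarded only when it both lies inside a box of a potentially dynamic class and fails the geometric test, so a detected but motionless object can still contribute features.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class Box:
    """Axis-aligned detection box in pixel coordinates (hypothetical type)."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float
    label: str

    def contains(self, x: float, y: float) -> bool:
        return self.x_min <= x <= self.x_max and self.y_min <= y <= self.y_max


# Assumption: classes treated as potentially dynamic; the paper's list may differ.
DYNAMIC_LABELS = {"person"}


def filter_static_keypoints(
    keypoints: Sequence[Tuple[float, float]],
    residuals: Sequence[float],
    boxes: Sequence[Box],
    residual_threshold: float = 1.0,
) -> List[int]:
    """Return indices of keypoints kept for tracking and mapping.

    residuals[i] is an assumed, precomputed geometric error for keypoint i
    (e.g. distance to its epipolar line). A keypoint is dropped only if it is
    inside a potentially dynamic box AND its residual exceeds the threshold.
    """
    if len(keypoints) != len(residuals):
        raise ValueError("keypoints and residuals must have the same length")

    dynamic_boxes = [b for b in boxes if b.label in DYNAMIC_LABELS]
    kept = []
    for i, ((x, y), residual) in enumerate(zip(keypoints, residuals)):
        in_dynamic_box = any(b.contains(x, y) for b in dynamic_boxes)
        if in_dynamic_box and residual > residual_threshold:
            continue  # semantic prior and geometry agree: treat as dynamic
        kept.append(i)
    return kept


boxes = [Box(10, 10, 50, 50, "person"), Box(60, 60, 90, 90, "chair")]
keypoints = [(20, 20), (30, 30), (70, 70), (5, 5)]
residuals = [3.0, 0.2, 5.0, 0.1]
kept_indices = filter_static_keypoints(keypoints, residuals, boxes)
```

In this sketch the semantic prior only narrows where the geometric test is allowed to reject points; keypoints outside dynamic boxes are left to whatever outlier rejection the underlying SLAM pipeline already performs.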