Abstract: Sixth-generation (6G) wireless systems, when ultimately deployed, will comprise intelligent wireless networks that provide high-accuracy localization services together with ubiquitous communication. The impetus for this change is a new set of features and functionalities that allow localization and communication to coexist while sharing resources. We concentrate on converged 6G communication, localization, and sensing systems, and identify the critical technological enablers that open up new possibilities for combined localization and sensing applications. In terms of potential enabling technologies, 6G will advance toward even higher frequency ranges, broader bandwidths, and massive antenna arrays. Owing to the drawbacks of LiDAR, including its high price, short lifespan, and large volume, inexpensive and lightweight visual sensors are attracting increasing interest and becoming a research hotspot. With the rapid advances in deep learning (DL) and hardware computing capacity, new approaches and concepts for solving visual simultaneous localization and mapping (VSLAM) problems have emerged. We focus on visual odometry (VO) as an application of integrating DL with VSLAM. Most VO algorithms used today are built on conventional pipelines of separate modules, such as feature extraction, feature matching, motion estimation, and local optimization. This paper presents a novel end-to-end architecture for monocular VO based on the Convolution LSTM. Because it is trained and deployed end-to-end, it does not adopt any module of the traditional VO pipeline and instead infers poses directly from a sequence of raw RGB images (video). It uses a CNN to automatically learn an effective feature representation for the VO problem, while the Convolution LSTM implicitly models sequential dynamics and relations. Comprehensive tests on the KITTI VO dataset demonstrate competitive performance compared with state-of-the-art techniques. This confirms that the end-to-end DL approach can be a viable complement to conventional VO systems.
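
To make the recurrent part of the abstract concrete, the following is a minimal sketch of a generic convolutional LSTM cell run over a short sequence of feature maps, assuming NumPy is available. It is not the paper's network: the class and function names (`ConvLSTMCell`, `estimate_poses`), the 3x3 kernel, the gate ordering, the random untrained weights, and the linear 6-DoF pose head are all assumptions made for illustration, and the input arrays stand in for the output of a CNN feature extractor.

```python
import numpy as np


def conv2d_same(x, w):
    """Cross-correlate x (C_in, H, W) with w (C_out, C_in, k, k), zero-padded to keep H x W."""
    c_out, c_in, k, _ = w.shape
    pad = k // 2
    xp = np.pad(x, ((0, 0), (pad, pad), (pad, pad)))
    _, h, wd = x.shape
    out = np.zeros((c_out, h, wd))
    for i in range(k):
        for j in range(k):
            patch = xp[:, i:i + h, j:j + wd]  # shifted view, shape (C_in, H, W)
            out += np.einsum("oc,chw->ohw", w[:, :, i, j], patch)
    return out


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


class ConvLSTMCell:
    """Hypothetical ConvLSTM cell: LSTM gates computed by convolutions, so the
    hidden state h and cell state c keep their spatial layout (hid_ch, H, W)."""

    def __init__(self, in_ch, hid_ch, k=3, seed=0):
        rng = np.random.default_rng(seed)
        # One kernel bank yields all four gates from the stacked [x, h] input.
        self.w = rng.normal(0.0, 0.1, size=(4 * hid_ch, in_ch + hid_ch, k, k))
        self.b = np.zeros((4 * hid_ch, 1, 1))
        # Assumed pose head: global average pooling followed by a linear map to 6-DoF.
        self.w_pose = rng.normal(0.0, 0.1, size=(6, hid_ch))
        self.hid_ch = hid_ch

    def step(self, x, h, c):
        z = conv2d_same(np.concatenate([x, h], axis=0), self.w) + self.b
        i, f, o, g = np.split(z, 4, axis=0)  # input, forget, output, candidate
        c_next = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
        h_next = sigmoid(o) * np.tanh(c_next)
        return h_next, c_next

    def pose(self, h):
        return self.w_pose @ h.mean(axis=(1, 2))  # 6-vector: translation + rotation


def estimate_poses(cell, feature_maps):
    """Run the cell over a sequence and emit one 6-DoF vector per time step."""
    _, height, width = feature_maps[0].shape
    h = np.zeros((cell.hid_ch, height, width))
    c = np.zeros_like(h)
    poses = []
    for x in feature_maps:
        h, c = cell.step(x, h, c)  # state carries information across frames
        poses.append(cell.pose(h))
    return poses


# Usage: five random 3-channel 8x8 "feature maps" standing in for CNN outputs.
rng = np.random.default_rng(1)
sequence = [rng.normal(size=(3, 8, 8)) for _ in range(5)]
cell = ConvLSTMCell(in_ch=3, hid_ch=4)
poses = estimate_poses(cell, sequence)
```

Because the hidden and cell states are carried from one frame to the next, each pose estimate depends on the whole preceding sequence rather than on a single image, which is the sense in which such a cell models sequential dynamics implicitly. In a trained system the weights would be learned from pose supervision rather than drawn at random.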

Global-Context-Aware Visual Odometry System With Epipolar-Geometry-Constrained Loss Function

Self-supervised Visual-LiDAR Odometry with Flip Consistency

Design of an Enhanced Visual Odometry by Building and Matching Compressive Panoramic Landmarks Online

CodeVIO: Visual-Inertial Odometry with Learned Optimizable Dense Depth

PALVO: Visual Odometry Based on Panoramic Annular Lens

Spatio-temporal and geometry constrained network for automobile visual odometry

Beyond Learning: Back to Geometric Essence of Visual Odometry via Fusion-Based Paradigm

Learning Generalized Visual Odometry Using Position-Aware Optical Flow and Geometric Bundle Adjustment

Salient Sparse Visual Odometry With Pose-Only Supervision

PVO: Panoptic Visual Odometry

Deep Visual Odometry with Adaptive Memory

MCVO: A Generic Visual Odometry for Arbitrarily Arranged Multi-Cameras

DF-VO: What Should Be Learnt for Visual Odometry?

GSL-VO: A Geometric-Semantic Information Enhanced Lightweight Visual Odometry in Dynamic Environments

Learning‐based monocular visual‐inertial odometry with SE2(3)‐EKF

OCC-VO: Dense Mapping via 3D Occupancy-Based Visual Odometry for Autonomous Driving

XVO: Generalized Visual Odometry via Cross-Modal Self-Training

CVLNet: Cross-View Semantic Correspondence Learning for Video-based Camera Localization

Self-Supervised Deep Visual Odometry Based on Geometric Attention Model

SLAM Visual Localization and Location Recognition Technology Based on 6G Network

Stereo Visual Odometry Pose Correction through Unsupervised Deep Learning