Abstract: In this paper, a multi-modal data based semi-supervised learning (SSL) framework that jointly uses channel state information (CSI) data and RGB images for vehicle positioning is designed. In particular, an outdoor positioning system where the vehicle locations are determined by a base station (BS) is considered. The BS, equipped with several cameras, can collect a large amount of unlabeled CSI data, a small amount of labeled CSI data of vehicles, and the images taken by the cameras. Although the collected images contain partial information about the vehicles (i.e., their azimuth angles), the correspondence between each unlabeled CSI sample and its azimuth angle, and the distances between the BS and the vehicles captured in the images, are both unknown. Therefore, the images cannot be directly used as labels of the unlabeled CSI data to train a positioning model. To exploit the unlabeled CSI data and images, an SSL framework that consists of a pretraining stage and a downstream training stage is proposed. In the pretraining stage, the azimuth angles obtained from the images are used as the labels of the unlabeled CSI data to pretrain the positioning model. In the downstream training stage, a small labeled dataset in which the accurate vehicle positions serve as labels is used to retrain the model. Simulation results show that the proposed method can reduce the positioning error by up to 30% compared to a baseline where the model is not pretrained.

What problem does this paper attempt to address?

The problem this paper addresses is how to use multi-modal data, namely channel state information (CSI) data and RGB images, to improve the performance of vehicle positioning models in outdoor environments. Specifically, the paper proposes a semi-supervised learning (SSL) framework that reduces the dependence on large amounts of labeled data and improves vehicle positioning accuracy by combining unlabeled CSI data with image data.

### Problem Background

Vehicle positioning methods based on the global navigation satellite system (GNSS), such as GPS, degrade significantly in urban environments because of obstruction by buildings, pedestrians, and other vehicles. One way to improve positioning accuracy is to use radio-frequency (RF) signals instead. Compared with GNSS, RF signals offer lower latency and higher positioning accuracy, especially in non-line-of-sight (NLoS) scenarios, and they remain robust under harsh lighting and weather conditions. However, RF-based vehicle positioning still faces challenges, such as high-precision positioning of fast-moving targets, modeling of the 3D signal propagation environment, and integration with traditional GNSS positioning methods.

### Research Status

Many existing works have studied how to use RF data for indoor and outdoor positioning. However, most of these methods require a large amount of labeled data (i.e., RF data and their corresponding locations) to train deep-learning (DL) models, which may be difficult to obtain in practice. In addition, although some works have studied the joint use of images and wireless data for positioning, they usually assume that the user location can be estimated directly from RF data, without considering how to use multi-modal unlabeled data to generate labeled data for training DL models.

### Main Contributions of the Paper

The main contribution of this paper is a novel semi-supervised learning framework that jointly uses images and unlabeled CSI data to improve the performance of vehicle positioning models. Specifically:

1. **Semi-supervised Learning Framework**: An SSL framework consisting of a pre-training stage and a downstream training stage is proposed. In the pre-training stage, the vehicle azimuth angles obtained from the images are used as labels to pre-train part of the positioning model. In the downstream training stage, a small labeled dataset (in which the exact vehicle positions are the labels) is used to retrain the model.
2. **Azimuth Distribution Vector**: Because the correspondence between unlabeled CSI data and vehicle azimuth angles is unknown, a Gaussian filter is used to convert the azimuth angle of each vehicle into a vector representing its distribution in the angular domain. The NOR model is then used to combine all of these probability distribution vectors into one vector that represents the angular-domain distribution of all vehicles in each time slot.
3. **Pre-training Objective**: Given the images and CSI data, the pre-training objective is to predict the angular-domain probability distribution vector of all vehicles in each time slot. Minimizing the error between the predicted distribution vector and the one obtained from the images pre-trains the model, so that it can then learn to predict vehicle positions from CSI using only a small labeled dataset.

### Experimental Results

Simulation results show that, when little labeled data is available, the proposed method can reduce the positioning error by up to 30% compared with a baseline model that is not pre-trained. To the best of the authors' knowledge, this is the first work to study the joint use of CSI data and camera images for vehicle positioning.

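
The azimuth distribution vector and pre-training error described under the main contributions can be illustrated with a minimal Python sketch. It is not the authors' implementation. The sketch assumes that "NOR model" refers to a noisy-OR combination (each bin is occupied unless every vehicle misses it), and the number of angular bins, the 180-degree field of view, the Gaussian width `sigma_deg`, and the use of mean squared error are all illustrative choices that the summary does not specify. All function names are hypothetical.

```python
import math

def gaussian_angle_vector(azimuth_deg, num_bins=180, fov_deg=180.0, sigma_deg=3.0):
    """Spread one azimuth angle over angular bins with a Gaussian of peak value 1.

    num_bins, fov_deg and sigma_deg are assumed values, not taken from the paper.
    """
    bin_width = fov_deg / num_bins
    centers = [(i + 0.5) * bin_width for i in range(num_bins)]
    return [math.exp(-0.5 * ((c - azimuth_deg) / sigma_deg) ** 2) for c in centers]

def combine_noisy_or(vectors, num_bins=180):
    """Combine per-vehicle vectors bin by bin: p[i] = 1 - prod_k (1 - p_k[i]).

    Assumes the "NOR model" is a noisy-OR; returns all zeros if no vehicle is visible.
    """
    combined = []
    for i in range(num_bins):
        miss = 1.0  # probability that no vehicle occupies bin i
        for v in vectors:
            miss *= 1.0 - v[i]
        combined.append(1.0 - miss)
    return combined

def mse(pred, target):
    """Mean squared error, used here as a stand-in for the unspecified pre-training error."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target)

# Azimuth angles (degrees) of three vehicles seen by the cameras in one time slot.
azimuths = [40.0, 95.0, 100.0]
per_vehicle = [gaussian_angle_vector(a) for a in azimuths]
target = combine_noisy_or(per_vehicle)

# A model output would be compared against `target`; an all-zero guess is used here.
loss = mse([0.0] * len(target), target)
```

Because the combined vector describes all vehicles at once, it can serve as a pre-training label without knowing which CSI sample belongs to which vehicle, which is the issue the summary says this construction addresses.
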
### Conclusion

By combining multi-modal data with semi-supervised learning, this paper reduces the reliance of vehicle positioning on large amounts of labeled data and, in simulation, lowers the positioning error by up to 30% relative to a baseline that is not pre-trained. The approach suggests a direction for future vehicle positioning technologies.

Multi-modal Data based Semi-Supervised Learning for Vehicle Positioning
