Abstract: In this paper, a multi-modal data based semi-supervised learning (SSL) framework that jointly uses channel state information (CSI) data and RGB images for vehicle positioning is designed. In particular, an outdoor positioning system where the vehicle locations are determined by a base station (BS) is considered. The BS, equipped with several cameras, can collect a large amount of unlabeled CSI data, a small amount of labeled CSI data of vehicles, and the images taken by the cameras. Although the collected images contain partial information about the vehicles (i.e., their azimuth angles), the relationship between the unlabeled CSI data and its azimuth angle, and the distances between the BS and the vehicles captured in the images, are both unknown. Therefore, the images cannot be directly used as the labels of unlabeled CSI data to train a positioning model. To exploit unlabeled CSI data and images, an SSL framework that consists of a pretraining stage and a downstream training stage is proposed. In the pretraining stage, the azimuth angles obtained from the images are considered as the labels of unlabeled CSI data to pretrain the positioning model. In the downstream training stage, a small labeled dataset in which the accurate vehicle positions are considered as labels is used to retrain the model. Simulation results show that the proposed method can reduce the positioning error by up to 30% compared to a baseline where the model is not pretrained.
What problem does this paper attempt to address?
The paper addresses how to use multi-modal data, namely channel state information (CSI) data and RGB images, to improve vehicle positioning models in outdoor environments. Specifically, it proposes a semi-supervised learning (SSL) framework that combines unlabeled CSI data with image data, aiming to reduce the dependence on large amounts of labeled data while improving positioning accuracy.
### Problem Background
Vehicle positioning methods based on global navigation satellite systems (GNSS), such as GPS, degrade significantly in urban environments because of obstruction by buildings, pedestrians, and other vehicles. To improve accuracy, radio-frequency (RF) signals can be used for vehicle positioning instead. Compared with GNSS, RF signals offer lower latency and higher positioning accuracy, especially in non-line-of-sight (NLoS) scenarios. RF signals also remain robust under harsh lighting and weather conditions. However, RF-based vehicle positioning still faces challenges, such as high-precision positioning of fast-moving targets, modeling of the 3D signal propagation environment, and integration with traditional GNSS positioning methods.
### Research Status
Many existing works have studied how to use RF data for indoor and outdoor positioning. However, most of these methods require a large amount of labeled data (i.e., RF data and the corresponding locations) to train deep learning (DL) models, and such data may be difficult to collect in practice. In addition, although some works have studied the joint use of images and wireless data for positioning, they usually assume that the user location can be estimated directly from RF data, and they do not consider how to use multi-modal unlabeled data to generate labels for training DL models.
### Main Contributions of the Paper
The main contribution of this paper is a novel semi-supervised learning framework that jointly uses images and unlabeled CSI data to improve the performance of vehicle positioning models. Specifically:
1. **Semi-supervised Learning Framework**: An SSL framework consisting of a pre-training stage and a downstream training stage is proposed. In the pre-training stage, the vehicle azimuth angles obtained from the images are used as labels to pre-train part of the positioning model; in the downstream training stage, a small labeled dataset (in which the exact vehicle locations serve as labels) is used to retrain the model.
2. **Azimuth Distribution Vector**: Because the correspondence between unlabeled CSI data and vehicle azimuth angles is unknown, the paper proposes using a Gaussian filter to convert the azimuth angle of each vehicle into a vector representing its distribution in the angular domain. The NOR model is then used to combine the per-vehicle distribution vectors into a single vector that represents the angular-domain distribution of all vehicles in each time slot.
3. **Pre-training Objective**: Given the images and CSI data, the pre-training objective is to predict, for each time slot, the angular-domain distribution vector of all vehicles. The model is trained by minimizing the error between the predicted distribution vector and the distribution vector obtained from the images. After this pre-training, the model can learn to predict vehicle locations from CSI using only a small labeled dataset.
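
The label construction in items 2 and 3 can be illustrated with a minimal sketch. It is not the authors' implementation: the angular grid (`num_bins`, a range of -90 to 90 degrees), the kernel width `sigma_deg`, the reading of the "NOR model" as a noisy-OR combination, and the use of mean squared error as the pre-training error are all assumptions made here for illustration, and every function name is hypothetical.

```python
import numpy as np


def azimuth_to_vector(azimuth_deg, num_bins=180, sigma_deg=3.0):
    """Spread one azimuth angle over an angular grid with a Gaussian kernel.

    Assumed grid: num_bins bin centers covering [-90, 90) degrees.
    """
    centers = np.linspace(-90.0, 90.0, num_bins, endpoint=False)
    # Unnormalized Gaussian: the value is 1 at a bin center that equals the azimuth.
    return np.exp(-0.5 * ((centers - azimuth_deg) / sigma_deg) ** 2)


def combine_noisy_or(vectors):
    """Merge per-vehicle vectors into one vector per time slot.

    Assumption: the "NOR model" is read as a noisy-OR, 1 - prod(1 - p_i),
    so a bin is "occupied" if at least one vehicle is likely to be there.
    """
    vectors = np.asarray(vectors, dtype=float)
    return 1.0 - np.prod(1.0 - vectors, axis=0)


def pretraining_loss(predicted, target):
    """Assumed pre-training error: mean squared error between the two vectors."""
    predicted = np.asarray(predicted, dtype=float)
    target = np.asarray(target, dtype=float)
    return float(np.mean((predicted - target) ** 2))


# Two vehicles seen by the cameras in one time slot (illustrative angles).
per_vehicle = [azimuth_to_vector(-30.0), azimuth_to_vector(45.0)]
target = combine_noisy_or(per_vehicle)

# A stand-in for the model output predicted from CSI (all zeros here).
predicted = np.zeros_like(target)
print(target.shape, pretraining_loss(predicted, target))
```

The combined vector does not say which CSI sample belongs to which vehicle, which is the point of the construction: the image only has to supply the set of azimuth angles present in a time slot, not a per-sample label.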
### Experimental Results
Simulation results show that, when little labeled data is available, the proposed method can reduce the positioning error by up to 30% compared with a baseline model that is not pre-trained. To the authors' knowledge, this is the first work to study the joint use of CSI data and camera images for vehicle positioning.
### Conclusion
By combining multi-modal data with semi-supervised learning, the paper reduces the reliance of vehicle positioning on large amounts of labeled data and, in simulation, lowers the positioning error by up to 30% relative to a model without pre-training. The approach suggests a direction for future vehicle positioning techniques that exploit unlabeled CSI data together with camera images.