Vision-Aided Beam Tracking: Explore the Proper Use of Camera Images with Deep Learning

Yu Tian, Chenwei Wang
DOI: https://doi.org/10.48550/arXiv.2109.14686
2021-09-30
Abstract: We investigate the problem of wireless beam tracking on mmWave bands with the assistance of camera images. In particular, based on the user's beam indices used and camera images taken in the trajectory, we predict the optimal beam indices in the next few time spots. To resolve this problem, we first reformulate the "ViWi" dataset in [1] to get rid of the image repetition problem. Then we develop a deep learning approach and investigate various model components to achieve the best performance. Finally, we explore whether, when, and how to use the image for better beam prediction. To answer this question, we split the dataset into three clusters -- (LOS, light NLOS, serious NLOS)-like -- based on the standard deviation of the beam sequence. With experiments we demonstrate that using the image indeed helps beam tracking, especially when the user is in serious NLOS, and that the solution relies on a carefully designed dataset for training a model. Generally speaking, including NLOS-like data for training a model does not benefit beam tracking of the user in LOS, but including light NLOS-like data for training a model benefits beam tracking of the user in serious NLOS.
Machine Learning, Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
This paper addresses how camera images can improve the accuracy of beam prediction in millimeter-wave (mmWave) wireless beam tracking. Specifically, the authors predict the optimal beam indices at the next few time steps from the beam indices the user has already used and the camera images taken along the trajectory. To do so, they rebuild the existing "ViWi" dataset to eliminate image duplication, and they develop a deep learning method to explore how image information is best used for beam prediction under different conditions.

### Main Research Questions

1. **How to use camera images to assist beam tracking in the mmWave band**: The authors examine whether images improve the accuracy of beam prediction, especially in non-line-of-sight (NLOS) environments.
2. **How to design an appropriate dataset**: The authors rebuild the training and validation datasets so that no image appears in both, which prevents a model from scoring well simply because it has memorized the training images.
3. **When and how to use image information**: The authors divide the dataset into three clusters (LOS-like, light NLOS-like, serious NLOS-like) based on the standard deviation of the beam sequence, and compare the effect of using images in each scenario.

### Method Overview

- **Dataset Improvement**: In the original "ViWi" dataset, the training set and the validation set share almost all images, which can hide overfitting. The authors rebuild the dataset so that the images in the training set and the validation set are completely independent.
- **Deep Learning Model**: The authors propose a deep learning model with four components: beam embedding, object recognition for feature extraction, feature embedding for further dimension reduction, and a sequence model for prediction. The components investigated include CNN, AE (auto-encoder), PCA, etc.
- **Experiment and Evaluation**: The proposed scheme is compared against multiple baseline models, and each method is evaluated in the different scenarios.

### Main Conclusions

- **The Importance of Images in Serious NLOS Environments**: Experiments show that using images does help beam tracking, most of all when the user is in serious NLOS.
- **The Importance of Dataset Design**: The solution relies on a carefully designed training dataset. Including NLOS-like data in training does not benefit beam tracking for a user in LOS, but including light NLOS-like data benefits beam tracking for a user in serious NLOS.
- **The Influence of Model Complexity**: As the model becomes more complex, the marginal benefit brought by images gradually decreases, so model complexity has to be weighed against the performance gain.

In conclusion, by rebuilding the dataset and developing a deep learning model, this paper explores how to use camera images effectively for mmWave beam tracking, especially in non-line-of-sight environments.
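
The two dataset steps described above, clustering beam sequences by their standard deviation and keeping training and validation images disjoint, can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' code: the threshold values, the `beams`/`images` sample layout, and both function names are hypothetical, and the actual cut-offs are not given in this summary.

```python
import statistics
from typing import Dict, List, Sequence, Tuple

# Hypothetical thresholds: the real cut-off values are not stated in the text.
LOS_MAX_STD = 1.0
LIGHT_NLOS_MAX_STD = 5.0

Sample = Dict[str, list]  # assumed layout: {"beams": [int, ...], "images": [str, ...]}


def cluster_by_beam_std(beams: Sequence[int]) -> str:
    """Label a beam-index sequence by its population standard deviation."""
    std = statistics.pstdev(beams)
    if std <= LOS_MAX_STD:
        return "LOS-like"
    if std <= LIGHT_NLOS_MAX_STD:
        return "light NLOS-like"
    return "serious NLOS-like"


def split_without_shared_images(
    samples: List[Sample], val_fraction: float = 0.25
) -> Tuple[List[Sample], List[Sample]]:
    """Split samples so that no image ID appears in both training and validation."""
    image_ids = sorted({img for s in samples for img in s["images"]})
    n_val = int(len(image_ids) * val_fraction)
    # Reserve the last n_val image IDs (in sorted order) for validation.
    val_ids = set(image_ids[len(image_ids) - n_val:])

    train, val = [], []
    for s in samples:
        imgs = set(s["images"])
        if imgs.isdisjoint(val_ids):
            train.append(s)   # uses only training images
        elif imgs <= val_ids:
            val.append(s)     # uses only validation images
        # Samples that mix both groups are dropped to keep the sets independent.
    return train, val


samples = [
    {"beams": [10, 10, 10, 10], "images": ["a", "b"]},
    {"beams": [10, 11, 12, 13], "images": ["b", "c"]},
    {"beams": [3, 40, 90, 7], "images": ["d"]},
    {"beams": [5, 6, 7, 8], "images": ["c", "d"]},
]
labels = [cluster_by_beam_std(s["beams"]) for s in samples]
train_set, val_set = split_without_shared_images(samples)
```

Dropping samples whose images straddle both groups costs some data, but it is the simplest way to guarantee that a validation score cannot come from memorized training images. A real pipeline might instead split by trajectory or scene so that fewer samples are lost.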