FLsM: Fuzzy Localization of Image Scenes Based on Large Models

Weiyi Chen, Lingjuan Miao, Jinchao Gui, Yuhao Wang, Yiran Li
DOI: https://doi.org/10.3390/electronics13112106
IF: 2.9
2024-05-29
Electronics
Abstract: This article primarily focuses on the study of image-based localization technology. While traditional methods have made significant advancements in technology and applications, the emerging field of visual image-based localization technology demonstrates tremendous potential for research. Deep learning has exhibited strong performance in image processing, particularly in developing visual navigation and localization techniques using large-scale visual models. This paper introduces a scene image localization technique based on large models in a vast spatial sample environment. The study involved training convolutional neural networks using millions of geographically labeled images, extracting image position information using large model algorithms, and collecting sample data under various conditions in elastic scene space. Through visual computation, the shooting position of photos was inferred to obtain the approximate position information of users. This method utilizes geographic location information to classify images and combines it with landmarks, natural features, and architectural styles to determine their locations. The experimental results show variations in positioning accuracy among different models, with the best-performing model obtained through training on a large-scale dataset. They also indicate that the positioning error in urban street-based images is relatively small, whereas the positioning effect in outdoor and local scenes, especially in large-scale spatial environments, is limited. This suggests that the location information of users can be effectively determined by using geographic data to classify images and incorporating landmarks, natural features, and architectural styles. The study's experiments indicate that positioning accuracy varies among different models, highlighting the significance of training on a large-scale dataset for optimal results. Furthermore, they highlight the contrast between urban street-based images and outdoor and local scenes in large-scale spatial environments.
engineering, electrical & electronic; physics, applied; computer science, information systems
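The abstract describes localization as classification: images are grouped by geographic location, and a network predicts which group a photo belongs to. A minimal sketch of how such class labels could be built is shown here, assuming a regular latitude/longitude grid whose cell size can be varied to mimic coarse and fine scales. The grid scheme, the function names `cell_id`, `cell_center`, and `multiscale_labels`, and the cell sizes are illustrative assumptions, not the partitioning used in the paper.

```python
import math


def cell_id(lat, lon, cell_deg):
    """Map a (lat, lon) in degrees to an integer label on a regular grid.

    Assumption: cell_deg divides 180 evenly (for example 30, 10, or 1).
    """
    if not (-90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0):
        raise ValueError("latitude or longitude out of range")
    rows = math.ceil(180 / cell_deg)
    cols = math.ceil(360 / cell_deg)
    # Clamp so that lat == 90 or lon == 180 falls into the last cell.
    row = min(int((lat + 90.0) / cell_deg), rows - 1)
    col = min(int((lon + 180.0) / cell_deg), cols - 1)
    return row * cols + col


def cell_center(label, cell_deg):
    """Return the (lat, lon) center of a grid cell, used as the position estimate."""
    cols = math.ceil(360 / cell_deg)
    row, col = divmod(label, cols)
    return (-90.0 + (row + 0.5) * cell_deg, -180.0 + (col + 0.5) * cell_deg)


def multiscale_labels(lat, lon, scales_deg=(30, 10, 1)):
    """Label one geotagged image at several grid resolutions (coarse to fine)."""
    return {s: cell_id(lat, lon, s) for s in scales_deg}


# Hypothetical geotag near central Paris, used only as sample input.
labels = multiscale_labels(48.8584, 2.2945)
coarse_estimate = cell_center(labels[10], 10)
```

A classifier trained on such labels can only return a cell, so the estimate is a region rather than a point. Under this assumed scheme, that is the sense in which the localization is "fuzzy", and smaller cells trade more classes for a tighter region.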
What problem does this paper attempt to address?
The paper studies image localization technology and proposes a fuzzy localization method for image scenes based on large models (FLsM). The method aims to provide visual localization when satellite navigation signals are absent or cannot be used. Specifically, the study attempts to solve the following key issues:

1. **Achieving Fuzzy Image Localization**: The paper proposes a technique, known as visual fuzzy localization, that estimates a geographic range from semantic information extracted from the visual image itself. This method can provide localization information when satellite navigation is unavailable.
2. **Constructing an Elastic Scale Space Environment**: To meet the localization needs of scenarios at different scales, the paper introduces the concept of an "elastic scale space." This space covers variations from large-scale to fine-grained scales, emphasizes the variability and uncertainty of the environment, and allows localization through the information contained in the image itself.
3. **Utilizing Large-Scale Datasets to Train Models**: The paper uses a large number of geo-tagged images to train convolutional neural networks and employs large model algorithms to extract image location information. This helps improve the efficiency and accuracy of image matching, especially in fuzzy localization tasks.
4. **Improving Localization Accuracy**: By comparing different models, the study finds that the scale of the training dataset is crucial for optimizing localization results. The paper also discusses how localization performance differs across scene types (such as urban streets, outdoor scenes, and local scenes).

In summary, the main goal of this research is to explore a lightweight visual localization method for environments where satellite navigation systems are limited or fail, to meet the needs of various scenarios and applications. By combining the advantages of deep learning technology and geographic information systems, the proposed solution is intended to achieve reliable image localization in complex and variable environments.
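The comparison of models and scene types described above depends on measuring positioning error. One common way to do this, sketched here, is to take the great-circle (haversine) distance between the predicted and true coordinates and report the fraction of images within a set of distance thresholds. The threshold values, the function names, and the sample coordinates are assumptions for illustration. The summary does not state which metric or thresholds the paper uses.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def accuracy_within(errors_km, thresholds_km=(1, 25, 200, 750, 2500)):
    """Fraction of localization errors at or below each threshold (assumed values)."""
    if not errors_km:
        raise ValueError("errors_km must not be empty")
    n = len(errors_km)
    return {t: sum(e <= t for e in errors_km) / n for t in thresholds_km}


# Hypothetical (predicted, true) coordinate pairs, not data from the paper.
pairs = [
    ((48.86, 2.35), (48.8584, 2.2945)),
    ((40.0, -74.0), (40.7128, -74.0060)),
    ((35.0, 135.0), (35.6762, 139.6503)),
]
errors = [haversine_km(*pred, *true) for pred, true in pairs]
report = accuracy_within(errors)
```

Reporting accuracy at several thresholds rather than a single mean error keeps a few very large misses from hiding how often a model is approximately right, which suits a method that outputs a region rather than an exact point.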