Exploring the Feasibility of Automated Data Standardization using Large Language Models for Seamless Positioning

Max J. L. Lee, Ju Lin, Li-Ta Hsu
2024-08-22
Abstract: We propose a feasibility study for real-time automated data standardization leveraging Large Language Models (LLMs) to enhance seamless positioning systems in IoT environments. By integrating and standardizing heterogeneous sensor data from smartphones, IoT devices, and dedicated systems such as Ultra-Wideband (UWB), our study ensures data compatibility and improves positioning accuracy using the Extended Kalman Filter (EKF). The core components include the Intelligent Data Standardization Module (IDSM), which employs a fine-tuned LLM to convert varied sensor data into a standardized format, and the Transformation Rule Generation Module (TRGM), which automates the creation of transformation rules and scripts for ongoing data standardization. Evaluated in real-time environments, our study demonstrates adaptability and scalability, enhancing operational efficiency and accuracy in seamless navigation. This study underscores the potential of advanced LLMs in overcoming sensor data integration complexities, paving the way for more scalable and precise IoT navigation solutions.
Signal Processing, Artificial Intelligence, Networking and Internet Architecture
What problem does this paper attempt to address?
The problem this paper addresses is how to automate data standardization with large language models (LLMs) in Internet of Things (IoT) environments, in order to improve the accuracy and compatibility of seamless positioning systems. Specifically, the paper targets the following problems:

1. **Integration and Standardization of Heterogeneous Sensor Data**:
   - Different data sources (smartphones, IoT devices, UWB tags, etc.) produce data in different formats and units, which makes the data difficult to fuse and process directly.
   - Traditional methods rely on manual feature engineering and domain-specific knowledge, which limits the scalability and adaptability of the system.
2. **Improving the Accuracy of Seamless Positioning Systems**:
   - Each positioning technology (GNSS, UWB, inertial sensors, etc.) has its own strengths and weaknesses, and none meets high-precision positioning requirements on its own.
   - Fusing data from multiple sensors with the Extended Kalman Filter (EKF) improves the accuracy and robustness of the positioning system.
3. **Achieving Real-Time Automated Data Standardization**:
   - An LLM-based Intelligent Data Standardization Module (IDSM) is proposed, which automatically converts sensor data in different formats into a standardized format.
   - Conversion rules and scripts are generated automatically, which reduces manual intervention and improves data processing efficiency.
4. **Verifying the Feasibility and Effectiveness of the System in a Real Environment**:
   - Experiments in a dynamic environment evaluate the system in indoor-outdoor transition scenarios and verify its adaptability and scalability.

### Core Contributions of the Paper

- **Innovative Application of LLMs**: For the first time, LLMs are applied to the automated standardization of heterogeneous sensor data, expanding the application range of LLMs.
- **Enhanced Scalability and Adaptability**: Automated data standardization reduces the dependence on manual feature engineering and domain-specific knowledge, improving the scalability and adaptability of the system.
- **Improved Positioning Accuracy**: Fusing the standardized multi-source sensor data with the EKF algorithm significantly improves the accuracy of the seamless positioning system.

### Experimental Results

The experimental results show that fusing multiple positioning technologies (GNSS + VPS + UWB + IMU) achieves the best positioning performance, with an average error of only 0.33 meters, which supports the effectiveness of the method.

### Conclusion

This research shows the potential of using LLMs to automate data standardization in seamless positioning systems, especially for improving data compatibility and positioning accuracy. Although the results are promising, future research still needs to improve the adaptability and robustness of the system and explore additional emerging technologies.
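
To make the standardization step from point 3 more concrete, here is a minimal Python sketch of what an automatically generated transformation rule could look like once it exists. It is an illustration under stated assumptions, not the paper's IDSM or TRGM: the source field names (`acc_g`, `range_cm`, `ts_ms`), the target field names, and the `RULES` table are all hypothetical, and in the paper's setup an LLM would produce such rules rather than a hand-written dictionary.

```python
from typing import Callable, Dict, Tuple

G = 9.80665  # standard gravity in m/s^2

# Hypothetical rule table: source field -> (standard field, converter to SI units).
Rule = Tuple[str, Callable[[float], float]]
RULES: Dict[str, Rule] = {
    "acc_g": ("accel_mps2", lambda v: v * G),        # g -> m/s^2
    "range_cm": ("range_m", lambda v: v / 100.0),    # cm -> m
    "ts_ms": ("timestamp_s", lambda v: v / 1000.0),  # ms -> s
}


def standardize(record: Dict[str, float]) -> Dict[str, float]:
    """Apply the rule table to one raw record.

    Fields without a rule raise an error instead of passing through silently,
    so an unknown format is noticed before it reaches the fusion stage.
    """
    out: Dict[str, float] = {}
    for key, value in record.items():
        if key not in RULES:
            raise KeyError(f"no transformation rule for field {key!r}")
        name, convert = RULES[key]
        out[name] = convert(float(value))
    return out


# Hypothetical raw UWB reading in centimetres and milliseconds.
raw_uwb = {"range_cm": 250, "ts_ms": 1500}
print(standardize(raw_uwb))
```

Keeping the rules as data rather than as hard-coded branches means a newly generated rule can be added without touching `standardize` itself, which is one plausible way to read the scalability argument above.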
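
The fusion idea from point 2 can be illustrated with a single scalar Kalman measurement update. This is a deliberately simplified sketch, not the paper's EKF: it assumes a one-dimensional position that is measured directly (an identity measurement model, in which case the EKF update reduces to the linear Kalman update), and the numbers and the function name `kalman_update` are invented for illustration.

```python
def kalman_update(x, p, z, r):
    """One scalar Kalman measurement update with a direct measurement.

    x, p: prior estimate and its variance.
    z, r: measurement and its variance.
    Returns the posterior estimate and variance.
    """
    k = p / (p + r)          # Kalman gain: how much to trust the measurement
    x_new = x + k * (z - x)  # move the estimate toward the measurement
    p_new = (1.0 - k) * p    # posterior variance is smaller than the prior
    return x_new, p_new


# Hypothetical values: a coarse prior refined by a more precise range-based fix.
x, p = 10.0, 4.0  # prior position (m) and variance (m^2)
x, p = kalman_update(x, p, z=12.0, r=1.0)
print(x, p)
```

Because the measurement variance here is smaller than the prior variance, the gain is above 0.5 and the estimate moves most of the way toward the measurement. The same weighting is what lets a precise sensor dominate where it is available while a coarser one fills the gaps.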