Abstract: Geospatial Location Embedding (GLE) helps a Large Language Model (LLM) assimilate and analyze spatial data. GLE emergence in Geospatial Artificial Intelligence (GeoAI) is precipitated by the need for deeper geospatial awareness in our complex contemporary spaces and the success of LLMs in extracting deep meaning in Generative AI. We searched Google Scholar, Science Direct, and arXiv for papers on geospatial location embedding and LLM and reviewed articles focused on gaining deeper spatial "knowing" through LLMs. We screened 304 titles, 30 abstracts, and 18 full-text papers that reveal four GLE themes: Entity Location Embedding (ELE), Document Location Embedding (DLE), Sequence Location Embedding (SLE), and Token Location Embedding (TLE). Synthesis is tabular and narrative, including a dialogic conversation between "Space" and "LLM." Though GLEs aid spatial understanding by superimposing spatial data, they emphasize the need to advance in the intricacies of spatial modalities and generalized reasoning. GLEs signal the need for a Spatial Foundation/Language Model (SLM) that embeds spatial knowing within the model architecture. The SLM framework advances Spatial Artificial Intelligence Systems (SPAIS), establishing a Spatial Vector Space (SVS) that maps to physical space. The resulting spatially imbued Language Model is unique. It simultaneously represents actual space and an AI-capable space, paving the way for AI native geo storage, analysis, and multi-modality as the basis for Spatial Artificial Intelligence Systems (SPAIS).

What problem does this paper attempt to address?

### Problems the Paper Attempts to Solve

This paper aims to address the challenges faced by large language models (LLMs) in handling geospatial data, particularly how to effectively integrate geospatial location embeddings (GLE) into LLMs to enhance their understanding and analysis capabilities of geospatial information. Specifically, the paper explores the following key issues:

1. **Enhancement of Geospatial Knowledge**:
   - What improvements in geospatial understanding can be achieved using GLE methods?
   - How can GLE improve the performance of LLMs in the geospatial domain?
2. **Gap Between Geospatial and LLM**:
   - What gaps between geospatial and LLM are revealed by GLE methods?
   - How do these gaps affect the performance of LLMs in geospatial tasks?
3. **Building Spatial Language Models (SLM)**:
   - How to design a spatial language model (SLM) that natively supports geospatial intelligence?
   - How does SLM achieve intrinsic representation and reasoning of geospatial data?

### Background and Motivation

As the complexity and importance of geospatial data continue to increase, traditional geographic information system (GIS) methods are no longer sufficient to meet the needs of modern applications. Large language models (LLMs) have shown excellent performance in natural language processing and generation tasks, but they have significant limitations in handling geospatial data. These limitations include vocabulary gaps, pattern mismatches, and inconsistencies in mathematical calculations.

### Methods and Results

The paper systematically reviews the literature and analyzes four main GLE methods:

1. **Entity Location Embedding (ELE)**: Embedding geospatial entities (such as points of interest) into LLMs.
2. **Document Location Embedding (DLE)**: Embedding documents containing geospatial information into LLMs.
3. **Sequence Location Embedding (SLE)**: Embedding continuous geospatial features (such as city roads) into LLMs.
4. **Token Location Embedding (TLE)**: Embedding geospatial data (such as latitude and longitude coordinates) as tokens into LLMs.

Through a comprehensive analysis of these methods, the paper reveals the following points:

- **Successes**: GLE methods significantly improve the geospatial understanding capabilities of LLMs in specific domains, particularly in applications such as semantic queries, urban planning, and geocoding.
- **Challenges and Gaps**: Despite some progress, these methods still face challenges in patterns, modalities, and mathematical calculations, making it difficult to achieve cross-domain generality.
- **Future Directions**: To overcome these challenges, the paper proposes the concept of building spatial language models (SLM). SLM aligns geospatial patterns, modalities, and mathematical calculations with the vector space of LLMs to achieve intrinsic representation and reasoning of geospatial data.

### Conclusion

The paper concludes that building a spatial language model (SLM) that natively supports geospatial intelligence is an important direction for future research. SLM can not only improve the performance of LLMs in geospatial tasks but also lay the foundation for developing more advanced spatial artificial intelligence systems (SPAIS). By standardizing spatial vector spaces (SVS) and implementing ethical frameworks, global cooperation and responsible geospatial AI applications can be promoted.
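### Illustrative Sketch: Coordinates as Tokens

The summary above describes Token Location Embedding only in general terms (latitude and longitude coordinates embedded as tokens) and does not say how any reviewed system does it. The Python sketch below is therefore an illustration under assumptions, not the method of the paper or of any cited work. It shows two common ways a coordinate might be made consumable by a language model: discretizing it into grid-cell tokens, and encoding it as a fixed-length vector of sines and cosines at several scales. The function names `coordinate_tokens` and `sinusoidal_embedding`, the parameters `cell_deg` and `num_scales`, and the token format `<lat_N>` / `<lon_N>` are all hypothetical.

```python
import math


def coordinate_tokens(lat: float, lon: float, cell_deg: float = 0.5) -> list[str]:
    """Map a coordinate to two discrete grid-cell tokens.

    Hypothetical scheme: the globe is cut into cells of `cell_deg` degrees,
    and the row/column indices of the containing cell become vocabulary items.
    """
    if not (-90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0):
        raise ValueError("latitude/longitude out of range")
    row = math.floor((lat + 90.0) / cell_deg)   # shift so indices start at 0
    col = math.floor((lon + 180.0) / cell_deg)
    return [f"<lat_{row}>", f"<lon_{col}>"]


def sinusoidal_embedding(lat: float, lon: float, num_scales: int = 4) -> list[float]:
    """Encode a coordinate as a vector of sin/cos values at doubling frequencies.

    Assumed design: each scale contributes sin and cos of latitude and of
    longitude (in radians), giving a vector of length 4 * num_scales.
    """
    lat_r = math.radians(lat)
    lon_r = math.radians(lon)
    vec: list[float] = []
    for k in range(num_scales):
        freq = 2 ** k  # coarse scales first, finer scales later
        for angle in (lat_r, lon_r):
            vec.append(math.sin(freq * angle))
            vec.append(math.cos(freq * angle))
    return vec


if __name__ == "__main__":
    # Two nearby points fall in the same 0.5-degree cell and share tokens.
    print(coordinate_tokens(37.77, -122.42))
    print(coordinate_tokens(37.80, -122.40))
    print(len(sinusoidal_embedding(37.77, -122.42)))
```

The grid-cell variant trades precision for a small, fixed vocabulary, while the sinusoidal variant keeps continuous position information but yields a vector rather than a vocabulary token. Which trade-off suits a given model is outside what the summary states.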

A systematic review of geospatial location embedding approaches in large language models: A path to spatial AI systems

GPT, large language models (LLMs) and generative artificial intelligence (GAI) models in geospatial science: a systematic review

Are Large Language Models Geospatially Knowledgeable?

Evaluating the Effectiveness of Large Language Models in Representing Textual Descriptions of Geometry and Spatial Relations

GeoLLM: Extracting Geospatial Knowledge from Large Language Models

A Review of Location Encoding for GeoAI: Methods and Applications

When LLMs step into the 3D World: A Survey and Meta-Analysis of 3D Tasks via Multi-modal Large Language Models

Geolocation Representation from Large Language Models are Generic Enhancers for Spatio-Temporal Learning

Can Large Language Models Create New Knowledge for Spatial Reasoning Tasks?

Enabling Geospatial Analysis for Public through Natural Language, with Large Language Models

Evaluating the Effectiveness of Large Language Models in Representing and Understanding Movement Trajectories

Is ChatGPT a Good Geospatial Data Analyst? Exploring the Integration of Natural Language into Structured Query Language within a Spatial Database

Can Large Language Models Generate Geospatial Code?

Editorial 2024: Large language models, artificial intelligence and geomorphology

Large Language Models are Geographically Biased

Core Building Blocks: Next Gen Geo Spatial GPT Application

Autonomous GIS: the next-generation AI-powered GIS

Inherent limitations of LLMs regarding spatial information

Large Language Models and Video Games: A Preliminary Scoping Review

The global landscape of academic guidelines for generative AI and Large Language Models

When Geoscience Meets Generative AI and Large Language Models: Foundations, Trends, and Future Challenges