GeoReasoner: Reasoning On Geospatially Grounded Context For Natural Language Understanding

Yibo Yan, Joey Lee
2024-08-21
Abstract: In human reading and communication, individuals tend to engage in geospatial reasoning, which involves recognizing geographic entities and making informed inferences about their interrelationships. To mimic this cognitive process, current methods either use conventional natural language understanding toolkits or directly apply models pretrained on geo-related natural language corpora. However, these methods face two significant challenges: i) they do not generalize well to unseen geospatial scenarios, and ii) they overlook the importance of integrating geospatial context from geographical databases with linguistic information from the Internet. To handle these challenges, we propose GeoReasoner, a language model capable of reasoning on geospatially grounded natural language. Specifically, it first leverages Large Language Models (LLMs) to generate a comprehensive location description based on linguistic and geospatial information. It also encodes direction and distance information into spatial embeddings by treating them as pseudo-sentences. Consequently, the model is trained on both anchor-level and neighbor-level inputs to learn geo-entity representations. Extensive experimental results demonstrate GeoReasoner's superiority over state-of-the-art baselines in three tasks: toponym recognition, toponym linking, and geo-entity typing.
Computation and Language, Machine Learning
What problem does this paper attempt to address?
The paper addresses two main challenges in geospatial natural language understanding:

1. **Lack of generalization ability**: Existing methods do not generalize well to unseen geospatial scenarios.
2. **Insufficient information fusion**: Existing methods overlook the integration of geospatial context from geographic databases with linguistic information from the Internet.

To address these issues, the paper proposes GeoReasoner, a language model capable of reasoning on geospatially grounded natural language. GeoReasoner first uses large language models to generate a comprehensive location description from linguistic and geospatial information. It then encodes direction and distance information into spatial embeddings by treating them as pseudo-sentences, and it is trained on both anchor-level and neighbor-level inputs to learn geo-entity representations. Experimental results show that GeoReasoner outperforms state-of-the-art baselines on three tasks: toponym recognition, toponym linking, and geo-entity typing.
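
The idea of turning direction and distance into a pseudo-sentence can be illustrated with a minimal Python sketch. This is not the paper's implementation: the sentence template, the eight-way compass binning, the kilometre unit, and all function names (`haversine_km`, `initial_bearing_deg`, `compass_label`, `pseudo_sentence`) are assumptions made here for illustration. The sketch only shows how the spatial relation between an anchor entity and a neighbor could be rendered as text that a language model can then embed like any other sentence.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius

# Assumed eight-way compass binning; the paper's actual scheme is not given here.
COMPASS = ["north", "northeast", "east", "southeast",
           "south", "southwest", "west", "northwest"]


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def initial_bearing_deg(lat1, lon1, lat2, lon2):
    """Initial bearing from point 1 to point 2, in degrees clockwise from north."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlmb = math.radians(lon2 - lon1)
    y = math.sin(dlmb) * math.cos(phi2)
    x = math.cos(phi1) * math.sin(phi2) - math.sin(phi1) * math.cos(phi2) * math.cos(dlmb)
    return (math.degrees(math.atan2(y, x)) + 360.0) % 360.0


def compass_label(bearing_deg):
    """Map a bearing to one of eight compass words (45-degree bins centred on each word)."""
    return COMPASS[int((bearing_deg + 22.5) // 45) % 8]


def pseudo_sentence(anchor, neighbor):
    """Render the neighbor's position relative to the anchor as a short sentence.

    `anchor` and `neighbor` are (name, lat, lon) tuples; the template is hypothetical.
    """
    a_name, a_lat, a_lon = anchor
    n_name, n_lat, n_lon = neighbor
    dist = haversine_km(a_lat, a_lon, n_lat, n_lon)
    direction = compass_label(initial_bearing_deg(a_lat, a_lon, n_lat, n_lon))
    return f"{n_name} is {dist:.1f} km {direction} of {a_name}."
```

A call such as `pseudo_sentence(("A", 0.0, 0.0), ("B", 1.0, 0.0))` yields a sentence stating that B lies north of A. In a pipeline of the kind the summary describes, such sentences would be tokenized and encoded alongside the generated location description; how the paper combines them is not specified in this summary.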