GPT, large language models (LLMs) and generative artificial intelligence (GAI) models in geospatial science: a systematic review

Siqin Wang, Tao Hu, Huang Xiao, Yun Li, Ce Zhang, Huan Ning, Rui Zhu, Zhenlong Li, Xinyue Ye
DOI: https://doi.org/10.1080/17538947.2024.2353122
IF: 4.606
2024-05-22
International Journal of Digital Earth
Abstract: The launch of large language models (LLMs) like ChatGPT in late 2022 and the anticipated arrival of future GPT-x iterations have marked the beginning of the generative artificial intelligence (GAI) era. We conducted a systematic review of how to integrate LLMs, including GPT and other GAI models, into geospatial science, based on 293 papers obtained from four databases of academic publications: Web of Science (WoS), Scopus, SSRN and arXiv. Of these, 26 papers were eventually included for analysis. We statistically outlined the share of domains where LLMs and other GAI models have been applied, the types of data that have been used for these models, and the modelling tasks and roles that they play. We also pointed out the challenges and future directions for the next research agenda, along which we could better position ourselves in the mainstream of science and the cutting-edge research paradigm as others leverage insights from the growing data deluge.
geography, physical; remote sensing
What problem does this paper attempt to address?
### Problems the Paper Aims to Address

This paper aims to systematically review the applications of large language models (LLMs) such as GPT and other generative artificial intelligence (GAI) models in geospatial science. Specifically, the paper seeks to address the following issues:

1. **Review of Current Research**:
   - Review the application of LLMs and GAI models in the field of geospatial science since the launch of ChatGPT in November 2022.
   - Statistically analyze the share of application domains in which these models appear, the types of data used, and the roles the models play in modeling tasks.
2. **Challenges and Future Directions**:
   - Identify current research challenges, such as model reliability, inconsistent responses, and the impact of the black-box nature of models on predictive tasks.
   - Propose future research directions to better integrate LLMs and GAI models into geospatial science, thereby advancing mainstream scientific development and cutting-edge research paradigms in this field.
3. **Construction of an Information Gain Framework**:
   - Construct an information gain-based framework to classify and understand the different roles of GAI models in geospatial science, including purifier, converter, generator, and reasoner.
   - Through this framework, help researchers better trace the expected information sources in GAI outputs, thereby improving the effectiveness of model applications.

### Research Methods

The paper adopts the standard systematic review method, namely the PRISMA method, collecting 293 relevant papers from four academic databases (Web of Science, Scopus, SSRN, and arXiv). After manual screening, 26 papers were selected for detailed analysis. The research methods include:

- **Data Collection**: Defined a preset search syntax covering keywords related to large pre-trained models and geospatial science.
- **Paper Screening**: After removing duplicates and excluding papers outside the field of geospatial science, 26 eligible papers remained.
- **Statistical Analysis**: Conducted statistical analysis on the selected papers, including the fields of model application, types of data used, and modeling tasks.

### Main Findings

- **Application Fields**: LLMs and GAI models are mainly applied in cartography (23%), geological engineering (19%), spatial analysis and querying (15%), informatics (12%), and transportation (12%).
- **Data Types**: Text data is the most commonly used data type (55%), followed by raster data (21%), vector data (12%), spatiotemporal data (9%), and video (3%).
- **Model Roles**: GAI models play various roles in geospatial science, including purifier, converter, generator, and reasoner. Each role has its specific information gain characteristics, helping researchers better understand and apply these models.

### Future Directions

- **Enhancing Geospatial Data Query and Recommendation**: Utilize the advanced language understanding capabilities of LLMs to develop LLM-based geospatial data recommendation engines, improving the efficiency of data discovery and analysis.
- **Building Robust Evaluation Frameworks**: Establish dedicated evaluation frameworks for the application of LLMs and GAI models in geospatial science to ensure model reliability and consistency, reducing the occurrence of errors and unreliable results.

Through these efforts, the paper hopes to provide valuable references for researchers in the field of geospatial science, promoting further development in this area.
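
### Illustrative Sketch of the Screening and Tally Steps

The screening and statistical analysis steps described under Research Methods can be sketched in a few lines of Python. This is a minimal illustration only, not the authors' actual pipeline: the record fields (`title`, `in_scope`, `data_types`), the title-based duplicate check, and the sample records are all hypothetical assumptions. It also assumes that shares are computed over every data-type mention, so one paper may contribute more than one data type.

```python
from collections import Counter


def screen(records):
    """Drop duplicates (same normalised title), then keep in-scope records.

    Assumption: each record is a dict with a 'title' string and a manually
    assigned boolean 'in_scope' flag (hypothetical field names).
    """
    seen_titles = set()
    kept = []
    for rec in records:
        key = " ".join(rec["title"].lower().split())  # normalise case and spacing
        if key in seen_titles:
            continue  # same paper retrieved from a second database
        seen_titles.add(key)
        if rec["in_scope"]:
            kept.append(rec)
    return kept


def shares(labels):
    """Return the percentage share of each label among all label mentions."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: 100 * n / total for label, n in counts.items()}


# Hypothetical toy records, not taken from the reviewed papers.
records = [
    {"title": "LLMs for map generation", "in_scope": True,
     "data_types": ["text", "vector"]},
    {"title": "llms for  map generation ", "in_scope": True,
     "data_types": ["text", "vector"]},  # duplicate from another database
    {"title": "Chatbots in clinical triage", "in_scope": False,
     "data_types": ["text"]},  # outside geospatial science
    {"title": "GPT-assisted remote sensing captions", "in_scope": True,
     "data_types": ["raster", "text"]},
]

included = screen(records)
mentions = [dtype for rec in included for dtype in rec["data_types"]]
print(len(included), "papers included")
print(shares(mentions))
```

Matching on a normalised title is a deliberately simple stand-in for duplicate removal; a real screening workflow would typically also compare DOIs and rely on manual review, as the paper's own screening did.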