What problem does this paper attempt to address?

This paper addresses the common errors and risks that arise when generative AI systems, especially large language models (LLMs) such as Generative Pre-trained Transformer (GPT) models, are used for knowledge extraction in the field of climate action. It focuses on three main issues:

1. **Incomplete answers**:
   - Most existing knowledge extraction work focuses on qualitative tasks, such as answering questions about the topic of a document. Extracting objective data (i.e., factual data based on specific indicators and target values) from climate-related texts remains an under-explored and challenging area.
   - The author demonstrates this with a simple test: asking several chat systems to list all the key performance indicators (KPIs) mentioned in the latest IPCC AR6 report. None of the systems could provide a complete and accurate answer, which highlights the limitations of existing models in handling specific fact-based queries.
2. **Hallucinations**:
   - The paper discusses hallucination in generative AI systems that process scientific data. Hallucination refers to content generated by the model that either contradicts the source content (intrinsic hallucination) or cannot be verified from the source content (extrinsic hallucination).
   - For example, Meta's Galactica system was shut down shortly after being made public because it generated false scientific articles, associated real authors with fictional papers, and included false statements in fabricated Wikipedia articles.
   - The author also cites a ChatGPT response whose references contain multiple incorrect or irrelevant links, further illustrating how serious the hallucination problem is.
3. **Misinformation**:
   - In the context of climate-related data sharing, the paper distinguishes credible data from misinformation. Credible data is based on accurate facts and evidence, while misinformation is wrong or inaccurate data that may lead people to misunderstand the facts of climate change and then make harmful decisions.
   - Users who rely on today's error-prone LLM technology may be misled, which can erode their trust in authoritative institutions and raise questions of liability and legal responsibility. In addition, the Internet contains a large amount of wrong or outdated climate change information, whether intentional or unintentional. This information has been incorporated into training data, which increases the risk of errors in the output of generative AI models.

Overall, the paper emphasizes that when generative AI systems are used for knowledge extraction in the field of climate action, these three problems must be carefully addressed to ensure that the generated content is accurate and reliable and to avoid potential social harm.

Common errors in Generative AI systems used for knowledge extraction in the climate action domain
