Climate Change from Large Language Models

Hongyin Zhu, Prayag Tiwari
2024-07-01
Abstract: Climate change poses grave challenges, demanding widespread understanding and low-carbon lifestyle awareness. Large language models (LLMs) offer a powerful tool to address this crisis, yet comprehensive evaluations of their climate-crisis knowledge are lacking. This paper proposes an automated evaluation framework to assess climate-crisis knowledge within LLMs. We adopt a hybrid approach for data acquisition, combining data synthesis and manual collection, to compile a diverse set of questions encompassing various aspects of climate change. Utilizing prompt engineering based on the compiled questions, we evaluate the model's knowledge by analyzing its generated answers. Furthermore, we introduce a comprehensive set of metrics to assess climate-crisis knowledge, encompassing indicators from 10 distinct perspectives. These metrics provide a multifaceted evaluation, enabling a nuanced understanding of the LLMs' climate crisis comprehension. The experimental results demonstrate the efficacy of our proposed method. In our evaluation utilizing diverse high-performing LLMs, we discovered that while LLMs possess considerable climate-related knowledge, there are shortcomings in terms of timeliness, indicating a need for continuous updating and refinement of their climate-related content.
Computation and Language, Computers and Society
What problem does this paper attempt to address?
The paper addresses the lack of comprehensive evaluations of the climate-crisis knowledge held by large language models (LLMs). Specifically, it proposes an automated evaluation framework to assess the performance of LLMs in the domain of climate-crisis knowledge. The main objectives are:

1. **Developing automated evaluation methods**: Prompts are designed to guide LLMs in generating relevant climate-crisis knowledge, and a comprehensive set of metrics is used to assess the quality of that knowledge.
2. **Collecting high-quality questions and answers**: A hybrid approach (data synthesis and manual collection) is used to gather a large number of questions about the climate crisis, and high-performing LLMs are employed to automatically generate the corresponding answers.
3. **Proposing an evaluation metric system**: Metrics along ten dimensions are introduced: five question evaluation metrics (importance, clarity, relevance, difficulty, and innovativeness) and five answer evaluation metrics (relevance, depth, readability, innovativeness, and timeliness), giving a multifaceted evaluation.
4. **Validating the method's effectiveness**: Experiments show that the proposed method can effectively evaluate LLMs in the domain of climate-crisis knowledge. They also show that current LLMs possess rich climate-related knowledge but fall short on timeliness.

In summary, this research aims to enhance the reliability and practicality of LLMs in disseminating climate-crisis knowledge, laying the foundation for more efficient climate-crisis knowledge systems in the future.
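The ten-metric scheme described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the metric names come from the summary, while the 0 to 10 scale, the prompt wording, and the helper names `build_scoring_prompt`, `average_scores`, and `weakest_metric` are assumptions introduced here.

```python
from statistics import mean

# Metric names as listed in the summary; everything else is hypothetical.
QUESTION_METRICS = ("importance", "clarity", "relevance", "difficulty", "innovativeness")
ANSWER_METRICS = ("relevance", "depth", "readability", "innovativeness", "timeliness")


def build_scoring_prompt(question, answer, metric):
    """Build a scoring prompt for one answer metric.

    The template and the 0-10 scale are assumptions; the paper's actual
    prompt wording is not reproduced here.
    """
    return (
        f"Rate the following answer for {metric} on a scale of 0 to 10.\n"
        f"Question: {question}\n"
        f"Answer: {answer}\n"
        "Score:"
    )


def average_scores(records, metrics):
    """Average each metric over a list of per-item score dictionaries."""
    return {m: mean(r[m] for r in records) for m in metrics}


def weakest_metric(averages):
    """Return the metric name with the lowest average score."""
    return min(averages, key=averages.get)
```

Given per-answer scores collected from a judge model, `average_scores(records, ANSWER_METRICS)` yields one average per dimension, and `weakest_metric` points at the dimension that most needs attention. In the paper's findings that dimension is timeliness.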