Leveraging Large Language Models with Chain-of-Thought and Prompt Engineering for Traffic Crash Severity Analysis and Inference

Hao Zhen, Yucheng Shi, Yongcan Huang, Jidong J. Yang, Ninghao Liu
2024-08-05
Abstract: Harnessing the power of Large Language Models (LLMs), this study explores the use of three state-of-the-art LLMs, specifically GPT-3.5-turbo, LLaMA3-8B, and LLaMA3-70B, for crash severity inference, framing it as a classification task. We generate textual narratives from original traffic crash tabular data using a pre-built template infused with domain knowledge. Additionally, we incorporate Chain-of-Thought (CoT) reasoning to guide the LLMs in analyzing the crash causes and then inferring the severity. This study also examines the impact of prompt engineering specifically designed for crash severity inference. The LLMs were tasked with crash severity inference to: (1) evaluate the models' capabilities in crash severity analysis, (2) assess the effectiveness of CoT and domain-informed prompt engineering, and (3) examine the reasoning abilities with the CoT framework. Our results showed that LLaMA3-70B consistently outperformed the other models, particularly in zero-shot settings. The CoT and Prompt Engineering techniques significantly enhanced performance, improving logical reasoning and addressing alignment issues. Notably, the CoT offers valuable insights into LLMs' reasoning processes, unleashing their capacity to consider diverse factors such as environmental conditions, driver behavior, and vehicle characteristics in severity analysis and inference.
Computation and Language, Artificial Intelligence, Machine Learning
What problem does this paper attempt to address?
This paper investigates how well large language models (LLMs) can analyze and infer traffic crash severity. Specifically, it applies three state-of-the-art LLMs (GPT-3.5-turbo, LLaMA3-8B, and LLaMA3-70B) to crash severity inference, treating it as a classification task. The main objectives of the paper are:

1. **Evaluating the capabilities of LLMs in crash severity analysis**: The original tabular crash data is converted into text narratives that embed domain knowledge. These narratives are used as input to test whether LLMs can classify crash severity without additional training (zero-shot learning).
2. **Evaluating the effectiveness of Chain-of-Thought (CoT) and domain-informed Prompt Engineering (PE)**: The paper studies how CoT and PE affect LLM performance in crash severity analysis, especially in zero-shot and few-shot settings. CoT guides the LLM through a structured reasoning process, first analyzing the crash causes and then inferring the severity, while PE improves task-specific performance through carefully designed input prompts.
3. **Examining the reasoning capabilities of LLMs within the CoT framework**: In particular, the paper explores how CoT improves the logical reasoning of LLMs and addresses alignment issues, such as model behavior on sensitive topics like fatal crashes.

Through these studies, the paper aims to show the potential value of LLMs for traffic safety management, especially their ability to handle complex, unstructured data, and how techniques such as CoT and PE can further improve their performance.
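The pipeline described here (tabular record, then templated narrative, then a CoT prompt that asks for causes before severity) can be illustrated with a minimal sketch. Everything specific below is an assumption made for illustration and is not taken from the paper: the field names, the template wording, the label set in `SEVERITY_LABELS`, and the `Severity: <label>` answer format. The call to an actual LLM is left out; only prompt construction and answer parsing are shown.

```python
import re
from string import Template

# Hypothetical template and field names; the paper's actual template is not reproduced here.
NARRATIVE_TEMPLATE = Template(
    "The crash occurred in $weather weather on a $road_surface road surface. "
    "The driver was $driver_age years old and the vehicle was a $vehicle_type."
)

# Assumed label set for illustration only.
SEVERITY_LABELS = ("no injury", "minor injury", "serious injury", "fatal")


def row_to_narrative(row):
    """Turn one tabular crash record (a dict) into a text narrative."""
    return NARRATIVE_TEMPLATE.substitute(row)


def build_cot_prompt(narrative, labels=SEVERITY_LABELS):
    """Build a zero-shot CoT prompt: reason about causes first, then classify."""
    options = ", ".join(labels)
    return (
        f"Crash description: {narrative}\n"
        "First, reason step by step about the likely causes of this crash.\n"
        f"Then classify its severity as one of: {options}.\n"
        "End with a line of the form 'Severity: <label>'."
    )


def parse_severity(response, labels=SEVERITY_LABELS):
    """Extract the final 'Severity: ...' label from a model response, or None."""
    matches = re.findall(r"severity\s*:\s*([^\n]+)", response, flags=re.IGNORECASE)
    if not matches:
        return None
    answer = matches[-1].strip().strip(".").lower()
    return answer if answer in labels else None
```

In use, `build_cot_prompt(row_to_narrative(row))` would be sent to whichever model is being evaluated, and `parse_severity` applied to the returned text. Returning `None` for unparseable answers keeps refusals and off-format responses separate from genuine misclassifications, which matters when studying alignment-related behavior on sensitive cases.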