TrustScore: Reference-Free Evaluation of LLM Response Trustworthiness

Danna Zheng, Danyang Liu, Mirella Lapata, Jeff Z. Pan
2024-02-20
Abstract: Large Language Models (LLMs) have demonstrated impressive capabilities across various domains, prompting a surge in their practical applications. However, concerns have arisen regarding the trustworthiness of LLM outputs, particularly in closed-book question-answering tasks, where non-experts may struggle to identify inaccuracies due to the absence of contextual or ground truth information. This paper introduces TrustScore, a framework based on the concept of Behavioral Consistency, which evaluates whether an LLM's response aligns with its intrinsic knowledge. Additionally, TrustScore can seamlessly integrate with fact-checking methods, which assess alignment with external knowledge sources. The experimental results show that TrustScore achieves strong correlations with human judgments, surpassing existing reference-free metrics, and achieving results on par with reference-based metrics.
Computation and Language
What problem does this paper attempt to address?
### Problems the Paper Aims to Solve

This paper addresses the problem of evaluating the trustworthiness of large language model (LLM) outputs in closed-book question-answering tasks. LLMs perform well across many tasks, but in the closed-book setting non-expert users struggle to spot inaccurate answers because no context or ground-truth information is available. This raises concerns about the trustworthiness of LLM outputs.

### Background and Challenges

1. **Capabilities and Applications of LLMs**:
   - Large language models (LLMs) have shown outstanding performance in natural language processing (NLP) tasks, driving their widespread use in practical applications.
   - However, these models sometimes generate responses that seem reasonable but are actually incorrect, a problem that is particularly prominent in closed-book question-answering tasks.
2. **Challenges of Closed-Book Question-Answering Tasks**:
   - In closed-book question-answering tasks, LLMs rely solely on their parametric knowledge to generate answers, without the support of external context or ground-truth information.
   - This makes it very difficult to evaluate the trustworthiness of LLM outputs, especially for non-expert users.

### Solution

To address these challenges, the paper introduces the **TrustScore** framework, which builds on the concept of **behavioral consistency** to evaluate whether the LLM's responses are consistent with its internal knowledge. TrustScore can also integrate fact-checking methods to assess the consistency of responses with external knowledge sources.

### Main Contributions

1. **Behavioral Consistency Evaluation**:
   - Multiple-choice tests check whether the LLM consistently selects its own original response over distractor options.
   - If the LLM chooses the same answer across multiple tests, its response is considered consistent with its internal knowledge, which increases its trustworthiness.
2. **Fact-Checking Integration**:
   - When external knowledge bases are available, TrustScore can be combined with fact-checking modules to further verify the accuracy of responses.
   - This dual approach evaluates LLM responses on both internal consistency and external factual consistency.
3. **Experimental Results**:
   - TrustScore correlates strongly with human judgment, surpassing existing reference-free metrics and performing close to reference-based metrics.

### Conclusion

TrustScore provides a novel reference-free evaluation framework for assessing the trustworthiness of LLM responses in closed-book question-answering tasks. It can evaluate behavioral consistency on its own, and it can also be combined with fact-checking methods for a more comprehensive evaluation.
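### Illustrative Sketch

The behavioral consistency check described under Main Contributions can be illustrated with a minimal Python sketch. It is not the paper's implementation. The `choose` callable (a stand-in for querying the LLM), the option shuffling, the number of checks, and scoring as the fraction of rounds won are all assumptions made here for illustration. The summary above states only that the model is tested with multiple-choice questions that mix its own response with distractors.

```python
import random
from typing import Callable, Sequence

# Hypothetical interface (an assumption, not from the paper): given a question
# and a list of options, return the index of the option the model selects.
ChooseFn = Callable[[str, Sequence[str]], int]


def behavioral_consistency(
    question: str,
    response: str,
    distractors: Sequence[str],
    choose: ChooseFn,
    n_checks: int = 5,
    seed: int = 0,
) -> float:
    """Fraction of multiple-choice rounds in which the model re-selects
    its own original response over the distractors."""
    if n_checks <= 0:
        raise ValueError("n_checks must be positive")
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_checks):
        # Shuffle so the model cannot succeed through position bias alone.
        options = [response, *distractors]
        rng.shuffle(options)
        picked = choose(question, options)
        if options[picked] == response:
            hits += 1
    return hits / n_checks


def stub_model(question: str, options: Sequence[str]) -> int:
    """Toy stand-in for an LLM that always 'believes' the answer is Paris."""
    return options.index("Paris") if "Paris" in options else 0


consistent = behavioral_consistency(
    "What is the capital of France?", "Paris", ["Lyon", "Marseille"], stub_model
)
inconsistent = behavioral_consistency(
    "What is the capital of France?", "Lyon", ["Paris", "Marseille"], stub_model
)
```

With the stub model, a response of "Paris" is re-selected in every round, while a response of "Lyon" is never re-selected. In practice `choose` would wrap a call to the LLM under evaluation, and the distractors would need to be generated separately.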