Graph Retrieval Augmented Trustworthiness Reasoning
Ying Zhu, Shengchang Li, Ziqian Kong, Peilan Xu
2024-09-04
Abstract: Trustworthiness reasoning is crucial in multiplayer games with incomplete information, enabling agents to identify potential allies and adversaries, thereby enhancing reasoning and decision-making processes. Traditional approaches relying on pre-trained models necessitate extensive domain-specific data and considerable reward feedback, with their lack of real-time adaptability hindering their effectiveness in dynamic environments. In this paper, we introduce the Graph Retrieval Augmented Reasoning (GRATR) framework, leveraging the Retrieval-Augmented Generation (RAG) technique to bolster trustworthiness reasoning in agents. GRATR constructs a dynamic trustworthiness graph, updating it in real-time with evidential information, and retrieves relevant trust data to augment the reasoning capabilities of Large Language Models (LLMs). We validate our approach through experiments on the multiplayer game "Werewolf," comparing GRATR against baseline LLM and LLM enhanced with Native RAG and Rerank RAG. Our results demonstrate that GRATR surpasses the baseline methods by over 30% in winning rate, with superior reasoning performance. Moreover, GRATR effectively mitigates LLM hallucinations, such as identity and objective amnesia, and crucially, it renders the reasoning process more transparent and traceable through the use of the trustworthiness graph.
Artificial Intelligence
### What problem does this paper attempt to solve?
This paper addresses how to evaluate the credibility of players effectively in multi-player games with incomplete information. It targets three issues:
1. **Limitations of traditional methods**:
- Traditional methods based on pre-trained models require large amounts of domain-specific data and reward feedback, and they lack real-time adaptability, which limits their effectiveness in dynamic environments.
- Although large language models (LLMs) perform well in natural language understanding and generation, they are prone to hallucinations during reasoning, such as forgetting their own identity or objective.
2. **The need for real-time credibility assessment**:
- In multi-player games, players' behaviors and statements change over time. Evaluating their credibility therefore requires a method that can collect and analyze this information in real time.
- Existing retrieval-augmented generation (RAG) methods mainly focus on retrieving and integrating static external data, so they cannot effectively capture real-time interactions and changes in trust relationships.
3. **Improving reasoning performance and transparency**:
- A method is needed to enhance the reasoning ability of LLMs so that they can make more accurate decisions in complex game environments.
- At the same time, this method should be able to improve the transparency and traceability of the reasoning process, making the reasoning results more reliable.
To address these problems, the paper proposes the Graph Retrieval Augmented Trustworthiness Reasoning (GRATR) framework. GRATR enhances the reasoning ability of LLMs by constructing a dynamic trustworthiness graph, updating it with evidence in real time, and retrieving relevant trust data. Experimental results show that GRATR significantly improves the winning rate in the "Werewolf" game, mitigates LLM hallucinations, and makes the reasoning process more transparent and traceable.
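The build, update, and retrieve loop described above can be pictured with a small data structure. The following is only an illustrative sketch, not the authors' implementation: the class names `Evidence` and `TrustGraph`, their fields, and the `retrieve` method are all hypothetical, and the signed `weight` field is an assumed encoding of supportive versus hostile evidence.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Evidence:
    """One observation linking two players (hypothetical structure)."""
    source: str    # player who made the statement or took the action
    target: str    # player the evidence is about
    text: str      # raw observation, e.g. a quoted statement
    weight: float  # assumed signed strength: > 0 supportive, < 0 hostile


@dataclass
class TrustGraph:
    """Directed graph whose edges accumulate evidence over time."""
    edges: dict = field(default_factory=lambda: defaultdict(list))

    def add_evidence(self, ev: Evidence) -> None:
        # Append new evidence to the (source, target) edge as the game unfolds.
        self.edges[(ev.source, ev.target)].append(ev)

    def retrieve(self, target: str) -> list:
        # Collect every piece of evidence that concerns the target player.
        return [
            ev
            for (_, dst), evs in self.edges.items()
            if dst == target
            for ev in evs
        ]


graph = TrustGraph()
graph.add_evidence(Evidence("p1", "p3", "p1 accuses p3 of being a werewolf", -0.8))
graph.add_evidence(Evidence("p2", "p3", "p2 defends p3", 0.5))
graph.add_evidence(Evidence("p3", "p1", "p3 votes against p1", -0.6))

# Evidence about p3 that could be placed in the LLM prompt as context.
context = [ev.text for ev in graph.retrieve("p3")]
```

In a setup like this, the retrieved evidence doubles as an audit trail for a trust judgment, which is one way to read the transparency claim.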
### Key contributions
- Proposed the GRATR framework, which enhances the reasoning ability of LLMs by constructing a dynamic trustworthiness graph that is updated with evidence and queried in real time.
- Showed experimentally that GRATR outperforms the baseline LLM and RAG-enhanced variants (Native RAG and Rerank RAG) in the "Werewolf" game, improving the winning rate by more than 30%.
- GRATR not only improves reasoning performance but also alleviates the hallucination problem of LLMs and makes the reasoning process more transparent and traceable.
### Formula summary
1. **Trust value update formula**:
\[
u_t^i(p_k) = T_t^i(p_j)\cdot|w_t^i(p_j, p_k)|\cdot c_t^i(p_j, p_k)
\]
\[
T_{t + 1}^i(p_k)=\begin{cases}
T_t^i(p_k), &\text{if }|u_t^i(p_k)|\leq|T_t^i(p_k)|\\
u_t^i(p_k), &\text{if }|u_t^i(p_k)| > |T_t^i(p_k)|
\end{cases}
\]
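A small numeric sketch may help in reading the trust value update. The summary does not define the symbols, so the interpretation here is an assumption: $T_t^i(p_j)$ is read as player $i$'s current trust in the evidence source $p_j$, $w_t^i(p_j, p_k)$ as the signed weight of the evidence from $p_j$ about $p_k$, and $c_t^i(p_j, p_k)$ as a confidence score. The function names are hypothetical.

```python
def evidence_update(trust_source: float, weight: float, confidence: float) -> float:
    """Candidate update u = T(p_j) * |w(p_j, p_k)| * c(p_j, p_k)."""
    return trust_source * abs(weight) * confidence


def update_trust(trust_target: float, update: float) -> float:
    """Replace the current trust value only if the update has strictly larger magnitude."""
    return update if abs(update) > abs(trust_target) else trust_target


# Assumed example values: a fairly trusted source with moderately strong evidence.
u = evidence_update(trust_source=0.8, weight=-0.5, confidence=0.9)

weak_prior = update_trust(trust_target=0.2, update=u)     # |u| > 0.2, so u replaces it
strong_prior = update_trust(trust_target=-0.5, update=u)  # |u| <= 0.5, so -0.5 is kept
```

Under this rule a trust value changes only when new evidence is stronger in magnitude than the current belief, so weak evidence cannot overwrite a strongly held value.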
2. **Edge weight update formula**:
\[
\tau_{t+1}^i(p_j, p_k)=\tanh\left(\sum_{k = 1}^n\rho^{n - k}\cdot w_t^i(p_j, p_k)\right)
\]
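The edge weight formula can be sketched as an exponentially decayed sum squashed by $\tanh$. Two assumptions are made here because the summary leaves them open: the sum is read as running over the $n$ pieces of evidence collected on the edge, in chronological order, and $\rho$ is read as a decay factor in $(0, 1]$. The function name `edge_weight` is hypothetical.

```python
import math


def edge_weight(evidence_weights: list, rho: float) -> float:
    """tanh of the decayed sum: item k of n is scaled by rho ** (n - k)."""
    n = len(evidence_weights)
    total = sum(
        rho ** (n - k) * w
        for k, w in enumerate(evidence_weights, start=1)
    )
    return math.tanh(total)


# The newest item (k = n) gets factor rho ** 0 = 1; older items are discounted.
recent_heavy = edge_weight([0.2, 0.9], rho=0.5)  # tanh(0.5 * 0.2 + 0.9)
old_heavy = edge_weight([0.9, 0.2], rho=0.5)     # tanh(0.5 * 0.9 + 0.2)
```

Because $\tanh$ maps any real number into $(-1, 1)$, the edge weight stays bounded however much evidence accumulates, and with $\rho < 1$ recent evidence dominates older evidence.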
3. **Trust chain cumulative update formula**:
\[
V_{C_n}=\sum_{k = 1}^{o - 1}T_t^i(p_{k+1})\cdot\tau_t^i(p_{k+1}, p_k)
\]
\[
u_t^i(p_o)=T_t^i(p_1)\cdot\prod_{k = 1}^{o - 1}\tau_t^i(p_{k+1}, p_k)
\]
\[
H(C_n)=-u_t^i(p_o)\log_2 u_t^i(p_o)
\]
\[
T_{t+1}^i(p_o)=\fr