Neurosymbolic AI approach to Attribution in Large Language Models

Deepa Tilwani, Revathy Venkataramanan, Amit P. Sheth
2024-09-30
Abstract: Attribution in large language models (LLMs) remains a significant challenge, particularly in ensuring the factual accuracy and reliability of the generated outputs. Current methods for citation or attribution, such as those employed by tools like Perplexity.ai and Bing Search-integrated LLMs, attempt to ground responses by providing real-time search results and citations. So far, however, these approaches suffer from hallucinations, biases, surface-level relevance matching, and the complexity of managing vast, unfiltered knowledge sources. While tools like Perplexity.ai dynamically integrate web-based information and citations, they often rely on inconsistent or unreliable sources such as blog posts, which limits their overall reliability. We argue that these challenges can be mitigated by integrating Neurosymbolic AI (NesyAI), which combines the strengths of neural networks with structured symbolic reasoning. NesyAI offers transparent, interpretable, and dynamic reasoning processes, addressing the limitations of current attribution methods by combining structured symbolic knowledge with flexible, neural-based learning. This paper explores how NesyAI frameworks can enhance existing attribution models, offering more reliable, interpretable, and adaptable systems for LLMs.
Computation and Language
What problem does this paper attempt to address?
The paper addresses the problem of achieving reliable attribution in large language models (LLMs). It identifies the following main problems in how current LLMs generate content:

1. **Factual accuracy**: LLMs often generate fabricated information (hallucinations) that cannot be traced back to reliable sources, so the generated content lacks a factual basis.
2. **Reliability**: Existing citation methods, such as those in Perplexity.ai and Bing Search-integrated LLMs, try to support responses with real-time search results and citations, but they rely on inconsistent or unreliable sources such as blog posts, which limits their overall reliability.
3. **Surface-level relevance matching**: Existing methods often select citation sources by surface-level relevance while ignoring the quality and reliability of those sources.
4. **Complexity**: Managing large, unfiltered knowledge sources is complex and introduces bias and imprecision into the citation process.
5. **Legal and ethical compliance**: LLMs do not properly cite copyrighted materials they draw on, which may lead to copyright infringement and intellectual property issues.

To address these problems, the paper proposes an approach based on Neurosymbolic AI (NesyAI). NesyAI combines the statistical capabilities of neural networks with the structured knowledge of symbolic reasoning, aiming to create a more reliable, transparent, and interpretable citation system. Specifically, NesyAI addresses the problems above in the following ways:

- **Structured knowledge representation**: Use knowledge graphs (KGs) and ontologies to improve citation processing, so that every generated piece of information has a verifiable source.
- **Dynamic knowledge update**: Integrate dynamic memory structures, knowledge graphs, and real-time data sources so that the citation system reflects the latest knowledge and can adapt to a rapidly changing information environment.
- **Metacognitive layer**: Decide intelligently when and how to combine neural networks and symbolic reasoning to improve the accuracy and reliability of citations.
- **Explainability and traceability**: Provide a transparent decision-making process that lets users trace any generated statement to its source and understand the reasoning behind it.

In summary, the goal of the paper is to use the NesyAI framework to improve the reliability, transparency, and interpretability of citations in LLM-generated content, especially in high-risk fields such as medicine and law.
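
To make the knowledge-graph and metacognitive ideas above more concrete, here is a minimal Python sketch. It is not the paper's implementation. Every name in it (`Triple`, `KnowledgeGraph`, `attribute`, the `route` labels, and the placeholder source string) is a hypothetical illustration, and the "metacognitive" step is reduced to a single rule: use the symbolic knowledge graph when it covers the claim, otherwise flag the claim as unattributed neural output.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Triple:
    """One knowledge-graph fact plus the source it was taken from."""
    subject: str
    predicate: str
    obj: str
    source: str  # hypothetical provenance label, not a real reference


@dataclass(frozen=True)
class Attribution:
    claim: Tuple[str, str, str]
    supported: bool          # True only if the KG fact matches the claim
    source: Optional[str]    # provenance of the KG fact that was consulted
    route: str               # "symbolic" if the KG answered, else "neural"


class KnowledgeGraph:
    """Toy in-memory KG indexed by (subject, predicate)."""

    def __init__(self) -> None:
        self._index = {}

    def add(self, triple: Triple) -> None:
        # Overwriting an existing key stands in for a dynamic knowledge update.
        self._index[(triple.subject, triple.predicate)] = triple

    def lookup(self, subject: str, predicate: str) -> Optional[Triple]:
        return self._index.get((subject, predicate))


def attribute(kg: KnowledgeGraph, subject: str, predicate: str, obj: str) -> Attribution:
    """Assumed routing rule: prefer the KG whenever it covers the claim."""
    claim = (subject, predicate, obj)
    fact = kg.lookup(subject, predicate)
    if fact is None:
        # No symbolic coverage: the claim stays unattributed.
        return Attribution(claim, False, None, "neural")
    # Symbolic coverage: the claim is supported only if the objects agree.
    return Attribution(claim, fact.obj == obj, fact.source, "symbolic")


kg = KnowledgeGraph()
kg.add(Triple("aspirin", "drug_class", "NSAID", "placeholder-source-1"))

result = attribute(kg, "aspirin", "drug_class", "NSAID")
print(result.supported, result.source, result.route)
# prints: True placeholder-source-1 symbolic
```

Keeping the source on each triple is what makes a generated statement traceable in this sketch: a supported claim always comes back with the provenance of the fact that backed it, and a claim outside the graph's coverage is reported explicitly rather than silently accepted.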