Towards reducing hallucination in extracting information from financial reports using Large Language Models

Bhaskarjit Sarmah, Tianjie Zhu, Dhagash Mehta, Stefano Pasquali

2023-10-17

Abstract: For a financial analyst, the question and answer (Q&A) segment of a company's financial report is a crucial piece of information for various analyses and investment decisions. However, extracting valuable insights from the Q&A section poses considerable challenges: conventional methods such as detailed reading and note-taking lack scalability and are susceptible to human error, while Optical Character Recognition (OCR) and similar techniques have difficulty accurately processing unstructured transcript text, often missing subtle linguistic nuances that drive investor decisions. Here, we demonstrate the use of Large Language Models (LLMs) to extract information from earnings report transcripts efficiently, rapidly, and with high accuracy, and we reduce hallucination by combining retrieval-augmented generation with metadata. We evaluate the outcomes of various LLMs with and without our proposed approach using several objective metrics for evaluating Q&A systems, and empirically demonstrate the superiority of our method.

Computation and Language, Portfolio Management, Statistical Finance, Applications

What problem does this paper attempt to address?

This paper aims to address the hallucination problem that arises when large language models (LLMs) are used to extract information from the Q&A sections of financial reports. Specifically, it explores how to extract valuable information from earnings call transcripts efficiently and accurately with LLMs, and how to reduce hallucinations by combining retrieval-augmented generation with metadata. The researchers evaluate different pre-trained LLMs with and without the proposed method, score the resulting Q&A systems on several objective metrics, and empirically demonstrate the effectiveness of the approach. The paper also discusses how metadata can improve accuracy when several documents are processed together, making the extracted information more reliable and precise.
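The idea of combining retrieval with metadata can be illustrated with a minimal sketch. This is not the authors' implementation: the `Chunk` fields (`company`, `quarter`), the function names, and the toy transcripts are all hypothetical, and a simple bag-of-words cosine similarity stands in for whatever embedding model a real system would use. The sketch only shows the general pattern, assuming that metadata is used to restrict the candidate chunks before similarity ranking, so that near-identical passages from other companies or quarters cannot reach the prompt.

```python
from collections import Counter
from dataclasses import dataclass
import math
import re


@dataclass
class Chunk:
    text: str
    company: str  # hypothetical metadata field
    quarter: str  # hypothetical metadata field


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(chunks, question, company, quarter, k=2):
    """Filter chunks by metadata first, then rank the survivors by similarity."""
    candidates = [c for c in chunks if c.company == company and c.quarter == quarter]
    query = Counter(tokenize(question))
    ranked = sorted(
        candidates,
        key=lambda c: cosine(query, Counter(tokenize(c.text))),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(question, context):
    """Assemble a grounded prompt; each context line carries its metadata."""
    joined = "\n".join(f"[{c.company} {c.quarter}] {c.text}" for c in context)
    return (
        "Answer using only the context below. "
        "If the answer is not in the context, say so.\n\n"
        f"Context:\n{joined}\n\nQuestion: {question}"
    )


# Toy data (invented for illustration): two transcripts with very similar wording.
CHUNKS = [
    Chunk("Revenue growth was driven by cloud services.", "AlphaCorp", "Q2 2023"),
    Chunk("We expect margins to remain stable next quarter.", "AlphaCorp", "Q2 2023"),
    Chunk("Revenue growth was driven by hardware sales.", "BetaCorp", "Q2 2023"),
]

hits = retrieve(CHUNKS, "What drove revenue growth?", "AlphaCorp", "Q2 2023")
prompt = build_prompt("What drove revenue growth?", hits)
```

Without the metadata filter, the BetaCorp chunk would score about as high as the AlphaCorp one for this question, which is the kind of cross-document confusion that can lead to a fluent but wrong answer in a multi-document setting.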
