TabSniper: Towards Accurate Table Detection & Structure Recognition for Bank Statements

Abhishek Trivedi,Sourajit Mukherjee,Rajat Kumar Singh,Vani Agarwal,Sriranjani Ramakrishnan,Himanshu S. Bhatt
2024-12-17
Abstract:Extraction of transaction information from bank statements is required to assess one's financial well-being for credit rating and underwriting decisions. Unlike other financial documents such as tax forms or financial statements, extracting the transaction descriptions from bank statements can provide a comprehensive and recent view into the cash flows and spending patterns. With multiple variations in layout and templates across several banks, extracting transactional level information from different table categories is an arduous task. Existing table structure recognition approaches produce sub optimal results for long, complex tables and are unable to capture all transactions accurately. This paper proposes TabSniper, a novel approach for efficient table detection, categorization and structure recognition from bank statements. The pipeline starts with detecting and categorizing tables of interest from the bank statements. The extracted table regions are then processed by the table structure recognition model followed by a post-processing module to transform the transactional data into a structured and standardised format. The detection and structure recognition architectures are based on DETR, fine-tuned with diverse bank statements along with additional feature enhancements. Results on challenging datasets demonstrate that TabSniper outperforms strong baselines and produces high-quality extraction of transaction information from bank and other financial documents across multiple layouts and templates.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
This paper attempts to solve the problem of accurately extracting transaction information from bank statements. Specifically, it aims to overcome the limitations of existing table structure recognition methods, which are ineffective in handling long and complex tables and unable to accurately capture all transaction information. To achieve this goal, the paper proposes a new method named TabSniper, which focuses on efficiently detecting, classifying, and recognizing table structures from bank statements. ### Main Problem Description 1. **Diverse Layouts and Templates**: Bank statements from different banks have significant differences in layout and template, making it difficult to process them uniformly. 2. **Complex Table Structures**: Tables in bank statements are usually long and have complex structures, containing multiple rows or columns, which makes it difficult for traditional table structure recognition methods to handle. 3. **Variations in Information Representation**: Different bank statements have great differences in information representation methods and table structures, increasing the difficulty of information extraction. 4. **Differences between Scanned and Electronic Files**: Scanned PDF files and electronically generated PDF files also present different challenges in processing. ### Solution Overview TabSniper solves the above problems through the following steps: 1. **Table Detection and Classification (TDC)**: - Use a DETR - based model to detect and classify tables in statements. - Combine visual and text models to ensure the accuracy of table classification. 2. **Table Structure Recognition (TSR)**: - Recognize the structure of the extracted table areas and determine the row and column boundaries of the tables. - Introduce an improved loss function (such as CIoU) to improve the matching accuracy between table objects and real labels. 3. **Post - processing**: - Apply heuristic rules to adjust the table structure, reduce the overlap between objects, and ensure the accuracy of the table structure. - Extract the text in table cells and convert it into a standardized format. ### Key Contributions - **The First End - to - End Framework**: TabSniper is the first end - to - end automated algorithm for bank statement table structure recognition. - **Table Classification Technology**: Propose a new table classification technology to classify bank tables into predefined categories. - **Accurate Table Structure Recognition**: Introduce innovative data preparation and modeling techniques, especially suitable for handling long tables and overlapping table objects. - **BankTabNet Dataset**: Create an annotated database specifically for bank statement table structure recognition, including page - level and transaction - level annotations. Through these methods, TabSniper can efficiently and accurately extract transaction information from bank statements under various layouts and templates.