TC-OCR: TableCraft OCR for Efficient Detection & Recognition of Table Structure & Content

Avinash Anand, Raj Jaiswal, Pijush Bhuyan, Mohit Gupta, Siddhesh Bangar, Md. Modassir Imam, Rajiv Ratn Shah, Shin'ichi Satoh

DOI: https://doi.org/10.1145/3606040.3617444

2024-04-19

Abstract: The automatic recognition of tabular data in document images presents a significant challenge due to the diverse range of table styles and complex structures. Tables offer valuable content representation, enhancing the predictive capabilities of various systems such as search engines and Knowledge Graphs. Addressing the two main problems, namely table detection (TD) and table structure recognition (TSR), has traditionally been approached independently. In this research, we propose an end-to-end pipeline that integrates deep learning models, including DETR, CascadeTabNet, and PP OCR v2, to achieve comprehensive image-based table recognition. This integrated approach effectively handles diverse table styles, complex structures, and image distortions, resulting in improved accuracy and efficiency compared to existing methods like Table Transformers. Our system achieves simultaneous table detection (TD), table structure recognition (TSR), and table content recognition (TCR), preserving table structures and accurately extracting tabular data from document images. The integration of multiple models addresses the intricacies of table recognition, making our approach a promising solution for image-based table understanding, data extraction, and information retrieval applications. Our proposed approach achieves an IoU of 0.96 and an OCR Accuracy of 78%, showcasing a remarkable improvement of approximately 25% in the OCR Accuracy compared to the previous Table Transformer approach.
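For readers unfamiliar with the IoU figure quoted in the abstract, the following is a minimal sketch of intersection over union for two axis-aligned bounding boxes, the standard way detection overlap is scored. The function name `iou`, the `Box` alias, and the `(x_min, y_min, x_max, y_max)` box layout are assumptions made for illustration; the abstract does not say how the reported 0.96 was computed or averaged.

```python
from typing import Tuple

# Assumed box layout for this sketch: (x_min, y_min, x_max, y_max).
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    # Corners of the overlapping rectangle, if there is one.
    ix_min = max(a[0], b[0])
    iy_min = max(a[1], b[1])
    ix_max = min(a[2], b[2])
    iy_max = min(a[3], b[3])

    # Clamp at zero so disjoint boxes give an empty intersection.
    inter_w = max(0.0, ix_max - ix_min)
    inter_h = max(0.0, iy_max - iy_min)
    inter = inter_w * inter_h

    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter

    # Two degenerate (zero-area) boxes have no meaningful overlap.
    return inter / union if union > 0 else 0.0
```

A predicted table box that matches the ground-truth box exactly scores 1.0, and a box that does not touch it scores 0.0, so a value near 0.96 indicates that predicted and reference regions overlap almost completely.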

Computer Vision and Pattern Recognition

What problem does this paper attempt to address?

This paper addresses the automatic recognition of tabular data in document images, which is difficult because table styles are diverse and table structures are complex. Tables provide valuable content representations for systems such as search engines and knowledge graphs, so accurate table detection and structure recognition are crucial for improving the predictive capabilities of those systems. Prior work, however, usually treats table detection (TD) and table structure recognition (TSR) as independent problems, which limits overall efficiency and accuracy.

To overcome this, the authors propose an end-to-end pipeline that integrates deep learning models (DETR, CascadeTabNet, and PP-OCR v2) for comprehensive image-based table recognition. The integrated method handles the varied table styles, complex structures, and image distortions commonly found in document images. The system performs table detection (TD), table structure recognition (TSR), and table content recognition (TCR) together, preserving the table structure while extracting the tabular data from document images.

The main contributions of the paper are:

1. A novel integrated pipeline that combines three state-of-the-art models to achieve end-to-end table recognition from image-based data.
2. Experiments and evaluations reporting that the integrated pipeline outperforms existing methods in the accuracy and efficiency of table recognition, especially in handling complex table structures and extracting tabular data.

In summary, the paper aims to improve the extraction and understanding of tabular data from document images by unifying the detection, structure recognition, and content recognition stages.
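The hand-off between the three stages (TD, then TSR, then TCR) can be sketched as a small composition of callables. This is an illustrative sketch only, not the authors' implementation: `TablePipeline`, `crop`, and the three stage signatures are hypothetical names, and in a real system each callable would wrap a trained model rather than the plain functions assumed here. Images are represented as nested lists of pixel values to keep the example dependency-free.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# Assumed conventions for this sketch:
# a box is (x_min, y_min, x_max, y_max) in pixels, an image is a list of pixel rows.
Box = Tuple[int, int, int, int]
Image = List[List[int]]


def crop(image: Image, box: Box) -> Image:
    """Return the sub-image covered by `box`."""
    x0, y0, x1, y1 = box
    return [row[x0:x1] for row in image[y0:y1]]


@dataclass
class TablePipeline:
    # Hypothetical stage interfaces; each would wrap a model in practice.
    detect_tables: Callable[[Image], List[Box]]              # TD: page -> table boxes
    recognize_structure: Callable[[Image], List[List[Box]]]  # TSR: table crop -> rows of cell boxes
    read_text: Callable[[Image], str]                        # TCR: cell crop -> text

    def run(self, page: Image) -> List[List[List[str]]]:
        """Return one grid of cell strings per detected table."""
        tables = []
        for table_box in self.detect_tables(page):
            table_img = crop(page, table_box)
            # Cell boxes are assumed to be relative to the table crop.
            grid = self.recognize_structure(table_img)
            tables.append(
                [[self.read_text(crop(table_img, cell)) for cell in row] for row in grid]
            )
        return tables
```

Keeping the stages behind plain callables means any one of them can be swapped or evaluated in isolation, while `run` preserves the row and column layout that the structure stage returns.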

Multi-Type-TD-TSR -- Extracting Tables from Document Images using a Multi-stage Pipeline for Table Detection and Table Structure Recognition: from OCR to Structured Table Representations

TableFormer: Table Structure Understanding with Transformers

UTTSR: A Novel Non-Structured Text Table Recognition Model Powered by Deep Learning Technology

TableDet: An end-to-end deep learning approach for table detection and table image classification in data sheet images

High-Performance Transformers for Table Structure Recognition Need Early Convolutions

Table of Contents Recognition in OCR Documents using Image-based Machine Learning

Table Structure Recognition with Conditional Attention

Table Structure Recognition using Top-Down and Bottom-Up Cues

TableNet: Deep Learning model for end-to-end Table detection and Tabular data extraction from Scanned Document Images

Rethinking Detection Based Table Structure Recognition for Visually Rich Document Images

CasTabDetectoRS: Cascade Network for Table Detection in Document Images with Recursive Feature Pyramid and Switchable Atrous Convolution

TSRDet: A Table Structure Recognition Method Based on Row-Column Detection

A Conglomerate of Multiple OCR Table Detection and Extraction

Image-based table recognition: data, model, and evaluation

Divide Rows and Conquer Cells: Towards Structure Recognition for Large Tables

Robust Table Structure Recognition with Dynamic Queries Enhanced Detection Transformer

HybridTabNet: Towards Better Table Detection in Scanned Document Images

TabAug: Data Driven Augmentation for Enhanced Table Structure Recognition