Abstract: Tables are among the most commonly used and important data carriers: they are highly structured, readable, and flexible. In practice, however, tables appear in many forms, such as Excel files and images. The information in a table image cannot be read directly, let alone processed further, so research on image-based table recognition is crucial. It comprises table structure recognition and table content recognition. Of the two, table structure recognition is the more important and more difficult task, because table structure is abstract and variable. To address this problem, we propose a table structure recognition method named TSRDet (Table Structure Recognition based on object Detection). It consists of a row-column detection method, named SACNet (StripAttention-CenterNet), and the corresponding post-processing. SACNet is an improved version of the original CenterNet. First, we introduce the Swin Transformer as the encoder to obtain a global feature map of the image. Second, we propose a plug-and-play row-column attention module, consisting of a channel attention module and a row-column spatial attention module, which improves the detection accuracy for rows and columns by capturing long-range row-column features in the image. After row-column detection, a simple and fast post-processing step generates the table structure from the detection results. Experimental results show that SACNet achieves high row-column detection accuracy even at a high IoU threshold: at a threshold of 0.75, its mAP still exceeds 90%, reaching 91.40% for row detection and 92.73% for column detection. In addition, in comparative experiments with existing object detection methods, SACNet performed significantly better than all of the others. For table structure recognition, TSRDet achieves a TEDS-Struct score of 95.7%, which is competitive and supports the soundness and advantages of the proposed method.
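
The abstract does not spell out the post-processing, so the following is only a minimal sketch of one plausible reading: each cell is taken as the intersection of a detected row box with a detected column box, and a detection counts as correct when its IoU with a ground-truth box reaches the 0.75 threshold mentioned above. The names `iou` and `build_cell_grid`, the `(x1, y1, x2, y2)` box format, and the assumption of a regular grid without spanning cells are all illustrative and are not taken from the paper.

```python
from typing import List, Tuple

# Assumed box format: (x1, y1, x2, y2) in pixels, with x1 < x2 and y1 < y2.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def build_cell_grid(rows: List[Box], cols: List[Box]) -> List[List[Box]]:
    """Hypothetical post-processing: intersect every row with every column.

    Rows are ordered top to bottom and columns left to right, so
    grid[i][j] is the cell in row i and column j. Spanning (merged)
    cells are not handled in this sketch.
    """
    rows = sorted(rows, key=lambda b: b[1])  # sort by top edge
    cols = sorted(cols, key=lambda b: b[0])  # sort by left edge
    grid = []
    for r in rows:
        grid.append([
            (max(r[0], c[0]), max(r[1], c[1]), min(r[2], c[2]), min(r[3], c[3]))
            for c in cols
        ])
    return grid


if __name__ == "__main__":
    # Two rows (given out of order) and two columns of a 100x50 table.
    detected_rows = [(0, 20, 100, 50), (0, 0, 100, 20)]
    detected_cols = [(0, 0, 40, 50), (40, 0, 100, 50)]
    for row_cells in build_cell_grid(detected_rows, detected_cols):
        print(row_cells)

    # A predicted row that covers 16 of the 20 ground-truth pixels in height.
    print(iou((0, 0, 100, 20), (0, 0, 100, 16)) >= 0.75)
```

In this sketch, a predicted row that is only slightly too short still passes the 0.75 threshold, while a larger miss does not, which is why accuracy at this threshold is a stricter test of localization than the common 0.5 setting.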

Complex Table Structure Recognition in the Wild Using Transformer and Identity Matrix-Based Augmentation

Divide Rows and Conquer Cells: Towards Structure Recognition for Large Tables

TableFormer: Table Structure Understanding with Transformers

High-Performance Transformers for Table Structure Recognition Need Early Convolutions

TSRFormer: Table Structure Recognition with Transformers

TC-OCR: TableCraft OCR for Efficient Detection & Recognition of Table Structure & Content

Self-Supervised Pre-Training for Table Structure Recognition Transformer

Robust Table Structure Recognition with Dynamic Queries Enhanced Detection Transformer

Complicated Table Structure Recognition

Rethinking Detection Based Table Structure Recognition for Visually Rich Document Images

TabAug: Data Driven Augmentation for Enhanced Table Structure Recognition

TSRDet: A Table Structure Recognition Method Based on Row-Column Detection

TableVLM: Multi-modal Pre-training for Table Structure Recognition

Split, embed and merge: An accurate table structure recognizer

Table Structure Recognition with Conditional Attention

UTTSR: A Novel Non-Structured Text Table Recognition Model Powered by Deep Learning Technology

A Deep Semantic Segmentation Model for Image-based Table Structure Recognition

Multi-Type-TD-TSR -- Extracting Tables from Document Images using a Multi-stage Pipeline for Table Detection and Table Structure Recognition: from OCR to Structured Table Representations

Table Structure Recognition using Top-Down and Bottom-Up Cues

Multi-Cell Decoder and Mutual Learning for Table Structure and Character Recognition