Abstract: As one of the most widely used data carriers, tables are highly structured, readable, and flexible. In practice, however, tables appear in various forms, such as Excel files and images. The information in a table image cannot be read directly, let alone processed further, so research on image-based table recognition is crucial. Table recognition comprises table structure recognition and table content recognition. Of the two, table structure recognition is the more important and more difficult task, because table structure is abstract and variable. To address this problem, we propose a table structure recognition method named TSRDet (Table Structure Recognition based on object Detection). It consists of a row-column detection method, named SACNet (StripAttention-CenterNet), and the corresponding post-processing. SACNet improves on the original CenterNet in two ways. First, we introduce the Swin Transformer as the encoder to obtain a global feature map of the image. Second, we propose a plug-and-play row-column attention module, comprising a channel attention module and a row-column spatial attention module, which improves the detection accuracy of rows and columns by capturing long-range row-column features in the image. We also design a simple and fast post-processing step that generates the table structure from the row-column detection results. Experimental results show that SACNet achieves high row-column detection accuracy even at a high IoU threshold: at a threshold of 0.75, its mAP for row detection and column detection is 91.40% and 92.73%, respectively. In comparative experiments with existing object detection methods, SACNet performed significantly better than all the others. For table structure recognition, TSRDet achieves a TEDS-Struct score of 95.7%, which is competitive and verifies the soundness and effectiveness of the proposed method.
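To make the two quantities in the abstract concrete, the following is a minimal Python sketch of (a) the IoU measure used when matching detected rows or columns to ground truth at a threshold such as 0.75, and (b) one simple way to turn row and column boxes into a grid of cells by intersecting them. This is an illustration under stated assumptions, not the post-processing used by TSRDet: the box format `(x1, y1, x2, y2)` and the function names `iou` and `cells_from_rows_and_columns` are hypothetical, and spanning (merged) cells are not handled.

```python
from typing import List, Tuple

# Hypothetical box format: (x1, y1, x2, y2) with the origin at the top-left.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def intersect(r: Box, c: Box) -> Box:
    """Overlap region of a row box and a column box."""
    return (max(r[0], c[0]), max(r[1], c[1]), min(r[2], c[2]), min(r[3], c[3]))


def cells_from_rows_and_columns(rows: List[Box], cols: List[Box]) -> List[List[Box]]:
    """Build a grid of cell boxes, one per (row, column) pair.

    Assumes a regular grid: every row crosses every column,
    so merged cells are not represented.
    """
    rows = sorted(rows, key=lambda b: b[1])  # top to bottom
    cols = sorted(cols, key=lambda b: b[0])  # left to right
    return [[intersect(r, c) for c in cols] for r in rows]


if __name__ == "__main__":
    rows = [(0, 20, 100, 40), (0, 0, 100, 20)]   # given out of order on purpose
    cols = [(0, 0, 50, 40), (50, 0, 100, 40)]
    grid = cells_from_rows_and_columns(rows, cols)
    print(grid[0][0], grid[1][1])
    print(iou((0, 0, 10, 10), (0, 0, 10, 5)))
```

A detected row or column would count as correct at the 0.75 threshold only if its `iou` with a ground-truth box is at least 0.75, which is why high mAP at that threshold indicates tight localization.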

Divide Rows and Conquer Cells: Towards Structure Recognition for Large Tables

High-Performance Transformers for Table Structure Recognition Need Early Convolutions

Robust Table Structure Recognition with Dynamic Queries Enhanced Detection Transformer

Rethinking Detection Based Table Structure Recognition for Visually Rich Document Images

TableFormer: Table Structure Understanding with Transformers

TSRDet: A Table Structure Recognition Method Based on Row-Column Detection

Self-Supervised Pre-Training for Table Structure Recognition Transformer

Table Structure Recognition with Conditional Attention

UTTSR: A Novel Non-Structured Text Table Recognition Model Powered by Deep Learning Technology

Complicated Table Structure Recognition

Improving Table Structure Recognition with Visual-Alignment Sequential Coordinate Modeling

A Deep Semantic Segmentation Model for Image-based Table Structure Recognition

TC-OCR: TableCraft OCR for Efficient Detection & Recognition of Table Structure & Content

UniTable: Towards a Unified Framework for Table Structure Recognition via Self-Supervised Pretraining

Split, embed and merge: An accurate table structure recognizer

Multi-Type-TD-TSR -- Extracting Tables from Document Images using a Multi-stage Pipeline for Table Detection and Table Structure Recognition: from OCR to Structured Table Representations

Table-to-Text Generation with Effective Hierarchical Encoder on Three Dimensions (Row, Column and Time)

Parsing Table Structures in the Wild

Multi-Cell Decoder and Mutual Learning for Table Structure and Character Recognition

Rethinking Table Structure Recognition Using Sequence Labeling Methods