Abstract: Tables are among the most common and important data carriers: they are highly structured, readable, and flexible. In practice, however, tables appear in many forms, such as Excel files and images. The information in a table image cannot be read directly, let alone processed further, so research on image-based table recognition is crucial. Image-based table recognition comprises table structure recognition and table content recognition. Of the two, table structure recognition is the more important and difficult task, because table structure is abstract and variable. To address this problem, we propose a table structure recognition method named TSRDet (Table Structure Recognition based on object Detection). It consists of a row-column detection method, named SACNet (StripAttention-CenterNet), and a corresponding post-processing step. SACNet improves on the original CenterNet in two ways. First, we introduce the Swin Transformer as the encoder to obtain a global feature map of the image. Second, we propose a plug-and-play row-column attention module, comprising a channel attention module and a row-column spatial attention module, which improves the detection accuracy of rows and columns by capturing long-range row-column features in the image. After row-column detection, a simple and fast post-processing step generates the table structure from the detection results. Experimental results show that SACNet achieves high row-column detection accuracy, even at a high IoU threshold: at a threshold of 0.75, its mAP is 91.40% for row detection and 92.73% for column detection. In comparative experiments with existing object detection methods, SACNet significantly outperforms all of them. For table structure recognition, TSRDet achieves a TEDS-Struct score of 95.7%, which is competitive and verifies the soundness and effectiveness of the proposed method.
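The abstract names two ideas that a small example can make concrete: the IoU threshold used when scoring row and column detections, and a post-processing step that turns row and column boxes into a table structure. The sketch below is illustrative only and is not the TSRDet post-processing, whose details are not given here. It assumes axis-aligned boxes in `(x1, y1, x2, y2)` form and a simple grid with no spanning cells. The names `iou` and `cells_from_rows_and_columns` are hypothetical.

```python
from typing import List, Tuple

# Assumed box format: (x1, y1, x2, y2), with x2 > x1 and y2 > y1.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def cells_from_rows_and_columns(rows: List[Box], cols: List[Box]) -> List[List[Box]]:
    """Build a grid of cell boxes by intersecting row and column boxes.

    Hypothetical simplification: every row crosses every column, so
    merged (spanning) cells are not handled.
    """
    rows_sorted = sorted(rows, key=lambda b: b[1])  # top to bottom
    cols_sorted = sorted(cols, key=lambda b: b[0])  # left to right
    grid = []
    for r in rows_sorted:
        # Horizontal extent comes from the column, vertical from the row.
        grid.append([(c[0], r[1], c[2], r[3]) for c in cols_sorted])
    return grid


# Illustrative detections: two rows and two columns, given out of order.
rows = [(0, 20, 100, 40), (0, 0, 100, 20)]
cols = [(50, 0, 100, 40), (0, 0, 50, 40)]
grid = cells_from_rows_and_columns(rows, cols)
```

Under a 0.75 IoU threshold, a predicted row or column box would count as correct only if `iou(pred, truth) >= 0.75`. This is a much stricter test for long, thin row and column boxes than for roughly square objects, because a small offset across the narrow side removes a large share of the overlap.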

TSRDet: A Table Structure Recognition Method Based on Row-Column Detection
