Abstract: Tables are among the most widely used and important data carriers because they are highly structured, readable, and flexible. In practice, however, tables appear in many forms, such as Excel files and images. The information in a table image cannot be read directly, let alone processed further, so research on image-based table recognition is crucial. It comprises table structure recognition and table content recognition. Of the two, table structure recognition is the more important and difficult task, because table structure is abstract and variable. To address this problem, we propose TSRDet (Table Structure Recognition based on object Detection), a table structure recognition method that consists of a row-column detection network, SACNet (StripAttention-CenterNet), and a corresponding post-processing step. SACNet improves on the original CenterNet in two ways. First, we introduce the Swin Transformer as the encoder to obtain a global feature map of the image. Second, we propose a plug-and-play row-column attention module, consisting of a channel attention module and a row-column spatial attention module, which improves row and column detection accuracy by capturing long-range row-column features in the image. After row-column detection, a simple and fast post-processing step generates the table structure from the detected rows and columns. Experimental results show that SACNet detects rows and columns accurately even at a high IoU threshold: at a threshold of 0.75, its mAP is 91.40% for row detection and 92.73% for column detection. In comparative experiments with existing object detection methods, SACNet significantly outperformed all of them. For table structure recognition, TSRDet achieves a TEDS-Struct score of 95.7%, which is competitive and supports the soundness and effectiveness of the proposed method.
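
The abstract does not spell out the post-processing step, so the following is only a minimal sketch of one plausible approach, not the authors' implementation. It assumes that detected rows and columns arrive as axis-aligned boxes `(x_min, y_min, x_max, y_max)` and that every cell is the overlap of one row with one column; spanning (merged) cells are not handled. The function name `cells_from_rows_and_columns` and the `Box` alias are hypothetical.

```python
from typing import List, Optional, Tuple

# Hypothetical box format: (x_min, y_min, x_max, y_max) in image pixels.
Box = Tuple[float, float, float, float]


def cells_from_rows_and_columns(
    rows: List[Box], cols: List[Box]
) -> List[List[Optional[Box]]]:
    """Build a grid of cell boxes by intersecting row boxes with column boxes.

    Assumption: each cell is the overlap of exactly one row and one column.
    Entries are None where a row and a column do not overlap.
    """
    # Sort rows top-to-bottom and columns left-to-right so that
    # grid[i][j] is the cell in the i-th row and j-th column.
    rows = sorted(rows, key=lambda b: b[1])
    cols = sorted(cols, key=lambda b: b[0])

    grid: List[List[Optional[Box]]] = []
    for rx0, ry0, rx1, ry1 in rows:
        row_cells: List[Optional[Box]] = []
        for cx0, cy0, cx1, cy1 in cols:
            # The cell is the rectangular intersection of the row and column.
            x0, y0 = max(rx0, cx0), max(ry0, cy0)
            x1, y1 = min(rx1, cx1), min(ry1, cy1)
            if x1 > x0 and y1 > y0:
                row_cells.append((x0, y0, x1, y1))
            else:
                row_cells.append(None)
        grid.append(row_cells)
    return grid


# Usage: two detected rows and two detected columns yield a 2x2 grid.
example_rows = [(0, 20, 100, 40), (0, 0, 100, 20)]
example_cols = [(50, 0, 100, 40), (0, 0, 50, 40)]
example_grid = cells_from_rows_and_columns(example_rows, example_cols)
```

A real pipeline would also need to filter low-confidence detections and recover cells that span several rows or columns, which this sketch leaves out.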

TOC Structure Extraction from OCR-ed Books

Extracting Web Content by Exploiting Multi-Category Characteristics

Table of Contents Recognition in OCR Documents using Image-based Machine Learning

Exploiting Multi-Category Characteristics and Unified Framework to Extract Web Content

Hierarchical Logical Structure Extraction of Book Documents by Analyzing Tables of Contents

Structure Extraction from PDF-based Book Documents

Analysis of Book Documents' Table of Content Based on Clustering

Web Information Segmentation Method Based on DOM Structure Tree

Table Structure Recognition using Top-Down and Bottom-Up Cues

A Table of Content Recognition Method of Book Documents Based on Clustering Techniques

TC-OCR: TableCraft OCR for Efficient Detection & Recognition of Table Structure & Content

ClusTi: Clustering Method for Table Structure Recognition in Scanned Images

An OpenCV-based Framework for Table Information Extraction

A Scalable Framework for Table of Contents Extraction from Complex ESG Annual Reports

TSRDet: A Table Structure Recognition Method Based on Row-Column Detection

Multimodal Tree Decoder for Table of Contents Extraction in Document Images

A Conglomerate of Multiple OCR Table Detection and Extraction

Table of Content detection using Machine Learning

Complicated Table Structure Recognition

Table Structure Recognition with Conditional Attention

CMSOF: a Structured Data Organization Framework for Scanned Chinese Medicine Books in Digital Libraries