A Deep Semantic Segmentation Model for Image-based Table Structure Recognition

Yajun Zou, Jinwen Ma
DOI: https://doi.org/10.1109/icsp48669.2020.9321003
2020-01-01
Abstract: Table structure recognition is a crucial step in automatic table information extraction. Conventional methods parse the rows, columns and cells of a table using features such as ruling lines or words. However, these methods are ineffective for image-based tables when ruling lines are not visible or when the words cannot be recognized by the OCR system. To overcome these problems, we propose a deep semantic segmentation model for image-based table structure recognition. Specifically, it is an end-to-end semantic segmentation neural network that produces a pixel-wise prediction map for an input table image, where the labels are row separator, column separator, cell content and background. By applying connected component analysis to the prediction map, we obtain more accurate bounding boxes for the row separators, column separators and cell contents. We then number the row and column separators in order by sorting their coordinates. This lets us make full use of the relative positions between the row/column separators and the cell contents, and assign a row number and a column number to each cell. Because training data is scarce, a large amount of synthetic data is automatically generated for our experiments. The experimental results demonstrate that the proposed model is suitable for various table types, achieving average F1 scores of 0.9769 and 0.9343 on a generative dataset when the IoU threshold is set to 0.6 and 0.8, respectively.
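The post-processing described in the abstract (connected component analysis on the prediction map, coordinate sorting of separators, then row/column assignment) could be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the label ids, the use of 4-connectivity, the merging of separator fragments that overlap along one axis, and the comparison of box centers are all assumptions made for the example, as are the function names.

```python
from collections import deque

# Hypothetical label ids; the abstract names the four classes but not their encoding.
BACKGROUND, ROW_SEP, COL_SEP, CELL = 0, 1, 2, 3


def component_boxes(label_map, target):
    """Bounding boxes (top, left, bottom, right) of 4-connected regions of `target`."""
    h, w = len(label_map), len(label_map[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if label_map[y][x] != target or seen[y][x]:
                continue
            # Breadth-first flood fill from an unvisited pixel of the target class.
            seen[y][x] = True
            queue = deque([(y, x)])
            top, left, bottom, right = y, x, y, x
            while queue:
                cy, cx = queue.popleft()
                top, bottom = min(top, cy), max(bottom, cy)
                left, right = min(left, cx), max(right, cx)
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and not seen[ny][nx] and label_map[ny][nx] == target):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


def merged_centers(intervals):
    """Sort 1-D extents, merge overlapping ones, and return their centers in order."""
    merged = []
    for lo, hi in sorted(intervals):
        if merged and lo <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], hi)
        else:
            merged.append([lo, hi])
    return [(lo + hi) / 2 for lo, hi in merged]


def assign_cells(label_map):
    """Return (box, row, col) per cell-content box, numbered by the separators before it."""
    # Assumption: a separator split into fragments (for example where a row
    # separator crosses a column separator) is rejoined by overlap along its axis.
    row_centers = merged_centers((t, b) for t, _, b, _ in component_boxes(label_map, ROW_SEP))
    col_centers = merged_centers((l, r) for _, l, _, r in component_boxes(label_map, COL_SEP))
    cells = []
    for t, l, b, r in component_boxes(label_map, CELL):
        cy, cx = (t + b) / 2, (l + r) / 2
        row = sum(1 for c in row_centers if c < cy)  # separators above the cell
        col = sum(1 for c in col_centers if c < cx)  # separators left of the cell
        cells.append(((t, l, b, r), row, col))
    return cells


# Toy 3x5 prediction map: one row separator, one column separator, four cells.
TOY_MAP = [
    [3, 3, 2, 3, 3],
    [1, 1, 1, 1, 1],
    [3, 0, 2, 0, 3],
]

if __name__ == "__main__":
    for box, row, col in assign_cells(TOY_MAP):
        print(box, "-> row", row, "col", col)
```

In this sketch a cell's row number is simply the count of row separators lying above its center, and likewise for columns, which is one straightforward way to use the relative positions the abstract mentions. A real prediction map would likely also need noise filtering, for example discarding very small components, which is omitted here.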