TableZa -- A classical Computer Vision approach to Tabular Extraction

Saumya Banthia, Anantha Sharma, Ravi Mangipudi
DOI: https://doi.org/10.48550/arXiv.2105.09137
2021-05-19
Computation and Language
Abstract: Computer-aided tabular data extraction has always been a challenging and error-prone task because it demands both spectral and spatial sanity of the data. In this paper we discuss an approach to tabular data extraction in the realm of document comprehension. Given the different kinds of tabular formats found across documents, we discuss a novel approach that uses computer vision to extract tabular data from images or from vector PDFs converted to images.