Untagged Table Extraction in Semi-structured Documents

SONG Qiang, XU Peng, LI Juanzi
DOI: https://doi.org/10.3969/j.issn.1000-3428.2005.18.029
2005-01-01
Abstract: Based on a data model of the untagged table, this paper proposes an extraction algorithm that exploits the table's structural distribution features within documents. The algorithm splits the untagged table into rows and columns, then infers headers and merges cells. Experimental results indicate that the accuracy of the algorithm is satisfactory.
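
The abstract names three steps (splitting into rows and columns, inferring headers, merging cells) without giving details. As a rough illustration only, and not the authors' algorithm, the following Python sketch shows one way such a pipeline might look for a whitespace-aligned plain-text table. All function names, the `min_gap` parameter, and the heuristics (columns are separated by runs of blank characters shared by every line, the first logical row is the header, and a line with an empty first cell continues the previous row) are assumptions made for this example.

```python
def find_column_spans(lines, min_gap=2):
    """Return (start, end) character spans of columns (hypothetical helper).

    Assumption: a column boundary is a run of at least `min_gap` character
    positions that are blank in every line.
    """
    width = max(len(line) for line in lines)
    padded = [line.ljust(width) for line in lines]
    # A position is occupied if any line has a non-space character there.
    occupied = [any(row[i] != " " for row in padded) for i in range(width)]

    spans, start = [], None
    for i, used in enumerate(occupied):
        if used and start is None:
            start = i
        elif not used and start is not None:
            spans.append((start, i))
            start = None
    if start is not None:
        spans.append((start, width))

    # Join spans separated by fewer than `min_gap` blanks, so that a single
    # space inside a cell does not split the column.
    joined = []
    for span in spans:
        if joined and span[0] - joined[-1][1] < min_gap:
            joined[-1] = (joined[-1][0], span[1])
        else:
            joined.append(span)
    return joined


def extract_table(text, min_gap=2):
    """Split text into rows and columns, merge wrapped cells, pick a header."""
    lines = [ln.rstrip() for ln in text.splitlines() if ln.strip()]
    if not lines:
        return [], []
    spans = find_column_spans(lines, min_gap)
    rows = [[ln[s:e].strip() for s, e in spans] for ln in lines]

    merged = []
    for row in rows:
        if merged and row[0] == "":
            # Assumed heuristic: an empty first cell marks a continuation
            # line, so its text is appended to the previous row's cells.
            prev = merged[-1]
            merged[-1] = [(p + " " + c).strip() for p, c in zip(prev, row)]
        else:
            merged.append(row)

    # Assumed heuristic: the first logical row is the header.
    return merged[0], merged[1:]
```

Calling `extract_table` on the text of a table returns a `(header, body)` pair of cell lists. Real untagged tables often need more robust cues than these (for example multi-line headers or ragged alignment), so the heuristics here should be read as placeholders.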