Tablepedia: Automating PDF Table Reading in an Experimental Evidence Exploration and Analytic System

Wenhao Yu, Zongze Li, Qingkai Zeng, Meng Jiang
DOI: https://doi.org/10.1145/3308558.3314118
2019-01-01
Abstract: Web research, data science, and artificial intelligence have been rapidly changing our lives and society. Researchers and practitioners in these fields spend a large amount of time reading literature and comparing existing approaches. It would significantly improve their efficiency if there were a system that extracted and managed experimental evidence (say, that a specific method achieves a specific score on a specific metric for a specific dataset) from tables in paper PDFs for search, exploration, and analytics. We build such a demonstration system, called Tablepedia, that uses rule-based and learning-based methods to automate the reading of PDF tables. It has three modules: template recognition, unification, and SQL operations. We implement three functions to facilitate research and practice: (1) finding related methods and datasets, (2) finding top-performing baseline methods, and (3) finding conflicting reported numbers. A screencast is available on Vimeo: https://vimeo.com/310162310
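To make the SQL-operations module more concrete, the following is a minimal sketch of how extracted evidence tuples might be stored and queried. It is not the authors' implementation: the `evidence` table, its column names, the helper functions `top_methods` and `conflicting_scores`, and all sample rows (`paper_A`, `MethodX`, `DatasetD`, and the scores) are hypothetical placeholders chosen for illustration, and the ranking assumes that a higher score is better.

```python
import sqlite3

# Hypothetical schema: one row per extracted evidence tuple
# (paper, method, dataset, metric, score). Not the schema used by Tablepedia.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE evidence "
    "(paper TEXT, method TEXT, dataset TEXT, metric TEXT, score REAL)"
)

# Made-up sample rows, for illustration only.
rows = [
    ("paper_A", "MethodX", "DatasetD", "F1", 0.81),
    ("paper_A", "MethodY", "DatasetD", "F1", 0.77),
    ("paper_B", "MethodX", "DatasetD", "F1", 0.79),
    ("paper_B", "MethodZ", "DatasetD", "F1", 0.84),
]
conn.executemany("INSERT INTO evidence VALUES (?, ?, ?, ?, ?)", rows)


def top_methods(dataset, metric, k=3):
    """Rank methods by best reported score (assumes higher is better)."""
    return conn.execute(
        "SELECT method, MAX(score) AS best FROM evidence "
        "WHERE dataset = ? AND metric = ? "
        "GROUP BY method ORDER BY best DESC LIMIT ?",
        (dataset, metric, k),
    ).fetchall()


def conflicting_scores():
    """Find (method, dataset, metric) triples reported with differing scores."""
    return conn.execute(
        "SELECT method, dataset, metric, MIN(score), MAX(score) FROM evidence "
        "GROUP BY method, dataset, metric "
        "HAVING COUNT(DISTINCT score) > 1"
    ).fetchall()
```

Under these assumptions, `top_methods("DatasetD", "F1")` corresponds to function (2), and `conflicting_scores()` corresponds to function (3) by flagging a method whose score on the same dataset and metric differs between papers.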