Web Table Column Type Detection Using Deep Learning and Probability Graph Model.

Tong Guo, Derong Shen, Tiezheng Nie, Yue Kou
DOI: https://doi.org/10.1007/978-3-030-60029-7_37
2020-01-01
Abstract: The rich knowledge contained on the web plays an important role in research and practical applications, including web search, multi-question answering, and knowledge base construction. Correctly detecting the semantic types of all data columns is critical to understanding a web table. Traditional methods have the following limitations: (1) most of them rely on dictionary lookup and regular expression matching, and are generally not robust to dirty data; (2) they consider only character data and ignore numeric data, which accounts for a large proportion of the data; (3) some models use the characteristics of a single column and do not consider the special organizational structure of the table. In this paper, a column type detection method combining deep learning and a probability graph model is proposed, taking both the semantic features of a single column and the interaction between multiple columns into account to improve prediction accuracy. Experimental results show that our method achieves higher accuracy than state-of-the-art approaches.
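The idea of combining per-column predictions with inter-column interactions can be illustrated with a minimal sketch. It is not the authors' implementation: the type names, the per-column scores (standing in for the output of a single-column deep model), the pairwise compatibility weights, and the brute-force search over label assignments are all assumptions chosen for illustration.

```python
from itertools import combinations, product
from math import prod

# Hypothetical candidate types; not taken from the paper.
TYPES = ["city", "country", "population"]

# Illustrative per-column scores, standing in for a single-column model's output.
UNARY = [
    {"city": 0.6, "country": 0.3, "population": 0.1},    # column 0
    {"city": 0.5, "country": 0.45, "population": 0.05},  # column 1
    {"city": 0.1, "country": 0.1, "population": 0.8},    # column 2
]

# Illustrative pairwise weights: two columns of the same type are penalized,
# a city column next to a country column is rewarded. Unlisted pairs get 1.0.
PAIRWISE = {
    frozenset(["city"]): 0.2,
    frozenset(["country"]): 0.2,
    frozenset(["population"]): 0.2,
    frozenset(["city", "country"]): 2.0,
}


def compatibility(a, b):
    """Return the weight for two column types appearing in the same table."""
    return PAIRWISE.get(frozenset((a, b)), 1.0)


def independent_labels(unary):
    """Label each column by its own best score, ignoring the other columns."""
    return [max(scores, key=scores.get) for scores in unary]


def joint_score(labels, unary):
    """Multiply per-column scores by the weights of every column pair."""
    score = prod(unary[i][t] for i, t in enumerate(labels))
    for a, b in combinations(labels, 2):
        score *= compatibility(a, b)
    return score


def joint_labels(unary):
    """Exhaustively search all label assignments for the highest joint score."""
    best = max(product(TYPES, repeat=len(unary)),
               key=lambda labels: joint_score(labels, unary))
    return list(best)


print(independent_labels(UNARY))
print(joint_labels(UNARY))
```

With these illustrative numbers, labeling each column independently assigns "city" to both of the first two columns, while the joint score prefers "city" and "country" because the pairwise weights penalize duplicate types. Exhaustive search is only practical for a handful of columns and types; a real graphical model would rely on an inference algorithm suited to larger tables.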