Abstractive Tabular Dataset Summarization via Knowledge Base Semantic Embeddings

Paul Azunre, Craig Corcoran, David Sullivan, Garrett Honke, Rebecca Ruppel, Sandeep Verma, Jonathon Morgan
DOI: https://doi.org/10.48550/arXiv.1804.01503
2018-04-05
Abstract: This paper describes an abstractive summarization method for tabular data which employs a knowledge base semantic embedding to generate the summary. Assuming the dataset contains descriptive text in headers, columns and/or some augmenting metadata, the system employs the embedding to recommend a subject/type for each text segment. Recommendations are aggregated into a small collection of super types considered to be descriptive of the dataset by exploiting the hierarchy of types in a pre-specified ontology. Using February 2015 Wikipedia as the knowledge base, and a corresponding DBpedia ontology as types, we present experimental results on open data taken from several sources (OpenML, CKAN and data.world) to illustrate the effectiveness of the approach.
Artificial Intelligence, Computation and Language
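
The aggregation step described in the abstract, where per-segment type recommendations are rolled up into a few super types using an ontology hierarchy, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the small `PARENT` hierarchy is a toy tree rather than the DBpedia ontology, the `recommendations` dictionary stands in for the output of the embedding, and the roll-up rule (vote for the ancestor one level below the root) is a simple placeholder, not necessarily the rule used in the paper.

```python
from collections import Counter

# Toy child -> parent map (hypothetical; NOT the actual DBpedia ontology).
PARENT = {
    "Agent": "Thing",
    "Person": "Agent",
    "Athlete": "Person",
    "Organisation": "Agent",
    "Place": "Thing",
    "PopulatedPlace": "Place",
    "City": "PopulatedPlace",
    "Country": "PopulatedPlace",
}


def ancestors(type_name):
    """Return the path from type_name up to the root, inclusive."""
    path = [type_name]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path


def super_type(type_name, depth=1):
    """Roll a type up to its ancestor `depth` levels below the root."""
    path = ancestors(type_name)[::-1]  # root first
    return path[min(depth, len(path) - 1)]


def summarize(recommendations, depth=1, top_k=2):
    """Count one vote per recommended type, after rolling it up."""
    votes = Counter()
    for types in recommendations.values():
        for t in types:
            votes[super_type(t, depth)] += 1
    return [name for name, _ in votes.most_common(top_k)]


# Stand-in for embedding output: text segment -> recommended types (assumed).
recommendations = {
    "player_name": ["Athlete", "Person"],
    "team": ["Organisation"],
    "home_city": ["City"],
    "country": ["Country"],
}

summary = summarize(recommendations)
print(summary)
```

With this toy input, the three person and organisation recommendations collapse into `Agent` and the two location recommendations into `Place`, so the dataset is summarized by those two super types. Raising `depth` would yield more specific, and typically more numerous, super types.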