Summarizing Relational Database Schema Based On Label Propagation

Xiaojie Yuan, Xinkun Li, Man Yu, Xiangrui Cai, Ying Zhang, Yanlong Wen
DOI: https://doi.org/10.1007/978-3-319-11116-2_23
2014-01-01
Abstract: Real enterprise databases often comprise hundreds of tables, which makes querying a complex database difficult for non-expert users, especially when documentation is lacking. Schema summarization improves the usability of a database by providing a succinct overview of the entire schema. In this paper, we introduce a novel three-step schema summarization method based on label propagation. First, we exploit various similarity properties of the database schema and propose a table similarity measure based on the Radial Basis Function (RBF) kernel, which captures these properties comprehensively. Second, we select representative tables as labeled data and annotate the schema graph with them. Finally, we run a label propagation algorithm on the labeled schema graph to classify the database schema and create a schema summary. Extensive evaluations demonstrate the effectiveness of our approach.
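The three steps can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes each table is already described by a numeric feature vector (the abstract does not say how the similarity properties are encoded), uses the standard RBF kernel exp(-||x - y||^2 / (2 * sigma^2)) as the table similarity, takes the representative tables as given rather than computing them, and runs a generic iterative label propagation with clamped seeds. The names `rbf_similarity`, `propagate_labels`, `features`, `seeds`, and `sigma` are all hypothetical.

```python
import math


def rbf_similarity(x, y, sigma=1.0):
    """RBF kernel between two feature vectors: exp(-||x - y||^2 / (2 * sigma^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def propagate_labels(features, seeds, sigma=1.0, iterations=100):
    """Assign every table to the cluster of one of the seed tables.

    features: one feature vector per table (assumed encoding, not from the paper).
    seeds:    dict mapping the index of a representative table to its cluster label.
    Returns a list with one cluster label per table.
    """
    n = len(features)
    clusters = sorted(set(seeds.values()))
    k = len(clusters)

    # Step 1: pairwise table similarity (no self-loops), row-normalized so that
    # each row is a probability distribution over neighbouring tables.
    transition = []
    for i in range(n):
        row = [0.0 if i == j else rbf_similarity(features[i], features[j], sigma)
               for j in range(n)]
        total = sum(row)
        transition.append([v / total if total > 0 else 0.0 for v in row])

    # Step 2: representative tables carry fixed one-hot label distributions.
    def clamp(dist):
        for idx, label in seeds.items():
            dist[idx] = [1.0 if c == label else 0.0 for c in clusters]

    dist = [[0.0] * k for _ in range(n)]
    clamp(dist)

    # Step 3: repeatedly spread label mass along similarity edges, then reset
    # the seed tables so their labels never change.
    for _ in range(iterations):
        dist = [[sum(transition[i][j] * dist[j][c] for j in range(n))
                 for c in range(k)]
                for i in range(n)]
        clamp(dist)

    # Each table takes the cluster with the largest accumulated label mass.
    return [clusters[max(range(k), key=lambda c: dist[i][c])] for i in range(n)]


if __name__ == "__main__":
    # Six hypothetical tables forming two well-separated groups;
    # tables 0 and 3 play the role of representative tables.
    features = [[0.0, 0.0], [0.1, 0.0], [0.2, 0.1],
                [5.0, 5.0], [5.1, 5.0], [4.9, 5.2]]
    print(propagate_labels(features, seeds={0: "A", 3: "B"}))
```

The resulting clusters, each named after its seed table, would form the schema summary. How the paper selects representative tables and combines several similarity properties into one measure is not stated in the abstract, so both are left as inputs here.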