Data Quality Management In Institutional Research Output Data Center

Xiaohua Shi, Zhuoyuan Xing, Hongtao Lu
DOI: https://doi.org/10.1007/978-3-030-18590-9_10
2019-01-01
Abstract: An institutional research output data center stores standardized and reliable data on scholars' research output. It supports dynamic presentation of research output, reveals institutional academic publications along multiple dimensions, advances open access, and provides data support for subject evaluation and discipline development. In this paper, we propose a data quality management framework for building an institutional research output data center, and put forward technical solutions for several data governance problems: department name similarity estimation in data matching, author name disambiguation in data merging, and security issues in data exchange. We also introduce learning algorithms such as text distance and community detection with matrix factorization. Compared with alternative approaches, our methods achieve good performance in data quality management.
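The abstract does not say which text distance the authors use for department name matching. As an illustration only, the sketch below assumes a normalized Levenshtein (edit) distance; the function names `levenshtein`, `normalize`, and `name_similarity` are hypothetical and not taken from the paper.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance between a and b (insert, delete, substitute cost 1 each)."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def normalize(name: str) -> str:
    """Lowercase, drop periods and commas, collapse whitespace (assumed cleanup)."""
    return " ".join(name.lower().replace(".", " ").replace(",", " ").split())


def name_similarity(a: str, b: str) -> float:
    """Similarity in [0, 1]; 1.0 means identical after normalization."""
    a, b = normalize(a), normalize(b)
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


if __name__ == "__main__":
    print(name_similarity("Dept. of Physics", "dept of physics"))
    print(name_similarity("School of Mathematics", "School of Mathematical Sciences"))
```

In a matching pipeline, two department strings whose similarity exceeds a chosen threshold would be treated as candidates for the same unit; the threshold is a tuning choice and is not specified in the abstract.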