Public Data Integration with WebSmatch

R. Coletta, E. Castanier, P. Valduriez, C. Frisch, D. Ngo, Z. Bellahsene
DOI: https://doi.org/10.48550/arXiv.1205.2555
2012-05-15
Abstract: Integrating open data sources can yield high-value information but raises major problems in terms of metadata extraction, data source integration, and visualization of integrated data. In this paper, we describe WebSmatch, a flexible environment for Web data integration, based on a real, end-to-end data integration scenario over public data from Data Publica. WebSmatch supports the full process of importing, refining, and integrating data sources and uses third-party tools for high-quality visualization. We use a typical scenario of public data integration that involves problems not solved by current tools: poorly structured input data sources (XLS files) and rich visualization of integrated data.
Digital Libraries