Fostering the integration of European Open Data into Data Spaces through High-Quality Metadata

Javier Conde, Alejandro Pozo, Andrés Munoz-Arcentales, Johnny Choque, Álvaro Alonso
2024-02-09
Abstract: The term Data Space, understood as the secure exchange of data in distributed systems, ensuring openness, transparency, decentralization, sovereignty, and interoperability of information, has gained importance in recent years. However, Data Spaces are in an initial phase of definition, and new research is necessary to address their requirements. The Open Data ecosystem can be understood as one of the precursors of Data Spaces, as it provides mechanisms to ensure the interoperability of information through resource discovery, information exchange, and aggregation via metadata. However, Data Spaces require more advanced capabilities, including the automatic and scalable generation and publication of high-quality metadata. In this work, we present a set of software tools that facilitate the automatic generation and publication of metadata, the modeling of datasets through standards, and the assessment of the quality of the generated metadata. We validate all these tools through the YODA Open Data Portal, showing how they can be connected to integrate Open Data into Data Spaces.
Databases
What problem does this paper attempt to address?
This paper addresses how to better integrate European Open Data into Data Spaces. As the data economy grows, Data Spaces are becoming increasingly important as an infrastructure for secure data exchange that ensures openness, transparency, decentralization, sovereignty, and interoperability. Open Data is a precursor to Data Spaces because it already provides mechanisms for data exchange and interoperability. To meet the requirements of Data Spaces, however, more advanced capabilities are needed, in particular the automated and scalable generation and publication of high-quality metadata. The authors developed a set of software tools that automate data publishing, generate metadata compliant with the DCAT standard, and assess metadata quality. They validated these tools on the YODA Open Data Portal, where the tools improved metadata quality and, according to the authors, made YODA the best open data portal in Europe. The paper analyzes the obstacles facing the Open Data ecosystem, proposes solutions, and validates the proposed tools in real-world scenarios. It also reviews the history of Open Data and Data Spaces, along with related concepts and technologies such as open data portals, metadata, data quality management, and the FAIR principles. Finally, it discusses the role of European Open Data in Data Spaces, emphasizes the importance of high-quality metadata for seamless integration, and outlines future research directions.
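To make the idea of DCAT-compliant metadata and quality assessment more concrete, the following is a minimal Python sketch, not the authors' tools. It builds a dataset description using a few real DCAT and Dublin Core property names, then computes a simple completeness ratio. The function names, the list of checked properties, and the scoring rule are all assumptions made for illustration; the paper's actual quality metrics are not reproduced here.

```python
import json

# Hypothetical list of properties to check; an assumption for illustration,
# not the quality metric used in the paper.
CHECKED_PROPERTIES = [
    "dct:title",
    "dct:description",
    "dcat:keyword",
    "dct:license",
    "dcat:distribution",
]


def build_dataset_metadata(title, description, keywords,
                           license_url=None, download_url=None):
    """Return a JSON-LD-style dict describing a dataset with DCAT terms."""
    record = {
        "@context": {
            "dcat": "http://www.w3.org/ns/dcat#",
            "dct": "http://purl.org/dc/terms/",
        },
        "@type": "dcat:Dataset",
        "dct:title": title,
        "dct:description": description,
        "dcat:keyword": list(keywords),
    }
    # Optional properties are only emitted when a value is supplied.
    if license_url:
        record["dct:license"] = license_url
    if download_url:
        record["dcat:distribution"] = [
            {"@type": "dcat:Distribution", "dcat:downloadURL": download_url}
        ]
    return record


def completeness_score(record):
    """Fraction of checked properties that are present and non-empty."""
    present = sum(1 for prop in CHECKED_PROPERTIES if record.get(prop))
    return present / len(CHECKED_PROPERTIES)


if __name__ == "__main__":
    # Example values are placeholders, not data from the paper.
    metadata = build_dataset_metadata(
        title="Air quality measurements",
        description="Hourly sensor readings.",
        keywords=["air", "sensors"],
    )
    print(json.dumps(metadata, indent=2))
    print("completeness:", completeness_score(metadata))
```

In this sketch, a record without a license or distribution is flagged as incomplete, which is the kind of gap an automated quality check is meant to surface before the metadata is published.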