A Distributed Data Fabric Architecture Based on Metadata Knowledge Graph

Xinchi Li, Mingchuan Yang, Xiaoqing Xia, Kaicheng Zhang, Kang Liu
DOI: https://doi.org/10.1109/dsit55514.2022.9943831
2022-01-01
Abstract: Like talent, capital, and land, data is becoming one of the market-oriented elements and is driving profound intelligent changes in traditional industries. Data governance is a key process for improving the availability, integrity, and security of data. For large group enterprises such as telecom operators, managing massive data scattered across different departments, facilities, regions, and systems is difficult. This paper proposes a distributed data fabric architecture based on a metadata knowledge graph to solve this problem. The functional architecture consists of six layers: the acquisition storage layer, data management layer, knowledge graph layer, data directory layer, model application layer, and user request layer. A feasible framework is designed to guide the implementation of this architecture, and several key enabling technologies are studied. Instead of storing the full data in a centralized data lake, the architecture centralizes only the metadata in the data warehouse. This avoids the problems of data privacy disclosure, long data turnover time, erroneous data entering the lake, and high operation and maintenance cost. Specifically, an enhanced data knowledge directory is constructed from the lower-level global metadata knowledge graph; it comprises a static data resource directory, a static business resource directory, and a dynamic enhanced knowledge directory generated automatically from user input. The proposed architecture has been applied in practice at China Telecom Group, where it alleviates, to a certain extent, the difficulty of querying massive data.
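The core idea of centralizing metadata while leaving the data at its source can be illustrated with a minimal sketch. It is not the authors' implementation: the class names (`TableMeta`, `MetadataGraph`), the predicate names, and the source systems (`billing-east`, `crm-north`) are all hypothetical, and a real deployment would presumably use a graph database rather than an in-memory set of triples.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class TableMeta:
    """Metadata for one table; the rows themselves stay at the source system."""
    source: str           # hypothetical source system name
    table: str
    columns: tuple
    business_domain: str  # hypothetical business classification


class MetadataGraph:
    """Toy central metadata store kept as (subject, predicate, object) triples."""

    def __init__(self):
        self.triples = set()
        self._by_object = defaultdict(set)  # (predicate, object) -> subjects

    def _add(self, subject, predicate, obj):
        self.triples.add((subject, predicate, obj))
        self._by_object[(predicate, obj)].add(subject)

    def register(self, meta):
        """Record only metadata about a table and return its node name."""
        node = f"{meta.source}.{meta.table}"
        self._add(node, "stored_in", meta.source)
        self._add(node, "belongs_to", meta.business_domain)
        for column in meta.columns:
            self._add(node, "has_column", column)
        return node

    def find(self, predicate, obj):
        """Directory-style lookup: which tables relate to obj via predicate?"""
        return sorted(self._by_object.get((predicate, obj), set()))


graph = MetadataGraph()
graph.register(TableMeta("billing-east", "invoices", ("user_id", "amount"), "finance"))
graph.register(TableMeta("crm-north", "customers", ("user_id", "region"), "sales"))

# A data resource view (by column) and a business resource view (by domain).
tables_with_user_id = graph.find("has_column", "user_id")
finance_tables = graph.find("belongs_to", "finance")
print(tables_with_user_id)
print(finance_tables)
```

In this sketch a query is answered from the central graph alone, and the result only tells the caller where the relevant data lives; fetching the rows would remain a separate request to the owning system.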