Knowledge Graph-based Data Transformation Recommendation Engine

Satoru Watanabe, Garima Natani
DOI: https://doi.org/10.1109/BigData52589.2021.9671905
2021-12-15
Abstract: Demand for data transformation has increased with the rapid growth of data. To improve production line efficiency, Internet of Things (IoT) data analytics requires data transformations that change the format, structure, or values of the data stored in a data lake. Data transformations are generally not reused, owing to the complexity of data transformation files and the lack of knowledge of previous data transformation flows developed by other Extract, Transform, Load (ETL) developers. For a naive developer, it is time-consuming and difficult to find relevant existing data transformations that can be modified to meet new requirements. To solve this problem, we developed a knowledge graph-based data transformation recommendation system featuring a data similarity component that helps to provide explainable results. This system can improve the mean average precision by 26% and the mean average recall by 24% while reducing the mean average root mean squared error by 69%, which implies that it can significantly increase the effectiveness of the recommendations. With the help of this recommendation engine, data transformation tasks can be done by naive developers with little knowledge of existing data transformation flows, and continuous improvement projects can be sped up.
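To make the idea of similarity-based recommendation and its evaluation concrete, the following is a minimal sketch, not the authors' system. It assumes a hypothetical catalog that maps each existing transformation to the column names it consumes, ranks the catalog by Jaccard similarity to a new dataset's columns (a swapped-in measure; the abstract does not specify the paper's data similarity component), and scores one ranked list with average precision. Mean average precision is the mean of this score over many queries. No knowledge graph is modeled here, and all names and data are illustrative.

```python
from typing import Dict, List, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity between two sets of column names."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def recommend(query_columns: Set[str], catalog: Dict[str, Set[str]], k: int = 3) -> List[str]:
    """Return the top-k transformations, most similar first (ties broken by name)."""
    ranked = sorted(
        catalog,
        key=lambda name: (-jaccard(query_columns, catalog[name]), name),
    )
    return ranked[:k]


def average_precision(ranked: List[str], relevant: Set[str]) -> float:
    """Average precision of one ranked list against the set of relevant items."""
    hits = 0
    total = 0.0
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            total += hits / rank  # precision at this rank
    return total / len(relevant) if relevant else 0.0


# Hypothetical catalog: transformation name -> input column names.
catalog = {
    "clean_sensor_temps": {"device_id", "timestamp", "temperature"},
    "aggregate_line_output": {"line_id", "timestamp", "units"},
    "normalize_pressure": {"device_id", "timestamp", "pressure"},
}

# Columns of a new (hypothetical) IoT dataset that needs a transformation.
query = {"device_id", "timestamp", "temperature", "humidity"}

ranked = recommend(query, catalog)
print(ranked)
print(average_precision(ranked, {"normalize_pressure"}))
```

Because each recommendation comes with the overlapping column names that produced its score, a developer can see why a transformation was suggested, which is one simple reading of what "explainable results" could mean in practice.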
Computer Science