A Secure Multi-party Data Federation System

LI Shu-Yuan, JI Yu-Dian, SHI Ding-Yuan, LIAO Wang-Dong, ZHANG Li-Peng, TONG Yong-Xin, XU Ke
DOI: https://doi.org/10.21655/ijsi.1673-7288.00273
2022-01-01
International Journal of Software and Informatics
Fund Project: National Key Research and Development Program of China (2018AAA0101100); National Natural Science Foundation of China (61822201, U1811463, 62076017); CCF-Huawei Database System Innovation Research Plan (CCF-HuaweiDBIR2020008B); State Key Laboratory of Software Development Environment (Beihang University) Open Program (SKLSDE-2020ZX-07)
Abstract: In the era of big data, data is of great value as an essential factor of production. Analyzing, mining, and utilizing data at large scale through data sharing is therefore of great significance. However, because data is heterogeneous and dispersed and privacy protection regulations are increasingly rigorous, data owners cannot share data arbitrarily and thus become data silos. Since data federation can support collaborative queries while preserving the privacy of each data silo, this paper presents a secure multi-party relational data federation system based on the federated computation principle that "data stays, computation moves." The system is compatible with a variety of relational databases and shields users from the heterogeneity of the underlying data held by multiple data owners. On the basis of secret sharing, the system implements a secure multi-party operator library that supports the basic secure multi-party operations, and the result reconstruction process of the operators is optimized for higher execution efficiency.
On this basis, the system supports query operations such as summation (SUM), averaging (AVG), minimization/maximization (MIN/MAX), equi-join, and $\theta$-join, and it makes full use of multi-party features to reduce both the data interactions among data owners and the security overhead, thus effectively supporting efficient data sharing. Finally, experiments are conducted on the benchmark dataset TPC-H. The experimental results show that the system supports more data owners than the existing data federation systems SMCQL and Conclave and achieves higher execution efficiency across a variety of query operations, exceeding the existing systems by as much as 3.75 times.
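The abstract does not spell out the protocol behind the secret-sharing-based operators, so the following is only a minimal sketch of how a multi-party SUM could be computed with textbook additive secret sharing. It is not the system's actual operator library. The function names (`share`, `reconstruct`, `secure_sum`), the modulus, and the in-process simulation of the data owners are all assumptions made for illustration.

```python
import secrets

# Assumed public modulus; private values must be non-negative and smaller than it.
MODULUS = 2**61 - 1


def share(value, n_parties, modulus=MODULUS):
    """Split `value` into n additive shares that sum to `value` mod `modulus`."""
    shares = [secrets.randbelow(modulus) for _ in range(n_parties - 1)]
    last = (value - sum(shares)) % modulus
    return shares + [last]


def reconstruct(shares, modulus=MODULUS):
    """Recover the secret by adding all shares modulo `modulus`."""
    return sum(shares) % modulus


def secure_sum(private_values, modulus=MODULUS):
    """Simulate a SUM over one private value per data owner (hypothetical helper)."""
    n = len(private_values)
    # Owner i splits its value into n shares; share j would be sent to owner j.
    distributed = [share(v, n, modulus) for v in private_values]
    # Owner j adds the shares it received; a partial sum reveals no single input.
    partial_sums = [
        sum(distributed[i][j] for i in range(n)) % modulus for j in range(n)
    ]
    # Only the partial sums are combined to reconstruct the final result.
    return reconstruct(partial_sums, modulus)


if __name__ == "__main__":
    owners = [120, 45, 300]      # one private value per data owner
    total = secure_sum(owners)   # 465
    print(total, total / len(owners))
```

Under the same assumptions, AVG follows by dividing the reconstructed sum by a count obtained the same way. MIN/MAX and joins need secure comparison, which this sketch does not cover.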