Two-Phase Multi-Party Computation Enabled Privacy-Preserving Federated Learning

Renuga Kanagavelu, Zengxiang Li, Juniarto Samsudin, Yechao Yang, Feng Yang, Rick Siow Mong Goh, Mervyn Cheah, Praewpiraya Wiwatphonthana, Khajonpong Akkarajitsakul, Shangguang Wangz
DOI: https://doi.org/10.48550/arXiv.2005.11901
2020-05-25
Distributed, Parallel, and Cluster Computing
Abstract: Countries across the globe have been pushing strict regulations on the protection of personal or private data. Traditional centralized machine learning, in which data is collected from end-users or IoT devices so that insights can be discovered from real-world data, may not be feasible for many data-driven industry applications in light of such regulations. A newer approach, coined by Google as Federated Learning (FL), enables multiple participants to train a machine learning model collectively without directly exchanging data. However, recent studies have shown that shared models can still be exploited to extract personal or confidential data. In this paper, we propose to adopt Multi-Party Computation (MPC) to achieve privacy-preserving model aggregation for FL. MPC-enabled model aggregation performed in a peer-to-peer manner incurs high communication overhead and scales poorly. To address this problem, we propose a two-phase mechanism that 1) elects a small committee and 2) provides MPC-enabled model aggregation service to a larger number of participants through the committee. The MPC-enabled FL framework has been integrated into an IoT platform for smart manufacturing. It enables a set of companies to train high-quality models collectively by leveraging their complementary datasets on their own premises, without compromising privacy, model accuracy relative to traditional machine learning methods, or execution efficiency in terms of communication cost and execution time.
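The two-phase idea in the abstract can be illustrated with a minimal sketch. The abstract does not specify the election protocol or the MPC scheme, so everything concrete here is an assumption made for illustration: the committee is picked at random, aggregation uses additive secret sharing over a prime field, and the names `PRIME`, `SCALE`, `elect_committee`, and `two_phase_aggregate` are hypothetical. The point it shows is that each participant sends one share per committee member rather than one per peer, and that no single committee member sees any individual model.

```python
import random

PRIME = 2**61 - 1   # assumed field modulus for the shares
SCALE = 10**6       # assumed fixed-point scale for encoding float weights


def encode(x):
    """Map a float weight to a field element (fixed-point)."""
    return int(round(x * SCALE)) % PRIME


def decode(v):
    """Map a field element back to a float, treating large values as negative."""
    if v > PRIME // 2:
        v -= PRIME
    return v / SCALE


def share(value, n, rng):
    """Split value into n additive shares that sum to value mod PRIME."""
    shares = [rng.randrange(PRIME) for _ in range(n - 1)]
    shares.append((value - sum(shares)) % PRIME)
    return shares


def elect_committee(participant_ids, size, rng):
    """Phase 1 (assumed): pick a small committee uniformly at random."""
    return rng.sample(participant_ids, size)


def two_phase_aggregate(models, committee_size, seed=0):
    """Average the participants' weight vectors via a committee.

    models maps a participant id to a list of float weights.
    random.Random is used only for reproducibility of the sketch;
    a real deployment would need a cryptographically secure source.
    """
    rng = random.Random(seed)
    ids = sorted(models)
    committee = elect_committee(ids, committee_size, rng)
    dim = len(models[ids[0]])

    # Phase 2: every participant secret-shares each weight to the committee.
    # Each member only ever holds a running sum of random-looking shares.
    held = {c: [0] * dim for c in committee}
    for pid in ids:
        for j, w in enumerate(models[pid]):
            for c, s in zip(committee, share(encode(w), committee_size, rng)):
                held[c][j] = (held[c][j] + s) % PRIME

    # Committee members publish only their partial sums; combining them
    # reveals the sum of all models, never an individual model.
    totals = [sum(held[c][j] for c in committee) % PRIME for j in range(dim)]
    return [decode(t) / len(ids) for t in totals]
```

With n participants and a committee of size k, each participant transmits k shares per weight, so total traffic grows as n·k instead of the n² of a fully peer-to-peer exchange, which is the scalability argument the abstract makes.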