Hawk: Accurate and Fast Privacy-Preserving Machine Learning Using Secure Lookup Table Computation

Hamza Saleem, Amir Ziashahabi, Muhammad Naveed, Salman Avestimehr
2024-03-26
Abstract: Training machine learning models on data from multiple entities without direct data sharing can unlock applications otherwise hindered by business, legal, or ethical constraints. In this work, we design and implement new privacy-preserving machine learning protocols for logistic regression and neural network models. We adopt a two-server model where data owners secret-share their data between two servers that train and evaluate the model on the joint data. A significant source of inefficiency and inaccuracy in existing methods arises from using Yao's garbled circuits to compute non-linear activation functions. We propose new methods for computing non-linear functions based on secret-shared lookup tables, offering both computational efficiency and improved accuracy.
Cryptography and Security, Machine Learning
What problem does this paper attempt to address?
The core problem this paper addresses is how to train machine learning models on data from multiple entities without sharing that data directly, while improving both computational efficiency and model accuracy. Specifically, the paper focuses on the following points:

1. **Improving computational efficiency under privacy protection**: Existing privacy-preserving machine learning methods that rely on Yao's garbled circuits to compute non-linear activation functions are inefficient and inaccurate. The paper proposes computing non-linear functions with secret-shared lookup tables instead, which improves both computational efficiency and model accuracy.
2. **Introducing a relaxed security model**: To further optimize performance, the authors propose a slightly relaxed security model in which the servers may learn limited information about lookup-table access patterns. The leakage is constrained so that it does not compromise overall security and satisfies ε-d_X-privacy. This relaxation significantly reduces the computational resources required for training.
3. **Developing efficient PPML protocols**: Building on these methods, the paper designs and implements two new privacy-preserving machine learning protocols, Hawk Single and Hawk Multi, for training logistic regression and neural network models. These protocols compute activation functions and their derivatives accurately, and they support reusing lookup tables, which reduces the number of lookup tables that must be generated.

In summary, the paper aims to substantially improve the speed and accuracy of collaborative model training across multiple entities while preserving data privacy.
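To make the idea of a secret-shared lookup table concrete, here is a minimal Python sketch of a generic one-time masked-table lookup between two servers. It is an illustration under stated assumptions, not the paper's actual protocol: the offline dealer, the table size `N`, the output modulus `M`, the fixed-point `SCALE`, the input range of [-4, 4), and all function names are hypothetical choices made for this example.

```python
import math
import secrets

N = 16         # number of table entries; inputs are indices in [0, N) (assumed)
M = 2**32      # modulus for additively shared outputs (assumed)
SCALE = 2**16  # fixed-point scaling factor (assumed)


def share(value, modulus):
    """Split value into two additive shares modulo `modulus`."""
    s0 = secrets.randbelow(modulus)
    return s0, (value - s0) % modulus


def reconstruct(s0, s1, modulus):
    """Recombine two additive shares."""
    return (s0 + s1) % modulus


def sigmoid_fixed(i):
    """Fixed-point sigmoid for index i, mapped to a real input in [-4, 4)."""
    x = -4.0 + 8.0 * i / N
    return round(SCALE / (1.0 + math.exp(-x)))


def deal_table(f):
    """Offline phase, run by a hypothetical dealer.

    Picks a random mask r, rotates the table of f by r, and hands each
    server a share of r and a share of every rotated entry.
    """
    r = secrets.randbelow(N)
    r0, r1 = share(r, N)
    t0, t1 = [], []
    for j in range(N):
        a, b = share(f((j - r) % N), M)  # rotated entry j holds f(j - r)
        t0.append(a)
        t1.append(b)
    return (r0, t0), (r1, t1)


def lookup(x0, x1, server0, server1):
    """Online phase: the servers open d = x + r mod N and read entry d."""
    r0, t0 = server0
    r1, t1 = server1
    # Each server publishes x_i + r_i; only the masked index d is revealed.
    d = (x0 + r0 + x1 + r1) % N
    return t0[d], t1[d]  # additive shares of f(x)


if __name__ == "__main__":
    x = 11                                  # secret input index
    x0, x1 = share(x, N)                    # the data owner secret-shares x
    s0, s1 = deal_table(sigmoid_fixed)      # fresh one-time table
    y0, y1 = lookup(x0, x1, s0, s1)
    print(reconstruct(y0, y1, M) / SCALE)   # fixed-point approximation of sigmoid
```

Because the mask `r` is uniformly random, the opened index `d` reveals nothing about `x` when the table is used once. Using the same table for several inputs would expose the differences between those inputs, which is the kind of access-pattern leakage that a relaxed security model would need to bound.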