Differentially Private Linear Regression Analysis via Truncating Technique
Yifei Liu, Ning Wang, Zhigang Wang, Xiaodong Wang, Yun Gao, Xiaopeng Ji, Zhiqiang Wei, Jun Qiao
DOI: https://doi.org/10.1007/978-3-030-87571-8_22
2021-01-01
Abstract: This paper discusses how to fit a linear regression model accurately while guaranteeing epsilon-differential privacy. The parameters of a linear regression are sensitive to any single record in the database. As a result, a large amount of noise has to be added to the parameters to protect the records, which leads to inaccurate results. To improve the accuracy of the published results, existing works enforce epsilon-differential privacy by perturbing the coefficients of the objective function (loss function) of the optimization problem from which the regression parameters are derived, rather than adding noise to the parameters directly. However, the scale of the noise generated by this technique is proportional to the square of the dimensionality, so for high-dimensional data the noise becomes very large, i.e., the curse of dimensionality. To address this issue, this paper first determines a truncating length in a differentially private way, where the length limits the maximal influence of one record on the coefficients of the objective function. The noisy truncated coefficients are then published under this truncating-length limit. Finally, the parameters of the linear regression are derived from the objective function with the noisy coefficients. Experiments on real datasets validate the effectiveness of our proposals.
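The pipeline described in the abstract (truncate each record's influence on the objective-function coefficients, add noise to the truncated coefficients, then solve for the regression parameters) can be illustrated with a minimal sketch. It is not the authors' algorithm. Several choices are assumptions made only for illustration: the truncating length `tau` is passed in as a fixed, data-independent value (the paper selects it in a differentially private way, and the abstract does not say how), truncation is done by scaling each record's coefficient vector to L1 norm at most `tau`, Laplace noise is used, and the noisy quadratic term is symmetrized and eigenvalue-clipped with a hypothetical `ridge` floor so the problem stays solvable. The function names are also hypothetical.

```python
import numpy as np


def truncate_rows(C, tau):
    """Scale each row of C so that its L1 norm is at most tau (assumed truncation rule)."""
    norms = np.abs(C).sum(axis=1, keepdims=True)
    scale = np.minimum(1.0, tau / np.maximum(norms, 1e-12))
    return C * scale


def private_linear_regression(X, y, epsilon, tau, ridge=1e-3, rng=None):
    """Sketch of objective perturbation with truncated per-record coefficients."""
    rng = np.random.default_rng() if rng is None else rng
    n, d = X.shape

    # Per-record coefficients of (y_i - x_i^T w)^2:
    #   quadratic part x_i x_i^T, linear part -2 y_i x_i.
    # The constant y_i^2 does not affect the minimizer, so it is not released.
    quad = np.einsum("ni,nj->nij", X, X).reshape(n, d * d)
    lin = -2.0 * y[:, None] * X
    C = truncate_rows(np.hstack([quad, lin]), tau)

    # After truncation, replacing one record changes the coefficient sum
    # by at most 2 * tau in L1 norm, so Laplace noise of this scale suffices
    # for epsilon-differential privacy when tau is data-independent.
    noise_scale = 2.0 * tau / epsilon
    noisy = C.sum(axis=0) + rng.laplace(0.0, noise_scale, size=C.shape[1])

    A = noisy[: d * d].reshape(d, d)
    b = noisy[d * d:]

    # Post-processing (assumption): symmetrize A and floor its eigenvalues
    # so the noisy objective w^T A w + b^T w remains strictly convex.
    A = (A + A.T) / 2.0
    eigvals, eigvecs = np.linalg.eigh(A)
    A = (eigvecs * np.maximum(eigvals, ridge)) @ eigvecs.T

    # Minimizer of w^T A w + b^T w satisfies 2 A w + b = 0.
    return np.linalg.solve(2.0 * A, -b)
```

In this sketch the noise scale depends on `tau` rather than on the square of the dimensionality, which is the trade-off the abstract points to: a smaller `tau` means less noise but more distortion from truncation.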