Differential Private Stochastic Optimization with Heavy-tailed Data: Towards Optimal Rates

Puning Zhao, Jiafei Wu, Zhe Liu, Chong Wang, Rongfei Fan, Qingming Li
2024-08-19
Abstract: We study convex optimization problems under differential privacy (DP). With heavy-tailed gradients, existing works achieve suboptimal rates. The main obstacle is that existing gradient estimators have suboptimal tail properties, resulting in a superfluous factor of $d$ in the union bound. In this paper, we explore algorithms achieving optimal rates of DP optimization with heavy-tailed gradients. Our first method is a simple clipping approach. Under bounded $p$-th order moments of gradients, with $n$ samples, it achieves $\tilde{O}(\sqrt{d/n}+\sqrt{d}(\sqrt{d}/n\epsilon)^{1-1/p})$ population risk with $\epsilon\leq 1/\sqrt{d}$. We then propose an iterative updating method, which is more complex but achieves this rate for all $\epsilon\leq 1$. The results significantly improve over existing methods. Such improvement relies on a careful treatment of the tail behavior of gradient estimators. Our results match the minimax lower bound in \cite{kamath2022improved}, indicating that the theoretical limit of stochastic convex optimization under DP is achievable.
Machine Learning, Cryptography and Security, Data Structures and Algorithms
### What problem does this paper attempt to solve?

This paper studies stochastic convex optimization under differential privacy (DP) when the gradients are heavy-tailed. Specifically:

1. **Problems with existing methods**:
   - Existing methods achieve suboptimal convergence rates when gradients are heavy-tailed.
   - The main obstacle is that existing gradient estimators have suboptimal tail properties, which introduces a superfluous factor of $d$ in the union bound.
2. **Research objectives**:
   - Explore algorithms that achieve optimal rates for DP optimization with heavy-tailed gradients.
   - Propose two methods: a simple clipping method and an iterative updating method.
   - Show that these methods significantly improve over existing ones and match the minimax lower bound, indicating that the theoretical limit is achievable.
3. **Specific contributions**:
   - Simple clipping method: satisfies DP by clipping gradients and adding noise. Under bounded $p$-th order moments of gradients, it attains the optimal rate for small privacy budgets ($\epsilon \leq 1/\sqrt{d}$).
   - Iterative updating method: divides the data into multiple groups, computes an estimate from each group, and iteratively updates the result. It is more complex but attains the same rate for all $\epsilon \leq 1$.

Through these methods, the paper shows that optimal DP stochastic optimization is achievable with heavy-tailed data.
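The clip-and-add-noise idea behind the simple clipping method can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's algorithm: the function names `clip_rows` and `private_mean_gradient`, the clipping radius `radius`, the parameter `delta`, and the use of the classical Gaussian mechanism for a single release are all choices made here for illustration. The summary above does not specify the noise distribution, the clipping threshold, or how privacy is composed across iterations.

```python
import numpy as np


def clip_rows(grads, radius):
    """Scale each row of `grads` so that its L2 norm is at most `radius`."""
    norms = np.linalg.norm(grads, axis=1, keepdims=True)
    # Rows already inside the ball get scale 1 and are left unchanged.
    scale = np.minimum(1.0, radius / np.maximum(norms, 1e-12))
    return grads * scale


def private_mean_gradient(grads, radius, epsilon, delta, rng):
    """Mean of clipped per-sample gradients plus Gaussian noise.

    Assumption: one release under the classical Gaussian mechanism,
    which requires 0 < epsilon < 1. Composition over many optimization
    steps is NOT handled here.
    """
    n, d = grads.shape
    clipped = clip_rows(grads, radius)
    # Replacing one sample moves the clipped mean by at most 2 * radius / n.
    sensitivity = 2.0 * radius / n
    sigma = sensitivity * np.sqrt(2.0 * np.log(1.25 / delta)) / epsilon
    return clipped.mean(axis=0) + rng.normal(0.0, sigma, size=d)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Heavy-tailed toy "gradients": Student-t with 3 degrees of freedom.
    grads = rng.standard_t(df=3, size=(1000, 5))
    estimate = private_mean_gradient(grads, radius=5.0, epsilon=0.5,
                                     delta=1e-5, rng=rng)
    print(estimate)
```

Clipping bounds how much any single sample can move the average, which is what lets the noise scale shrink as $1/n$. The cost is bias: with heavy-tailed gradients a small `radius` discards more of the tail, so the radius trades bias against noise.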