Abstract: We study the private empirical risk minimization (ERM) problem for losses satisfying the $(\gamma,\kappa)$-Kurdyka-Łojasiewicz (KL) condition. The Polyak-Łojasiewicz (PL) condition is the special case $\kappa=2$. Specifically, we study this problem under the constraint of $\rho$ zero-concentrated differential privacy (zCDP). When $\kappa\in[1,2]$ and the loss function is Lipschitz and smooth over a sufficiently large region, we provide a new algorithm based on variance-reduced gradient descent that achieves the rate $\tilde{O}\big(\big(\frac{\sqrt{d}}{n\sqrt{\rho}}\big)^\kappa\big)$ on the excess empirical risk, where $n$ is the dataset size and $d$ is the dimension. We further show that this rate is nearly optimal. When $\kappa \geq 2$ and the loss is instead Lipschitz and weakly convex, we show it is possible to achieve the rate $\tilde{O}\big(\big(\frac{\sqrt{d}}{n\sqrt{\rho}}\big)^\kappa\big)$ with a private implementation of the proximal point method. When the KL parameters are unknown, we provide a novel modification and analysis of the noisy gradient descent algorithm and show that this algorithm achieves a rate of $\tilde{O}\big(\big(\frac{\sqrt{d}}{n\sqrt{\rho}}\big)^{\frac{2\kappa}{4-\kappa}}\big)$ adaptively, which is nearly optimal when $\kappa = 2$. We further show that, without assuming the KL condition, the same gradient descent algorithm can achieve fast convergence to a stationary point when the gradient stays sufficiently large during the run of the algorithm. Specifically, we show that this algorithm can approximate stationary points of Lipschitz, smooth (and possibly nonconvex) objectives at a rate as fast as $\tilde{O}\big(\frac{\sqrt{d}}{n\sqrt{\rho}}\big)$ and never worse than $\tilde{O}\big(\big(\frac{\sqrt{d}}{n\sqrt{\rho}}\big)^{1/2}\big)$. The latter rate matches the best known rate for methods that do not rely on variance reduction.
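
For orientation, the baseline that the abstract builds on can be sketched as follows. This is a minimal illustration of textbook noisy gradient descent under $\rho$-zCDP, not the modified algorithm analyzed in the paper. It assumes per-example gradients are clipped to norm `lipschitz`, replace-one neighboring datasets (so the averaged gradient has $\ell_2$ sensitivity $2L/n$), and an even split of the privacy budget across steps, using the standard fact that Gaussian noise of standard deviation $\sigma$ on a query with sensitivity $\Delta$ satisfies $\frac{\Delta^2}{2\sigma^2}$-zCDP. The function names, the quadratic loss (which satisfies the PL condition, i.e. $\kappa = 2$), and all parameter values are hypothetical choices made for the example.

```python
import math
import random


def zcdp_noise_std(lipschitz, n, steps, rho):
    """Per-coordinate Gaussian std so that `steps` releases of an averaged,
    clipped gradient jointly satisfy rho-zCDP (each step gets rho / steps)."""
    sensitivity = 2.0 * lipschitz / n  # replace-one L2 sensitivity (assumption)
    return sensitivity * math.sqrt(steps / (2.0 * rho))


def clip(vec, bound):
    """Rescale vec so that its Euclidean norm is at most bound."""
    norm = math.sqrt(sum(v * v for v in vec))
    scale = min(1.0, bound / norm) if norm > 0 else 1.0
    return [v * scale for v in vec]


def noisy_gd(data, w0, lipschitz, rho, steps, lr, rng):
    """Noisy gradient descent on the loss f(w; x) = 0.5 * ||w - x||^2."""
    n, d = len(data), len(w0)
    sigma = zcdp_noise_std(lipschitz, n, steps, rho)
    w = list(w0)
    for _ in range(steps):
        avg = [0.0] * d
        for x in data:
            # Per-example gradient is w - x; clipping enforces the norm bound.
            g = clip([wi - xi for wi, xi in zip(w, x)], lipschitz)
            avg = [a + gj / n for a, gj in zip(avg, g)]
        # Gradient step with fresh Gaussian noise on every coordinate.
        w = [wi - lr * (a + rng.gauss(0.0, sigma)) for wi, a in zip(w, avg)]
    return w


rng = random.Random(0)
data = [[0.5 + 0.1 * rng.uniform(-1, 1), -0.5 + 0.1 * rng.uniform(-1, 1)]
        for _ in range(2000)]
w = noisy_gd(data, [0.0, 0.0], lipschitz=5.0, rho=1.0, steps=50, lr=0.5, rng=rng)
print(w)  # a noisy estimate of the minimizer, which here is the data mean
```

With these settings the noise standard deviation scales as $\frac{L\sqrt{T}}{n\sqrt{\rho}}$ for $T$ steps, which is where the $\frac{\sqrt{d}}{n\sqrt{\rho}}$ factor in the stated rates originates once the noise norm is taken over $d$ coordinates.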

Asymptotically optimal private estimation under mean square loss

Instance-Optimal Differentially Private Estimation

Mean Estimation Under Heterogeneous Privacy Demands

Instance-optimal Mean Estimation Under Differential Privacy

Mean estimation in the add-remove model of differential privacy

Private Mean Estimation with Person-Level Differential Privacy

Geometrizing rates of convergence under local differential privacy constraints

Minimax Rates of Estimating Approximate Differential Privacy

Subset-Based Instance Optimality in Private Estimation

Pointwise adaptive kernel density estimation under local approximate differential privacy

Insufficient Statistics Perturbation: Stable Estimators for Private Least Squares

Mean Estimation Under Heterogeneous Privacy: Some Privacy Can Be Free

Differentially Private Non-Convex Optimization under the KL Condition with Optimal Rates

Better and Simpler Lower Bounds for Differentially Private Statistical Estimation

The Cost of Privacy: Optimal Rates of Convergence for Parameter Estimation with Differential Privacy

Adaptive pointwise density estimation under local differential privacy

Dimension-free Private Mean Estimation for Anisotropic Distributions

On Differentially Private U Statistics

On the Statistical Complexity of Estimation and Testing under Privacy Constraints

PLAN: Variance-Aware Private Mean Estimation

Efficiency in local differential privacy