Loopless Semi-Stochastic Gradient Descent with Less Hard Thresholding for Sparse Learning

Xiangyang Liu, Bingkun Wei, Fanhua Shang, Hongying Liu
DOI: https://doi.org/10.1145/3357384.3358021
2019-01-01
Abstract: Stochastic gradient hard thresholding methods have recently been shown to work favorably for solving large-scale empirical risk minimization problems under sparsity constraints. Many stochastic hard thresholding methods (e.g., SVRG-HT) compute a full gradient at a constant frequency and perform a hard thresholding operation at every iteration, which leads to high computational complexity, especially for high-dimensional and sparse problems. To be more efficient on large-scale datasets, we propose a single-layer semi-stochastic gradient hard thresholding (LSSG-HT) method. The proposed algorithm updates the full gradient with a given probability p and reduces the number of hard thresholding operations by setting a thresholding frequency m, which in theory lowers the hard thresholding complexity to O(κ_s/m log(1/ε)), compared with O(κ_s log(1/ε)) for SVRG-HT. We prove that our algorithm converges to an optimal solution at a linear rate. Furthermore, we present an asynchronous parallel variant of LSSG-HT. Numerical experiments demonstrate the efficiency of our algorithms in comparison with state-of-the-art algorithms.
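The two mechanisms named in the abstract, a full-gradient refresh with probability p and a hard thresholding step only every m iterations, can be illustrated with a rough sketch. This is not the authors' algorithm as published: the least-squares loss, the SVRG-style variance-reduced gradient, the order of the steps, and every name below (`hard_threshold`, `lssg_ht_sketch`, `step`, `iters`, `seed`) are assumptions made for illustration only.

```python
import numpy as np


def hard_threshold(x, s):
    """Keep the s largest-magnitude entries of x and set the rest to zero."""
    out = np.zeros_like(x)
    if s > 0:
        keep = np.argpartition(np.abs(x), -s)[-s:]
        out[keep] = x[keep]
    return out


def lssg_ht_sketch(A, b, s, step, p, m, iters, seed=0):
    """Illustrative loopless variance-reduced loop with infrequent thresholding.

    Assumed objective: (1 / 2n) * ||A x - b||^2 subject to ||x||_0 <= s.
    """
    rng = np.random.default_rng(seed)
    n, d = A.shape
    x = np.zeros(d)
    snapshot = x.copy()
    full_grad = A.T @ (A @ snapshot - b) / n

    for t in range(1, iters + 1):
        # Variance-reduced stochastic gradient built from one random sample.
        i = rng.integers(n)
        a_i = A[i]
        g = a_i * (a_i @ x - b[i]) - a_i * (a_i @ snapshot - b[i]) + full_grad
        x = x - step * g

        # Hard thresholding only every m iterations, not at each iteration.
        if t % m == 0:
            x = hard_threshold(x, s)

        # Single loop: refresh the snapshot and full gradient with probability p.
        if rng.random() < p:
            snapshot = x.copy()
            full_grad = A.T @ (A @ snapshot - b) / n

    # Final projection so the returned point satisfies the sparsity constraint.
    return hard_threshold(x, s)
```

In this sketch, setting m = 1 thresholds at every iteration, which is the per-iteration behavior the abstract attributes to methods such as SVRG-HT; a larger m performs fewer thresholding operations over the same number of iterations.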