Local Convergence Properties of SAGA/Prox-SVRG and Acceleration

Clarice Poon, Jingwei Liang, Carola-Bibiane Schönlieb
DOI: https://doi.org/10.48550/arxiv.1802.02554
2018-01-01
Abstract: Over the past ten years, driven by large-scale optimisation problems arising from machine learning, the development of stochastic optimisation methods has witnessed tremendous growth. However, despite their popularity, the theoretical understanding of these methods is quite limited in contrast to that of deterministic optimisation methods. In this paper, we present a local convergence analysis for a typical class of stochastic optimisation methods, proximal variance-reduced stochastic gradient methods, focusing mainly on the SAGA [12] and Prox-SVRG [43] algorithms. Under the assumption that the non-smooth component of the optimisation problem is partly smooth relative to a smooth manifold, we present a unified framework for the local convergence analysis of the SAGA/Prox-SVRG algorithms: (i) the sequences generated by SAGA/Prox-SVRG identify the smooth manifold in a finite number of iterations; (ii) the sequences then enter a local linear convergence regime. Beyond local convergence analysis, we also discuss various possibilities for accelerating these algorithms, including adapting to better local parameters and applying higher-order deterministic/stochastic optimisation methods that can achieve super-linear convergence. Concrete examples arising from machine learning are considered to verify the obtained results.
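To make the two-phase behaviour described in the abstract concrete, the following is a minimal sketch of proximal SAGA applied to a small synthetic lasso problem. It is an illustration only, not the authors' code or experiments. The assumptions are: a least-squares data term with an ℓ1 regulariser (the ℓ1 norm is partly smooth relative to the subspace of vectors sharing a fixed support), the commonly used SAGA step size 1/(3L), and hypothetical names such as `prox_saga_lasso`, `soft_threshold`, and `objective`. The recorded support size of the iterates is what one would inspect to observe manifold identification.

```python
import numpy as np


def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1 (componentwise soft-thresholding)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)


def objective(A, b, lam, x):
    """(1/m) * sum_i 0.5 * (a_i^T x - b_i)^2 + lam * ||x||_1."""
    m = A.shape[0]
    return 0.5 * np.sum((A @ x - b) ** 2) / m + lam * np.sum(np.abs(x))


def prox_saga_lasso(A, b, lam, n_iters, seed=0):
    """Proximal SAGA sketch for the lasso objective above.

    Returns the final iterate and the support size after every iteration.
    """
    rng = np.random.default_rng(seed)
    m, n = A.shape
    # Each f_i(x) = 0.5 * (a_i^T x - b_i)^2 has an ||a_i||^2-Lipschitz gradient.
    L = np.max(np.sum(A ** 2, axis=1))
    gamma = 1.0 / (3.0 * L)  # assumed step size (a common choice for SAGA)
    x = np.zeros(n)
    # grad f_i(x) = a_i * (a_i^T x - b_i), so storing the scalar residual
    # per sample is enough to represent the gradient table.
    resid = A @ x - b
    g_avg = A.T @ resid / m  # average of the stored gradients
    supports = []
    for _ in range(n_iters):
        i = rng.integers(m)
        r_new = A[i] @ x - b[i]
        delta = (r_new - resid[i]) * A[i]  # new gradient minus stored gradient
        v = delta + g_avg                  # variance-reduced gradient estimate
        g_avg += delta / m                 # keep the running average in sync
        resid[i] = r_new
        x = soft_threshold(x - gamma * v, gamma * lam)
        supports.append(int(np.count_nonzero(x)))
    return x, supports


# Synthetic data (purely illustrative): sparse ground truth with 5 nonzeros.
data_rng = np.random.default_rng(1)
A = data_rng.standard_normal((200, 50))
x_true = np.zeros(50)
x_true[:5] = 1.0
b = A @ x_true + 0.01 * data_rng.standard_normal(200)
lam = 0.1

x_hat, supports = prox_saga_lasso(A, b, lam, n_iters=20000)
```

Plotting `supports` against the iteration count, together with the distance of the iterates to a high-accuracy solution on a log scale, is one way to look for the behaviour the abstract describes: a support that stops changing after finitely many iterations, followed by a linear decay.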