Stochastically Controlled Compositional Gradient for Composition Problems
Liu Liu, Ji Liu, Cho-Jui Hsieh, Dacheng Tao
DOI: https://doi.org/10.1109/tnnls.2021.3098222
IF: 14.255
2021-01-01
IEEE Transactions on Neural Networks and Learning Systems
Abstract: We consider composition problems of the form $\frac{1}{n}\sum_{i=1}^{n} F_i\left(\frac{1}{m}\sum_{j=1}^{m} G_j(x)\right)$, which are important for machine learning. Although gradient descent and stochastic gradient descent are straightforward solutions, the essential computation of $G(x) = \frac{1}{m}\sum_{j=1}^{m} G_j(x)$ in every single iteration is expensive, let alone for large $m$. In this article, we devise a stochastically controlled compositional gradient algorithm. Specifically, we introduce two variants of the stochastically controlled technique to estimate the inner function $G(x)$ and the gradient of the objective function, respectively. The computational cost is largely reduced. However, the need for two stochastic subsets $D_1$ and $D_2$ poses a direct barrier to guaranteeing the convergence of the algorithm, and in particular to its theoretical proof. To this end, we present a general convergence analysis that establishes $|D_1| = \min\{1/\epsilon, m\}$ and $|D_2| = \min\{1/\epsilon, n\}$, through which the proposed method significantly improves on composition algorithms under low target accuracy (i.e., $1/\epsilon \ll m$ or $n$) in both strongly convex and nonconvex settings. Comprehensive experiments demonstrate the superiority of the proposed method over existing methods.
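The role of the two subsets in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: it shows only a plain subsampled estimate of the inner function and of the chain-rule gradient, without the stochastic control (variance reduction) the paper adds. The linear inner functions G_j(x) = A_j x, the quadratic outer functions F_i(y) = 0.5 * ||y - b_i||^2, and all function names are assumptions chosen for illustration.

```python
import numpy as np

def subset_size(eps, total):
    # |D| = min{1/eps, total}, rounded up to an integer (rounding is an assumption)
    return int(min(np.ceil(1.0 / eps), total))

def full_gradient(x, A, b):
    # Exact gradient of (1/n) sum_i F_i(G(x)) for the toy problem:
    # G(x) = A_bar @ x and F_i(y) = 0.5 * ||y - b_i||^2
    A_bar = A.mean(axis=0)                      # Jacobian of G, shape (p, d)
    return A_bar.T @ (A_bar @ x - b.mean(axis=0))

def subsampled_gradient(x, A, b, eps, rng):
    # Estimate G(x) and its Jacobian on D1, and the outer gradient on D2
    m, n = A.shape[0], b.shape[0]
    d1 = rng.choice(m, size=subset_size(eps, m), replace=False)
    d2 = rng.choice(n, size=subset_size(eps, n), replace=False)
    J_hat = A[d1].mean(axis=0)                  # estimated Jacobian of G
    G_hat = J_hat @ x                           # estimated inner function value
    outer_grad = (G_hat - b[d2]).mean(axis=0)   # mean of grad F_i at G_hat
    return J_hat.T @ outer_grad

rng = np.random.default_rng(0)
A = rng.normal(size=(50, 3, 4))   # m = 50 inner functions, p = 3, d = 4
b = rng.normal(size=(40, 3))      # n = 40 outer functions
x = rng.normal(size=4)

cheap = subsampled_gradient(x, A, b, eps=0.25, rng=rng)   # |D1| = |D2| = 4
exact = subsampled_gradient(x, A, b, eps=1e-6, rng=rng)   # subsets cover all terms
```

With a loose target accuracy (large ϵ) each estimate touches only about 1/ϵ of the component functions. Once 1/ϵ exceeds m and n, the subsets cover every term and the estimate coincides with the full gradient.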
computer science, artificial intelligence, theory & methods,engineering, electrical & electronic, hardware & architecture