Abstract: In this dissertation we study statistical and online learning problems from an optimization viewpoint. The dissertation is divided into two parts. I. We first consider the question of learnability for statistical learning problems in the general learning setting. The question of learnability is well studied and fully characterized for binary classification and for real-valued supervised learning problems using the theory of uniform convergence. However, we show that for the general learning setting, uniform convergence theory fails to characterize learnability. To fill this void, we use stability of learning algorithms to fully characterize statistical learnability in the general setting. Next we consider the problem of online learning. Unlike the statistical learning framework, there is a dearth of generic tools that can be used to establish learnability and rates for online learning problems in general. We provide online analogs to classical tools from statistical learning theory, such as Rademacher complexity and covering numbers. We further use these tools to fully characterize learnability for online supervised learning problems. II. In the second part, for general classes of convex learning problems, we provide appropriate mirror descent (MD) updates for online and statistical learning of these problems. Further, we show that MD is near optimal for online convex learning and, in most cases, is also near optimal for statistical convex learning. We next consider the problem of convex optimization and show that oracle complexity can be lower bounded by the so-called fat-shattering dimension of the associated linear class. Thus we establish a strong connection between offline convex optimization problems and statistical learning problems. We also show that for a large class of high-dimensional optimization problems, MD is in fact near optimal even for convex optimization.
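
For readers unfamiliar with mirror descent, the following is a minimal illustrative sketch, not the updates derived in the dissertation. It assumes one textbook instance: the negative-entropy mirror map on the probability simplex, which yields the multiplicative (exponentiated gradient) update. The function name `entropic_mirror_descent`, the cost vector `c`, the step size, and the iteration count are all hypothetical choices made for illustration.

```python
import math


def entropic_mirror_descent(grad, x0, step, n_steps):
    """Mirror descent on the probability simplex with the negative-entropy
    mirror map (an assumed, standard choice). Returns the averaged iterate."""
    x = list(x0)
    avg = [0.0] * len(x)
    for _ in range(n_steps):
        g = grad(x)
        # Gradient step in the dual (log) space: multiply by exp(-step * g_i).
        w = [xi * math.exp(-step * gi) for xi, gi in zip(x, g)]
        # Bregman projection back onto the simplex is a renormalization.
        total = sum(w)
        x = [wi / total for wi in w]
        # Running average of the iterates produced so far.
        avg = [a + xi / n_steps for a, xi in zip(avg, x)]
    return avg


# Hypothetical linear objective f(x) = <c, x>; its gradient is the constant c.
c = [0.9, 0.1, 0.5]
x_avg = entropic_mirror_descent(lambda x: c, [1 / 3] * 3, step=0.5, n_steps=200)
print(x_avg)
```

On this toy objective the averaged iterate stays on the simplex and concentrates on the coordinate with the smallest cost. Other mirror maps (for example, a squared Euclidean norm, which recovers projected gradient descent) would change only the update and projection lines.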

Geometry, Computation, and Optimality in Stochastic Optimization

Asymptotic Optimality in Stochastic Optimization

Local Asymptotics for some Stochastic Optimization Problems: Optimality, Constraint Identification, and Dual Averaging

Methods for Optimization Problems with Markovian Stochasticity and Non-Euclidean Geometry

Beyond Convexity: Stochastic Quasi-Convex Optimization

Stochastic Successive Convex Approximation for Non-Convex Constrained Stochastic Optimization

Simple and Optimal Stochastic Gradient Methods for Nonsmooth Nonconvex Optimization

The Impact of Local Geometry and Batch Size on Stochastic Gradient Descent for Nonconvex Problems

Learning From An Optimization Viewpoint

Stochastic Methods for Composite Optimization Problems

On nonconvex optimization for machine learning: Gradients, stochasticity, and saddle points

Stochastic Methods for Composite and Weakly Convex Optimization Problems

A Linearly Convergent Conditional Gradient Algorithm with Applications to Online and Stochastic Optimization

Adaptive Stochastic Optimisation of Nonconvex Composite Objectives

Accelerated stochastic approximation with state-dependent noise

Stochastic Optimization with Decision-Dependent Distributions

Stochastic gradient descent for non-smooth optimization: Convergence results and optimal averaging schemes

Online Statistical Inference for Gradient-free Stochastic Optimization

Beyond Minimax Optimality: A Subgame Perfect Gradient Method

Optimization Algorithms for Faster Computational Geometry

Directional Smoothness and Gradient Methods: Convergence and Adaptivity