Abstract: For many data-intensive real-world applications, such as recognizing objects in images, detecting spam emails, and recommending items on retail websites, the most successful current approaches involve learning rich prediction rules from large datasets. These machine learning tasks pose many challenges. For example, as the size of the datasets and the complexity of the prediction rules increase, it becomes significantly harder to design scalable methods that can effectively exploit the availability of distributed computing units. As another example, many machine learning applications are subject to data corruption, communication errors, and even adversarial attacks during training and testing. Therefore, to build reliable machine learning models, we also have to tackle the challenge of robustness in machine learning. In this dissertation, we study several topics on scalability and robustness in large-scale learning, with a focus on establishing solid theoretical foundations for these problems, and demonstrate recent progress towards the ambitious goal of building more scalable and robust machine learning models. We start with the speedup saturation problem in distributed stochastic gradient descent (SGD) algorithms with large mini-batches. We introduce the notion of gradient diversity, a metric of the dissimilarity between concurrent gradient updates, and show its key role in the convergence and generalization performance of mini-batch SGD. We then move on to Byzantine distributed learning, a topic that involves both scalability and robustness in distributed learning. In the Byzantine setting that we consider, a fraction of …

Training Efficiency and Robustness in Deep Learning

An Efficient Optimization Technique for Training Deep Neural Networks

Making Robust Generalizers Less Rigid with Soft Ascent-Descent

Bad Students Make Great Teachers: Active Learning Accelerates Large-Scale Visual Understanding

Towards More Scalable and Robust Machine Learning

Optimism in the Face of Adversity: Understanding and Improving Deep Learning Through Adversarial Robustness

A Fast Saddle-Point Dynamical System Approach to Robust Deep Learning

Improving the Robustness of Deep Neural Networks via Stability Training

Enhancing 3D Robotic Vision Robustness by Minimizing Adversarial Mutual Information through a Curriculum Training Approach

On the Robustness of Decision-Focused Learning

Failures Are Fated, But Can Be Faded: Characterizing and Mitigating Unwanted Behaviors in Large-Scale Vision and Language Models

Using learned optimizers to make models robust to input noise

$\ell_\infty$-Robustness and Beyond: Unleashing Efficient Adversarial Training

The Effectiveness of Random Forgetting for Robust Generalization

Opportunities and Challenges in Deep Learning Adversarial Robustness: A Survey

Towards Robust Out-of-Distribution Generalization: Data Augmentation and Neural Architecture Search Approaches

An Integrated Approach to Produce Robust Models with High Efficiency

Optimization for deep learning: theory and algorithms

Towards Well-trained Model Robustness in Federated Learning: an Adversarial-Example-Generation-Efficiency Perspective

Benchmarking Robust Self-Supervised Learning Across Diverse Downstream Tasks

Negative Feedback Training: A Novel Concept to Improve Robustness of NVCIM DNN Accelerators