Abstract: We investigate the in-distribution generalization of machine learning algorithms. We depart from traditional complexity-based approaches by analyzing information-theoretic bounds that quantify the dependence between a learning algorithm and the training data. We consider two categories of generalization guarantees: 1) Guarantees in expectation: These bounds measure performance in the average case. Here, the dependence between the algorithm and the data is often captured by information measures. While these measures offer an intuitive interpretation, they overlook the geometry of the algorithm's hypothesis class. We therefore introduce bounds based on the Wasserstein distance, which incorporate this geometry, together with a systematic method for deriving bounds that capture the dependence between the algorithm and an individual datum, and between the algorithm and subsets of the training data. 2) PAC-Bayesian guarantees: These bounds characterize performance with high probability. Here, the dependence between the algorithm and the data is often measured by the relative entropy. We establish connections between the Seeger–Langford and Catoni bounds, revealing that the former is optimized by the Gibbs posterior. We introduce novel, tighter bounds for various types of loss functions, which we obtain through a new technique for optimizing parameters in probabilistic statements. To study the limitations of these approaches, we present a counterexample in which most information-theoretic bounds fail while traditional approaches do not. Finally, we explore the relationship between privacy and generalization. We show that algorithms with bounded maximal leakage generalize. For discrete data, we derive new bounds for differentially private algorithms that guarantee generalization even with a constant privacy parameter, in contrast to previous bounds in the literature.

Generalization in multi-objective machine learning

Pareto-Based Multiobjective Machine Learning: an Overview and Case Studies

Pareto-Based Multiobjective Machine Learning

Towards Generalization Beyond Pointwise Learning: A Unified Information-theoretic Perspective

Modeling Generalization in Machine Learning: A Methodological and Computational Study

Generalization Analysis for Game-Theoretic Machine Learning

Generalization Improvement in Multi-Objective Learning

Multi-Task Learning as Multi-Objective Optimization

Rethinking Multi-domain Generalization with A General Learning Objective

Generalization Bounds For Meta-Learning: An Information-Theoretic Analysis

Multi-objective Deep Learning: Taxonomy and Survey of the State of the Art

Generalization Bounds of Surrogate Policies for Combinatorial Optimization Problems

Generalization Error of Generalized Linear Models in High Dimensions

Multi-objective Ensemble Construction, Learning and Evolution

An Information-Theoretic Approach to Generalization Theory

Evolutionary Multiobjective Optimization For Multilabel Learning

Fine-grained Generalization Analysis of Vector-valued Learning

Optimizing fairness tradeoffs in machine learning with multiobjective meta-models

On Generalization Error Bounds of Noisy Gradient Methods for Non-Convex Learning

Pareto Analysis of Evolutionary and Learning Systems