Abstract: We study the power of query access for the task of agnostic learning under the Gaussian distribution. In the agnostic model, no assumptions are made on the labels and the goal is to compute a hypothesis that is competitive with the {\em best-fit} function in a known class, i.e., it achieves error $\mathrm{opt}+\epsilon$, where $\mathrm{opt}$ is the error of the best function in the class. We focus on a general family of Multi-Index Models (MIMs), which are $d$-variate functions that depend only on a few relevant directions, i.e., have the form $g(\mathbf{W} \mathbf{x})$ for an unknown link function $g$ and a $k \times d$ matrix $\mathbf{W}$. Multi-index models cover a wide range of commonly studied function classes, including constant-depth neural networks with ReLU activations and intersections of halfspaces. Our main result shows that query access gives significant runtime improvements over random examples for agnostically learning MIMs. Under standard regularity assumptions for the link function (namely, bounded variation or surface area), we give an agnostic query learner for MIMs with complexity $O(k)^{\mathrm{poly}(1/\epsilon)} \; \mathrm{poly}(d)$. In contrast, algorithms that rely only on random examples inherently require $d^{\mathrm{poly}(1/\epsilon)}$ samples and runtime, even for the basic problem of agnostically learning a single ReLU or a halfspace. Our algorithmic result establishes a strong computational separation between the agnostic PAC and the agnostic PAC+Query models under the Gaussian distribution. Prior to our work, no such separation was known -- even for the special case of agnostically learning a single halfspace, for which it was an open problem first posed by Feldman. Our results are enabled by a general dimension-reduction technique that leverages query access to estimate gradients of (a smoothed version of) the underlying label function.
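To make the last sentence of the abstract concrete, the following is a minimal sketch of gradient-based dimension reduction with label queries. It is not the paper's algorithm. It assumes the standard Gaussian-smoothing identity $\nabla f_\sigma(\mathbf{x}) = \mathbb{E}_{\mathbf{z}}[f(\mathbf{x} + \sigma \mathbf{z})\,\mathbf{z}]/\sigma$ for $f_\sigma(\mathbf{x}) = \mathbb{E}_{\mathbf{z}}[f(\mathbf{x} + \sigma \mathbf{z})]$ with $\mathbf{z} \sim \mathcal{N}(0, \mathbf{I})$, and it recovers candidate directions as the top eigenvectors of the averaged gradient outer products. The function names, the smoothing width `sigma`, and the sample sizes are hypothetical choices made for illustration only.

```python
import numpy as np


def smoothed_gradient(query, x, sigma, n_queries, rng):
    """Monte Carlo estimate of the gradient at x of the Gaussian-smoothed
    label function f_sigma(x) = E_z[query(x + sigma * z)], z ~ N(0, I).

    Uses grad f_sigma(x) = E_z[query(x + sigma * z) * z] / sigma.
    `query` is an assumed oracle mapping an (n, d) array of points to n labels.
    """
    z = rng.standard_normal((n_queries, x.shape[0]))
    labels = query(x + sigma * z)  # one label query per perturbed point
    return labels @ z / (n_queries * sigma)


def estimate_subspace(query, d, k, sigma=0.5, n_points=200,
                      n_queries=2000, seed=0):
    """Return a (d, k) matrix whose columns span the estimated relevant subspace.

    Averages outer products of smoothed-gradient estimates at Gaussian
    points and keeps the k eigenvectors with the largest eigenvalues.
    All default parameter values are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    M = np.zeros((d, d))
    for _ in range(n_points):
        x = rng.standard_normal(d)
        g = smoothed_gradient(query, x, sigma, n_queries, rng)
        M += np.outer(g, g)
    M /= n_points
    # eigh returns eigenvalues in ascending order, so take the last k columns.
    _, eigvecs = np.linalg.eigh(M)
    return eigvecs[:, -k:]
```

For a label function of the form $g(\mathbf{W}\mathbf{x})$, the gradients of the smoothed function lie in the row space of $\mathbf{W}$, so the returned columns should approximately span that $k$-dimensional subspace when the labels are close to a MIM. For example, with `query = lambda X: np.sign(X @ w)` for a unit vector `w`, calling `estimate_subspace(query, d, 1)` should return a direction closely aligned with `w` up to sign.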

Omnipredicting Single-Index Models with Multi-Index Models

Agnostically Learning Single-Index Models using Omnipredictors

Adversarial Prediction Games for Multivariate Losses

Sample and Computationally Efficient Robust Learning of Gaussian Single-Index Models

Robustly Learning Single-Index Models via Alignment Sharpness

Observable adjustments in single-index models for regularized M-estimators

Agnostic Active Learning of Single Index Models with Linear Sample Complexity

Omnipredictors for Regression and the Approximate Rank of Convex Functions

Surrogate Aided Unsupervised Recovery of Sparse Signals in Single Index Models for Binary Outcomes

From Fairness to Infinity: Outcome-Indistinguishable (Omni)Prediction in Evolving Graphs

Indexing Cost Sensitive Prediction

Predictive Multiplicity in Probabilistic Classification

Agnostically Learning Multi-index Models with Queries

Simultaneous semiparametric inference for single-index models

Siamese Survival Analysis with Competing Risks

Nonlinear generalization of the monotone single index model

A Consistent and Scalable Algorithm for Best Subset Selection in Single Index Models

Online Algorithms with Multiple Predictions

Mixing predictions for online metric algorithms

Symmetric Single Index Learning

Leading strategies in competitive on-line prediction