Abstract: Learning from large-scale, high-dimensional data remains computationally challenging, although it has received increasing interest recently. To address this issue, randomized reduction methods have been developed that reduce either the dimensionality or the number of training instances to obtain a small sketch of the original data. In this paper, we focus on recovering a high-dimensional classification/regression model from randomly sketched data. We propose to exploit the intrinsic sparsity of optimal solutions and develop novel methods that increase the regularization parameter of the sparse regularizer. In particular, (i) for high-dimensional classification problems, we leverage randomized reduction methods to reduce the dimensionality of the data and solve a dual formulation on the randomly sketched data with a sparse regularizer introduced on the dual solution; (ii) for high-dimensional sparse least-squares regression problems, we employ randomized reduction methods to reduce the number of training instances and solve a formulation on the randomly sketched data with an increased regularization parameter on the sparse regularizer. For both classes of problems, by exploiting the intrinsic sparsity of the optimal dual solution or the optimal primal solution, we provide a formal theoretical guarantee on the recovery error of the learned models relative to the optimal models learned from the original data.
Compared with previous studies on randomized reduction for machine learning, the present work enjoys several advantages: (i) the proposed formulations admit intuitive geometric explanations; (ii) the theoretical guarantee does not rely on any stringent assumptions about the original data (e.g., low-rankness of the data matrix or linear separability of the data); (iii) the theory covers both smooth and non-smooth loss functions for classification; (iv) the analysis applies to a broad class of randomized reduction methods, as long as the reduction matrices satisfy a Johnson–Lindenstrauss-type lemma. We also present empirical studies to support the proposed methods and the presented theory.
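
As an illustration of item (ii) above, the following is a minimal sketch, not the authors' implementation. It assumes a Gaussian sketching matrix `A` (one of many reductions satisfying a Johnson–Lindenstrauss-type property), an objective of the form (1/2n)||A(Xw - y)||^2 + (lam + tau)||w||_1, and a plain proximal-gradient (ISTA) solver. The names `lasso_ista` and `sketch_rows`, the synthetic data, and the values of `lam` and `tau` are all hypothetical choices made for this example; the abstract does not specify how much the regularization parameter should be increased.

```python
import numpy as np

def soft_threshold(v, t):
    # Proximal operator of t * ||.||_1
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def lasso_ista(X, y, lam, n_norm, n_iter=500):
    """Minimize (1/(2*n_norm)) * ||X w - y||^2 + lam * ||w||_1 with ISTA."""
    d = X.shape[1]
    # Step size 1/L, where L = ||X||_2^2 / n_norm is the gradient's Lipschitz constant
    step = n_norm / (np.linalg.norm(X, 2) ** 2)
    w = np.zeros(d)
    for _ in range(n_iter):
        grad = X.T @ (X @ w - y) / n_norm
        w = soft_threshold(w - step * grad, step * lam)
    return w

def sketch_rows(X, y, m, rng):
    """Reduce n training instances to m with a Gaussian sketch (assumed choice)."""
    n = X.shape[0]
    A = rng.normal(scale=1.0 / np.sqrt(m), size=(m, n))  # E[A.T @ A] = I
    return A @ X, A @ y

# Synthetic sparse regression problem (illustrative values only)
rng = np.random.default_rng(0)
n, d, s, m = 2000, 200, 5, 400
X = rng.normal(size=(n, d))
w_true = np.zeros(d)
w_true[:s] = 1.0
y = X @ w_true + 0.01 * rng.normal(size=n)

lam = 0.05   # regularization used on the original data
tau = 0.05   # hypothetical increase applied on the sketched data

w_full = lasso_ista(X, y, lam, n_norm=n)
X_s, y_s = sketch_rows(X, y, m, rng)
w_sketch = lasso_ista(X_s, y_s, lam + tau, n_norm=n)

print("recovery error:", np.linalg.norm(w_sketch - w_full))
print("nonzeros (full, sketched):", np.count_nonzero(w_full), np.count_nonzero(w_sketch))
```

The sketched problem has only `m` rows instead of `n`, and the larger regularization parameter is intended to suppress the extra noise the sketch introduces into the off-support coordinates.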

High-Dimensional Analysis for Generalized Nonlinear Regression: from Asymptotics to Algorithm

Sparse deep neural networks for nonparametric estimation in high-dimensional sparse regression

Nonasymptotic theory for two-layer neural networks: Beyond the bias-variance trade-off

High-Dimensional Linear Regression via Implicit Regularization

Effective Minkowski Dimension of Deep Nonparametric Regression: Function Approximation and Statistical Theories

Deep Nonlinear Sufficient Dimension Reduction

Scaling and renormalization in high-dimensional regression

Generalization Error of Generalized Linear Models in High Dimensions

Dimension free ridge regression

High-dimensional analysis of double descent for linear regression with random projections

High-dimensional Model Recovery from Random Sketched Data by Exploring Intrinsic Sparsity

The generalization error of random features regression: Precise asymptotics and double descent curve

A Duality Framework for Generalization Analysis of Random Feature Models and Two-Layer Neural Networks

Computationally Efficient and Statistically Optimal Robust High-Dimensional Linear Regression

Sparse Nonlinear Regression: Parameter Estimation and Asymptotic Inference

On Ridge Estimation in High-dimensional Rotationally Sparse Linear Regression

Meta-Learning with Generalized Ridge Regression: High-dimensional Asymptotics, Optimality and Hyper-covariance Estimation

Nonparametric regression using over-parameterized shallow ReLU neural networks

Conditional regression for the Nonlinear Single-Variable Model

Sub-optimality of the Naive Mean Field approximation for proportional high-dimensional Linear Regression

Low dimensional approximation and generalization of multivariate functions on smooth manifolds using deep ReLU neural networks