Abstract: Learning from large-scale and high-dimensional data remains a computationally challenging problem, though it has received increasing interest recently. To address this issue, randomized reduction methods have been developed that either reduce the dimensionality or reduce the number of training instances to obtain a small sketch of the original data. In this paper, we focus on recovering a high-dimensional classification/regression model from random sketched data. We propose to exploit the intrinsic sparsity of optimal solutions and develop novel methods by increasing the regularization parameter before the sparse regularizer. In particular, (i) for high-dimensional classification problems, we leverage randomized reduction methods to reduce the dimensionality of data and solve a dual formulation on the random sketched data with an introduced sparse regularizer on the dual solution; (ii) for high-dimensional sparse least-squares regression problems, we employ randomized reduction methods to reduce the scale of data and solve a formulation on the random sketched data with an increased regularization parameter before the sparse regularizer. For both classes of problems, by exploiting the intrinsic sparsity of the optimal dual solution or the optimal primal solution, we provide formal theoretical guarantees on the recovery error of the learned models in comparison with the optimal models that are learned from the original data.
Compared with previous studies on randomized reduction for machine learning, the present work enjoys several advantages: (i) the proposed formulations enjoy intuitive geometric explanations; (ii) the theoretical guarantee does not rely on any stringent assumptions about the original data (e.g., low-rankness of the data matrix or linear separability of the data); (iii) the theory covers both smooth and non-smooth loss functions for classification; (iv) the analysis is applicable to a broad class of randomized reduction methods as long as the reduction matrices admit a Johnson–Lindenstrauss-type lemma. We also present empirical studies to support the proposed methods and the presented theory.
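
As a rough illustration of case (ii) only, the following minimal sketch compresses the training instances with a random matrix and then solves an ℓ1-regularized least-squares problem on the sketch with an enlarged regularization parameter. Everything specific here is an assumption made for illustration and is not taken from the paper: the Gaussian sketching matrix (one of many possible Johnson–Lindenstrauss-type reductions), the plain proximal-gradient (ISTA) solver, the unnormalized objective, and the hypothetical names `sketched_lasso`, `lasso_ista`, and `scale`. The abstract does not state how much the regularization parameter should be increased, so `scale` is left as a free parameter.

```python
import numpy as np

def soft_threshold(v, t):
    # Proximal operator of t * ||.||_1 (elementwise shrinkage toward zero).
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def lasso_ista(X, y, lam, n_iter=500):
    # Minimize 0.5 * ||X w - y||^2 + lam * ||w||_1 by proximal gradient (ISTA).
    # ISTA is an illustrative solver choice, not the paper's algorithm.
    step = 1.0 / (np.linalg.norm(X, 2) ** 2)  # 1 / Lipschitz constant of the gradient
    w = np.zeros(X.shape[1])
    for _ in range(n_iter):
        grad = X.T @ (X @ w - y)
        w = soft_threshold(w - step * grad, step * lam)
    return w

def sketched_lasso(X, y, m, lam, scale=2.0, seed=0):
    # Reduce the n training instances to m sketched rows, then solve the
    # same sparse problem with the regularization parameter increased
    # by the (assumed, hypothetical) factor `scale`.
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    A = rng.standard_normal((m, n)) / np.sqrt(m)  # Gaussian sketch (assumption)
    return lasso_ista(A @ X, A @ y, scale * lam)

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    n, d, m = 200, 20, 100
    X = rng.standard_normal((n, d))
    w_true = np.zeros(d)
    w_true[:3] = 1.0                     # sparse ground-truth model
    y = X @ w_true                       # noise-free responses for the demo
    w_hat = sketched_lasso(X, y, m=m, lam=0.01)
    print("recovery error:", np.linalg.norm(w_hat - w_true))
```

The solver only ever sees the `m`-row sketch `(A @ X, A @ y)`, so its per-iteration cost depends on `m` rather than on the original number of instances `n`.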

High-Dimensional Linear Regression via Implicit Regularization

Inference for High-Dimensional Linear Expectile Regression with De-Biasing Method

Understanding Implicit Regularization in Over-Parameterized Single Index Model

Implicit Regularization in Nonconvex Statistical Estimation: Gradient Descent Converges Linearly for Phase Retrieval, Matrix Completion, and Blind Deconvolution

Sparse Estimation Via ℓ_q Optimization Method in High-Dimensional Linear Regression

High-dimensional Inference Via Lipschitz Sparsity-Yielding Regularizers

Implicit Regularization in Deep Matrix Factorization

Implicit Sparse Regularization: The Impact of Depth and Early Stopping

The Role of Fine-tuning: Transfer Learning for High-dimensional M-estimators with Decomposable Regularizers

Implicit Regularization Leads to Benign Overfitting for Sparse Linear Regression

Robust Implicit Regularization via Weight Normalization

Sparse Estimation Via Lower-Order Penalty Optimization Methods in High-Dimensional Linear Regression

Nonparametric regression using over-parameterized shallow ReLU neural networks

A Dynamics Theory of Implicit Regularization in Deep Low-Rank Matrix Factorization

A Constructive Approach to High-dimensional Regression

Smoothing the Edges: Smooth Optimization for Sparse Regularization using Hadamard Overparametrization

Regularization Methods for High-Dimensional Instrumental Variables Regression With an Application to Genetical Genomics

Estimation of Linear Functionals in High-Dimensional Linear Models: From Sparsity to Nonsparsity

An RKHS-based approach to double-penalized regression in high-dimensional partially linear models

A Unified Dynamic Approach to Sparse Model Selection

High-dimensional Model Recovery from Random Sketched Data by Exploring Intrinsic Sparsity