Abstract: Several statistical models for regression of a function $F$ on $\mathbb{R}^d$ avoid the statistical and computational curse of dimensionality, for example by imposing and exploiting geometric assumptions on the distribution of the data (e.g. that its support is low-dimensional), strong smoothness assumptions on $F$, or a special structure of $F$. Among the latter, compositional models, which assume $F=f\circ g$ with $g$ mapping to $\mathbb{R}^r$ with $r\ll d$, have been studied, and include classical single- and multi-index models and recent works on neural networks. While the case where $g$ is linear is rather well understood, much less is known when $g$ is nonlinear, in particular for which $g$'s the curse of dimensionality in estimating $F$, or both $f$ and $g$, may be circumvented. In this paper, we consider a model $F(X):=f(\Pi_\gamma X)$, where $\Pi_\gamma:\mathbb{R}^d\to[0,\mathrm{len}_\gamma]$ is the closest-point projection onto the parameter of a regular curve $\gamma: [0,\mathrm{len}_\gamma]\to\mathbb{R}^d$ and $f:[0,\mathrm{len}_\gamma]\to\mathbb{R}$. The input data $X$ is not low-dimensional and is far from $\gamma$, conditioned on $\Pi_\gamma(X)$ being well-defined. The distribution of the data, $\gamma$, and $f$ are unknown. This model is a natural nonlinear generalization of the single-index model, which corresponds to $\gamma$ being a line. We propose a nonparametric estimator, based on conditional regression, and show that under suitable assumptions, the strongest of which is that $f$ is coarsely monotone, it can achieve the one-dimensional optimal minimax rate for nonparametric regression, up to the level of noise in the observations, and be constructed in time $\mathcal{O}(d^2n\log n)$. All the constants in the learning bounds, in the minimal number of samples required for our bounds to hold, and in the computational complexity are at most low-order polynomials in $d$.
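
To make concrete the remark that the single-index model corresponds to $\gamma$ being a line, here is a short worked reduction. The unit vector $v$ and base point $x_0$ are notation introduced only for this illustration; they do not appear in the abstract.

$$
\gamma(t) = x_0 + t\,v,\quad \|v\|=1,\quad t\in[0,\mathrm{len}_\gamma]
\;\;\Longrightarrow\;\;
\Pi_\gamma(X) = \min\bigl\{\max\{\langle X-x_0,\,v\rangle,\,0\},\,\mathrm{len}_\gamma\bigr\},
$$

$$
F(X) = f(\Pi_\gamma X) = f\bigl(\langle v,\,X-x_0\rangle\bigr)
\quad\text{whenever } \langle X-x_0,\,v\rangle\in[0,\mathrm{len}_\gamma].
$$

In this case $F$ depends on $X$ only through one linear functional of $X$, which is the single-index form.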

What problem does this paper attempt to address?

The core problem this paper addresses is how to estimate the regression function $F$ in the Nonlinear Single-Variable Model by conditional regression while avoiding the curse of dimensionality. Specifically:

1. **Problem Background**:
   - Nonparametric regression in high-dimensional space usually suffers from the curse of dimensionality: as the dimension of the input data increases, the learning rate deteriorates sharply.
   - To address this, much prior work introduces additional assumptions or structured models, such as a low-dimensional manifold assumption on the data, strong smoothness assumptions, or special function structures.
2. **Problems Proposed in the Paper**:
   - The paper focuses on the nonlinear single-variable model $F(X) := f(\Pi_\gamma X)$, where $\Pi_\gamma:\mathbb{R}^d\rightarrow[0,\mathrm{len}_\gamma]$ is the closest-point projection onto the parameter of a regular curve $\gamma:[0,\mathrm{len}_\gamma]\rightarrow\mathbb{R}^d$, and $f:[0,\mathrm{len}_\gamma]\rightarrow\mathbb{R}$ is a univariate function.
   - The input data $X$ is not low-dimensional and is far from the curve $\gamma$, conditioned on $\Pi_\gamma(X)$ being well-defined.
   - The distribution of the data, the curve $\gamma$, and the function $f$ are all unknown.
3. **Objectives**:
   - Propose a nonparametric estimator based on conditional regression that, under suitable assumptions (the strongest being that $f$ is coarsely monotone), achieves the one-dimensional optimal minimax rate for nonparametric regression, up to the level of noise in the observations, rather than a rate that degrades with the input dimension.
   - The estimator can be constructed in time $\mathcal{O}(d^2n\log n)$, and all constants are at most low-order polynomials in the dimension $d$.
4. **Main Contributions**:
   - Introduce a nonlinear single-variable model that sits between the semi-parametric single-index model and the nonparametric function-composition model. The inner function $g=\Pi_\gamma$ and the outer function $f$ are both nonlinear, but the inner function has a specific geometric structure.
   - Construct an estimator that overcomes the curse of dimensionality and achieves the same optimal learning rate as one-dimensional nonparametric regression.
   - Provide an efficient algorithm, with time complexity near-linear in the number of samples $n$, to construct this estimator.

In summary, the paper addresses the curse of dimensionality in high-dimensional nonparametric regression by introducing a nonlinear single-variable model and a conditional regression method, with theoretical guarantees and an efficient algorithm.
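
The model $F(X) = f(\Pi_\gamma X)$ described above can be illustrated numerically. The sketch below is not the paper's estimator: it only evaluates the model for a known curve, under the assumption that $\gamma$ is approximated by a polyline through sampled points, so that the projection returns an approximate arc-length parameter. All names (`arclength_knots`, `project_to_curve`, `line`, `f`) are hypothetical and chosen for this example.

```python
import numpy as np

def arclength_knots(points):
    """Cumulative arc length at each vertex of a polyline (rows of `points`)."""
    seg = np.linalg.norm(np.diff(points, axis=0), axis=1)
    return np.concatenate([[0.0], np.cumsum(seg)])

def project_to_curve(X, points):
    """Approximate Pi_gamma: arc-length parameter of the closest polyline point.

    X: (n, d) inputs; points: (m + 1, d) vertices approximating gamma
    (assumption: the curve is well approximated by this polyline).
    """
    knots = arclength_knots(points)
    a = points[:-1]                               # segment start points, (m, d)
    v = points[1:] - a                            # segment directions, (m, d)
    seg_len2 = np.einsum("md,md->m", v, v)        # squared segment lengths
    diff = X[:, None, :] - a[None, :, :]          # (n, m, d)
    # Fractional position of the projection of each X[i] on each segment.
    t = np.clip(np.einsum("nmd,md->nm", diff, v) / seg_len2, 0.0, 1.0)
    closest = a[None, :, :] + t[..., None] * v[None, :, :]
    dist2 = np.sum((X[:, None, :] - closest) ** 2, axis=2)
    j = np.argmin(dist2, axis=1)                  # nearest segment per input
    rows = np.arange(X.shape[0])
    return knots[j] + t[rows, j] * np.sqrt(seg_len2[j])

# Special case: gamma is a line segment along the first axis in R^5,
# so the model reduces to a single-index model with index X[:, 0].
line = np.outer(np.linspace(0.0, 1.0, 11), np.eye(5)[0])
X = np.array([[0.30, 5.0, -2.0, 0.0, 1.0],
              [0.75, -1.0, 0.0, 0.0, 0.0]])
f = np.sin                                        # monotone on [0, 1]
s = project_to_curve(X, line)                     # approximately [0.30, 0.75]
F = f(s)                                          # model values f(Pi_gamma X)
```

The inputs here are deliberately placed far from the curve in the directions orthogonal to it, mirroring the setting in which $X$ is not concentrated near $\gamma$; only the projection parameter affects $F$.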

Conditional regression for the Nonlinear Single-Variable Model
