Abstract: U-statistics are a fundamental class of statistics that arise when modeling quantities of interest defined by multi-subject responses. They generalize the empirical mean of a random variable X to sums over every m-tuple of distinct observations of X. Stute (Ann Probab 19(2):812–825, 1991) introduced a class of so-called conditional U-statistics, which may be viewed as a generalization of the Nadaraya–Watson estimates of a regression function. Stute proved their strong pointwise consistency to $$r^{(m)}(\varphi ,\mathbf{t}):=\mathbb{E}[\varphi (Y_{1},\ldots ,Y_{m})\mid (X_{1},\ldots ,X_{m})=\mathbf{t}], \quad \text{for } \mathbf{t}\in \mathbb{R}^{dm}.$$ In the present paper, we introduce the k-nearest-neighbors (k-NN) estimator of conditional U-statistics depending on an infinite-dimensional covariate. A sharp uniform-in-the-number-of-neighbors (UINN) limit law for the proposed estimator is presented. This result allows the number of neighbors to vary within the complete range for which the estimator is consistent. Consequently, it provides a practical guideline for selecting the optimal number of neighbors in nonparametric functional data analysis. In addition, uniform consistency is established over $\varphi \in \mathscr{F}$ for a suitably restricted class $\mathscr{F}$, in both the bounded and the unbounded case, under some moment conditions and some mild conditions on the model. This paper unifies the approaches of several other recent papers. As a by-product of our proofs, we state consistency results for the k-NN conditional U-statistics under random censoring, which are uniform in the number of neighbors.
The theoretical uniform consistency results, established in this paper, are (or will be) key tools for many further developments in functional data analysis.
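
To make the estimated quantity concrete, the following Python code is a minimal sketch of a k-NN estimate of $r^{(m)}(\varphi ,\mathbf{t})$. It is not the estimator as defined in the paper, and it rests on several assumptions: the functional covariates are taken to be curves discretized on a common grid, the Euclidean norm on that grid stands in for a semi-metric, the kernel is the uniform kernel, and the bandwidth at each component of $\mathbf{t}$ is the distance to its k-th nearest observation. The function name `knn_conditional_u_stat` and its arguments are hypothetical.

```python
import itertools
import numpy as np


def knn_conditional_u_stat(X, Y, t, k, phi):
    """Sketch of a k-NN estimate of E[phi(Y_1..Y_m) | (X_1..X_m) = t].

    X   : (n, p) array, n covariate curves discretized on p grid points (assumption)
    Y   : (n,) array of responses
    t   : (m, p) array, the m conditioning curves
    k   : number of neighbors, 1 <= k <= n
    phi : function of m responses (the U-statistic kernel)
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    t = np.asarray(t, dtype=float)
    n, m = X.shape[0], t.shape[0]

    # dist[i, j] = d(X_i, t_j); Euclidean norm used as a stand-in semi-metric.
    dist = np.linalg.norm(X[:, None, :] - t[None, :, :], axis=2)

    # Data-driven bandwidth: distance from t_j to its k-th nearest observation.
    H = np.sort(dist, axis=0)[k - 1]

    # Uniform kernel K(u) = 1{u <= 1} applied to d(X_i, t_j) / H_j.
    W = (dist <= H).astype(float)

    cols = np.arange(m)
    num = den = 0.0
    # Sum over all m-tuples of distinct indices; the cost is O(n^m).
    for idx in itertools.permutations(range(n), m):
        rows = list(idx)
        w = W[rows, cols].prod()
        if w > 0:
            num += w * phi(*Y[rows])
            den += w
    # No distinct m-tuple received positive weight: the estimate is undefined.
    return num / den if den > 0 else float("nan")
```

With m = 1 and `phi` the identity, this sketch reduces to the ordinary k-NN regression estimate, that is, the average of the responses of the k nearest observations. The uniform results described in the abstract concern how such an estimate behaves as `k` varies over a whole range, rather than at a single fixed value.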

Optimal Nonparametric Inference with Two-Scale Distributional Nearest Neighbors

Distributed Bootstrap Simultaneous Inference for High-Dimensional Quantile Regression

Distributionally Robust Weighted $k$-Nearest Neighbors

Two-dimensional Nearest Neighbor Discriminant Analysis

Scalable Subsampling Inference for Deep Neural Networks

Extrapolation Towards Imaginary $0$-Nearest Neighbour and Its Improved Convergence Rate

Deep Learning meets Nonparametric Regression: Are Weight-Decayed DNNs Locally Adaptive?

Under-bagging Nearest Neighbors for Imbalanced Classification

Two-Sample Inference in Highly Dispersed Negative Binomial Models

A Kernel-Based Conditional Two-Sample Test Using Nearest Neighbors (with Applications to Calibration, Regression Curves, and Simulation-Based Inference)

Bagging Nearest-Neighbor Prediction Independence Test: an Efficient Method for Nonlinear Dependence of Two Continuous Variables

NDOD: an Efficient Neighboring Dependent Outlier Detector for Bias Distributed Large Datasets

Distributed Semi-Supervised Sparse Statistical Inference

Surprisal Driven $k$-NN for Robust and Interpretable Nonparametric Learning

Bayes-Decisive Linear KNN with Adaptive Nearest Neighbors

On adaptivity and minimax optimality of two-sided nearest neighbors

Uniform consistency and uniform in number of neighbors consistency for nonparametric regression estimates and conditional U-statistics involving functional data

Confidence Intervals Based on Survey Data with Nearest Neighbor Imputation

A Bi-level Nonlinear Eigenvector Algorithm for Wasserstein Discriminant Analysis

Nonparametric regression for repeated measurements with deep neural networks

Transfer Learning for Nonparametric Classification: Minimax Rate and Adaptive Classifier