Adaptive Noisy Data Augmentation for Regularized Estimation and Inference of Generalized Linear Models

Yinan Li, Fang Liu
DOI: https://doi.org/10.1109/compsac54236.2022.00051
2022-01-01
Abstract: We propose the AdaPtive Noise Augmentation (PANDA) procedure to regularize the estimation and inference of generalized linear models (GLMs). PANDA iteratively optimizes the objective function given noise-augmented data to obtain regularized model estimates. The augmented noises are designed to achieve various regularization effects, including l0, bridge (lasso and ridge included), elastic net, adaptive lasso, and SCAD, as well as group lasso and fused ridge. We examine the tail bound of the noise-augmented loss function and establish the almost sure convergence of the noise-augmented loss function and its minimizer to the expected penalized loss function and its minimizer, respectively. PANDA exhibits ensemble learning behaviors that help further decrease the generalization error of trained GLMs. We also derive the asymptotic distributions of the PANDA-regularized parameter estimates, which support inference on the GLM parameters. Computationally, PANDA is easy to code and can leverage existing software for fitting unregularized GLMs. On simulated and real-life data, PANDA performs better than or similarly to existing approaches that offer the same types of regularizers. We show that inferences through PANDA achieve nominal or near-nominal coverage and are far more efficient than those from a popular existing post-selection procedure.
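To make the noise-augmentation idea concrete, here is a minimal sketch of the simplest case only: ridge-type regularization in ordinary linear regression. It is not the authors' PANDA algorithm (which is iterative and adapts the noise to the current estimates); it only illustrates the standard fact that appending zero-response rows of Gaussian noise to the data and then fitting an unregularized model approximates a penalized fit. The function names, the noise variance `lam / n_e`, and the simulated data are assumptions made for this illustration.

```python
import numpy as np

def noise_augmented_ols(X, y, lam, n_e, rng):
    """Unregularized least squares on data augmented with noise rows.

    Assumption for this sketch: each augmented covariate is drawn from
    N(0, lam / n_e) and the augmented responses are 0, so the extra
    squared-error terms have expectation lam * ||beta||^2 (a ridge penalty).
    """
    p = X.shape[1]
    E = rng.normal(0.0, np.sqrt(lam / n_e), size=(n_e, p))  # noise rows
    X_aug = np.vstack([X, E])
    y_aug = np.concatenate([y, np.zeros(n_e)])
    beta, *_ = np.linalg.lstsq(X_aug, y_aug, rcond=None)
    return beta

def ridge_closed_form(X, y, lam):
    """Reference ridge solution (X'X + lam I)^{-1} X'y, without intercept."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

# Simulated data (hypothetical, for illustration only).
rng = np.random.default_rng(0)
n, p = 200, 5
X = rng.normal(size=(n, p))
beta_true = np.array([2.0, -1.0, 0.0, 0.5, 0.0])
y = X @ beta_true + rng.normal(scale=0.5, size=n)

lam = 10.0
beta_noise = noise_augmented_ols(X, y, lam, n_e=100_000, rng=rng)
beta_ridge = ridge_closed_form(X, y, lam)
beta_ols, *_ = np.linalg.lstsq(X, y, rcond=None)

print("noise-augmented:", np.round(beta_noise, 3))
print("closed-form ridge:", np.round(beta_ridge, 3))
```

With a large number of noise rows, the augmented cross-product matrix is close to `X'X + lam * I`, so the two estimates should nearly coincide. The other penalties named in the abstract would require different noise designs that this sketch does not attempt.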