Online Quantile Regression

Yinan Shen, Dong Xia, Wen-Xin Zhou
2024-02-19
Abstract: This paper addresses the challenge of integrating sequentially arriving data within the quantile regression framework, where the number of features is allowed to grow with the number of observations, the horizon is unknown, and memory is limited. We employ stochastic sub-gradient descent to minimize the empirical check loss and study its statistical properties and regret performance. In our analysis, we unveil the delicate interplay between updating iterates based on individual observations versus batches of observations, revealing distinct regularity properties in each scenario. Our method ensures long-term optimal estimation irrespective of the chosen update strategy. Importantly, our contributions go beyond prior works by achieving exponential-type concentration inequalities and attaining optimal regret and error rates that exhibit only short-term sensitivity to initial errors. A key insight from our study is the delicate statistical analyses and the revelation that appropriate stepsize schemes significantly mitigate the impact of initial errors on subsequent errors and regrets. This underscores the robustness of stochastic sub-gradient descent in handling initial uncertainties, emphasizing its efficacy in scenarios where the sequential arrival of data introduces uncertainties regarding both the horizon and the total number of observations. Additionally, when the initial error rate is well-controlled, there is a trade-off between short-term error rate and long-term optimality. Due to the lack of delicate statistical analysis for squared loss, we also briefly discuss its properties and proper schemes. Extensive simulations support our theoretical findings.
Statistics Theory, Information Theory, Methodology
What problem does this paper attempt to address?
The paper addresses online quantile regression: how to fit a quantile regression model when data arrive sequentially, memory is limited, and neither the horizon nor the total number of observations is known in advance. Least squares and robust regression methods target the conditional mean, whereas quantile regression captures how the relationship between features and the response varies across the conditional distribution, giving a more complete picture of it. The authors use stochastic sub-gradient descent to minimize the empirical check loss and study the statistical properties and regret of the resulting iterates. The main contribution is a set of exponential-type concentration inequalities, together with optimal regret and error rates that are sensitive to the initial error only in the short term. The paper also contrasts updates based on a single observation with updates based on a batch of observations and shows that the two settings have distinct regularity properties, while long-term optimal estimation holds under either strategy. An appropriate step-size scheme is shown to substantially reduce the influence of the initial error on later errors and regret, which underlines the robustness of stochastic sub-gradient descent to poor initialization. When the initial error is already well controlled, the analysis identifies a trade-off between the short-term error rate and long-term optimality. Overall, the paper targets statistical optimality for online quantile regression under heavy-tailed noise, provides error bounds that hold for any initialization, and proposes step-size schemes supported by that analysis.
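
To make the summary concrete, the following is a minimal sketch of single-observation stochastic sub-gradient descent on the check loss rho_tau(u) = u * (tau - 1{u < 0}). It is an illustration under stated assumptions, not the authors' algorithm: the step-size schedule c / (t + t0), the function names, the zero initialization, and the simulated Gaussian-feature, Cauchy-noise data are all hypothetical choices made here for demonstration. The paper's own step-size schemes are not reproduced.

```python
import math
import random


def check_loss(residual, tau):
    """Check (pinball) loss: rho_tau(u) = u * (tau - 1{u < 0})."""
    return residual * (tau - (1.0 if residual < 0 else 0.0))


def subgradient_step(beta, x, y, tau, stepsize):
    """One stochastic sub-gradient update using a single observation (x, y)."""
    residual = y - sum(b * xi for b, xi in zip(beta, x))
    # A sub-gradient of rho_tau(y - <x, beta>) in beta is
    # -(tau - 1{residual < 0}) * x, so descent adds stepsize * weight * x.
    weight = tau - (1.0 if residual < 0 else 0.0)
    return [b + stepsize * weight * xi for b, xi in zip(beta, x)]


def online_quantile_regression(stream, dim, tau=0.5, c=5.0, t0=10.0):
    """Process observations one at a time, storing only the current iterate.

    The schedule stepsize_t = c / (t + t0) is an illustrative assumption,
    not the step-size scheme analyzed in the paper.
    """
    beta = [0.0] * dim  # arbitrary initialization (assumption)
    for t, (x, y) in enumerate(stream, start=1):
        beta = subgradient_step(beta, x, y, tau, c / (t + t0))
    return beta


def simulate_stream(n, beta_star, rng):
    """Hypothetical data generator: Gaussian features, Cauchy noise."""
    for _ in range(n):
        x = [rng.gauss(0.0, 1.0) for _ in beta_star]
        noise = math.tan(math.pi * (rng.random() - 0.5))  # standard Cauchy draw
        yield x, sum(b * xi for b, xi in zip(beta_star, x)) + noise


if __name__ == "__main__":
    rng = random.Random(0)
    beta_star = [1.0, -2.0, 0.5]
    stream = simulate_stream(20000, beta_star, rng)
    print(online_quantile_regression(stream, dim=3))
```

Because each sub-gradient is bounded by the norm of the feature vector regardless of how large the residual is, a single extreme response cannot move the iterate far, which is one intuition for why this type of update tolerates heavy-tailed noise. Only the current iterate is kept in memory, and the loop never needs to know the horizon.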