A robust quantile regression for bounded variables based on the Kumaraswamy Rectangular distribution
Nobre, Juvêncio
DOI: https://doi.org/10.1007/s11222-024-10381-0
IF: 2.3241
2024-02-11
Statistics and Computing
Abstract: Quantile regression (QR) models offer an interesting alternative to ordinary regression models for the response mean. Besides allowing a more appropriate characterization of the response distribution, the former are less sensitive to outlying observations than the latter. Indeed, QR models allow modeling other characteristics of the response distribution, such as the lower and/or upper tails. However, in the presence of outlying observations, the estimates can still be affected. In this context, a robust parametric quantile regression model for bounded responses is developed, based on a new distribution, the Kumaraswamy Rectangular (KR) distribution. The KR model corresponds to a finite mixture structure similar to the Beta Rectangular distribution. Consequently, the KR distribution has heavier tails than the Kumaraswamy model. Indeed, we show that the corresponding KR quantile regression model is more robust and flexible than the usual Kumaraswamy one. Bayesian inference, which includes parameter estimation, model fit assessment, model comparison, and influence analysis, is developed through a hybrid-based MCMC approach. Since the quantile of the KR distribution is not analytically tractable, we model the conditional quantile through a suitable data augmentation scheme. To link both quantiles in terms of a regression structure, a two-step estimation algorithm under a Bayesian approach is proposed to obtain a numerical approximation of the posterior distributions of the parameters of the regression structure for the KR quantile. This algorithm combines a Markov Chain Monte Carlo algorithm with the Ordinary Least Squares approach. Our proposal proved robust against outlying observations in the response while keeping the estimation process simple and adding little computational complexity.
A simulation study showed the effectiveness of our estimation method, while two other studies showed some benefits of the proposed model in terms of robustness and flexibility. To illustrate the adequacy of our approach in the presence of outlying observations, we analyzed two data sets of socio-economic indicators from Brazil and compared our approach with alternatives.
statistics & probability,computer science, theory & methods
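
The abstract notes that the KR quantile has no closed form, unlike the Kumaraswamy quantile. The following is a minimal sketch of why, not the authors' method. It assumes the KR distribution is a two-component mixture of a Uniform(0, 1) and a Kumaraswamy(a, b) distribution with mixing weight theta on the uniform part, by analogy with the Beta Rectangular distribution; the paper's actual parameterization may differ. The function names and the bisection solver are illustrative choices only.

```python
def kum_cdf(y, a, b):
    """Kumaraswamy CDF on (0, 1): F(y) = 1 - (1 - y^a)^b."""
    return 1.0 - (1.0 - y ** a) ** b


def kum_quantile(q, a, b):
    """Closed-form Kumaraswamy quantile, obtained by inverting the CDF."""
    return (1.0 - (1.0 - q) ** (1.0 / b)) ** (1.0 / a)


def kr_cdf(y, a, b, theta):
    """ASSUMED KR CDF: weight theta on Uniform(0, 1), 1 - theta on Kumaraswamy."""
    return theta * y + (1.0 - theta) * kum_cdf(y, a, b)


def kr_quantile(q, a, b, theta, tol=1e-12):
    """Numerical KR quantile by bisection.

    The mixture CDF is continuous and increasing on [0, 1], so bisection
    converges, but the equation kr_cdf(y) = q cannot in general be solved
    for y in closed form.
    """
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if kr_cdf(mid, a, b, theta) < q:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    a, b = 2.0, 3.0
    for theta in (0.0, 0.2, 0.5):
        print(theta, kr_quantile(0.5, a, b, theta))
```

With theta = 0 the numerical quantile reduces to the closed-form Kumaraswamy quantile, and with theta = 1 it reduces to the uniform quantile q, which gives two simple sanity checks for the sketch.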