Transforming response values in small area prediction

Shonosuke Sugasawa, Tatsuya Kubokawa
DOI: https://doi.org/10.48550/arXiv.1509.03951
2017-03-30
Abstract: In real applications of small area estimation, one often encounters data with positive response values. The use of a parametric transformation for positive response values in the Fay-Herriot model is proposed for such a case. An asymptotically unbiased small area predictor is derived and a second-order unbiased estimator of the mean squared error is established using the parametric bootstrap. Through simulation studies, a finite sample performance of the proposed predictor and the MSE estimator is investigated. The methodology is also successfully applied to Japanese survey data.
Methodology
What problem does this paper attempt to address?
This paper addresses a problem in small area estimation (SAE): when the response values are positive and their distribution is skewed, the traditional Fay-Herriot model is poorly suited to the data. The authors propose applying a parametric transformation, the dual power transformation (DPT), to positive response values in order to improve prediction accuracy and reduce bias.

The main contributions of the paper are as follows:

1. **Propose an improved small area estimation method**: By introducing the parametric Dual Power Transformation (DPT), the traditional Fay-Herriot model is extended so that it can better handle data sets with positive response values. The method addresses the skewness of the data and is intended to improve the accuracy and stability of prediction.
2. **Derive the best predictor and mean squared error estimation**: Based on the transformed Fay-Herriot model, the authors derive the Best Predictor (BP) and the Empirical Best Predictor (EBP), which is asymptotically unbiased, and establish a second-order unbiased Mean Squared Error (MSE) estimator using the parametric bootstrap.
3. **Verify the effectiveness of the method**: Through simulation studies and an application to real data, the authors examine the finite-sample performance of the proposed method. In particular, on household income and expenditure survey data from Japan, the method gives better prediction results than traditional methods.

### Specific problem description

In small area estimation, estimates obtained directly from sample surveys often have large variability because the sample size in each area is small. To address this, model-based methods such as the Fay-Herriot model are usually adopted. By "borrowing strength" from related areas, these methods improve the reliability of the estimates.
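The "borrowing strength" idea can be illustrated with a minimal sketch of the standard (untransformed) Fay-Herriot best predictor. It assumes the usual area-level model \( y_i = x_i^t \beta + v_i + \epsilon_i \) with \( v_i \sim N(0, A) \) and \( \epsilon_i \sim N(0, D_i) \), and it treats \( \beta \) and \( A \) as known, which is a simplification: in practice they are estimated. The function name and the toy numbers are hypothetical and are not taken from the paper.

```python
import numpy as np


def fay_herriot_predictor(y, X, beta, A, D):
    """Best predictor of x_i' beta + v_i under the basic Fay-Herriot model.

    Assumes beta (regression coefficients) and A (variance of the area
    effect) are known; D holds the sampling variances D_i of each area.
    """
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    D = np.asarray(D, dtype=float)

    # Regression-synthetic part shared across all areas.
    synthetic = X @ np.asarray(beta, dtype=float)

    # Shrinkage weight: close to 1 when the direct estimate is precise
    # (small D_i), close to 0 when it is noisy (large D_i).
    gamma = A / (A + D)

    # Weighted compromise between the direct estimate and the synthetic one.
    return synthetic + gamma * (y - synthetic)


# Toy example with two areas and an intercept-only model (made-up numbers).
y = [10.0, 20.0]          # direct survey estimates
X = [[1.0], [1.0]]        # intercept only
beta = [15.0]             # assumed known regression coefficient
A = 1.0                   # assumed known variance of the area effect
D = [1.0, 3.0]            # sampling variances

pred = fay_herriot_predictor(y, X, beta, A, D)
print(pred)  # [12.5  16.25]
```

The second area has the larger sampling variance, so its direct estimate is pulled more strongly toward the regression value of 15.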
However, the traditional Fay-Herriot model assumes that the response values are normally distributed, which is unsuitable for positive response values (such as income or expenditure), because such data usually have a skewed distribution and a nonlinear relationship.

### Solution

1. **Dual Power Transformation (DPT)**: The authors propose using the Dual Power Transformation to handle data sets with positive response values. The DPT is a parametric transformation whose range is the whole real line, so it does not suffer from the truncation problem of the Box-Cox transformation. Its form is:

   \[ h_\lambda(x) = \begin{cases} (2\lambda)^{-1}(x^\lambda - x^{-\lambda}) & \text{if } \lambda > 0 \\ \log x & \text{if } \lambda = 0 \end{cases} \]

2. **Improved Fay-Herriot model**: Applying the Dual Power Transformation to the response in the Fay-Herriot model gives the transformed model:

   \[ h_\lambda(y_i) = x_i^t \beta + v_i + \epsilon_i, \quad i = 1, \ldots, m \]

   where \( v_i \sim N(0, A) \) is the area-level random effect and \( \epsilon_i \sim N(0, D_i) \) is the sampling error.

3. **Best predictor and MSE estimation**: Based on the transformed model, the authors derive the best predictor \( \tilde{\mu}_i \) and the empirical best predictor \( \hat{\mu}_i \), and establish a second-order unbiased MSE estimator.

### Application and verification

1. **Simulation study**: Simulation studies compare the performance of the proposed DPT method with other methods (such as the Fay-Herriot model with a logarithmic transformation and the traditional Fay-Herriot model). The results show that the DPT method performs better in most cases, especially in the response