Small area estimation with generalized random forests: Estimating poverty rates in Mexico

Nicolas Frink, Timo Schmid
DOI: https://doi.org/10.48550/arXiv.2406.03861
2024-06-06
Abstract: Identifying and addressing poverty is challenging in administrative units with limited information on income distribution and well-being. To overcome this obstacle, small area estimation methods have been developed to provide reliable and efficient estimators at disaggregated levels, enabling informed decision-making by policymakers despite the data scarcity. From a theoretical perspective, we propose a robust and flexible approach for estimating poverty indicators based on binary response variables within the small area estimation context: the generalized mixed effects random forest. Our method employs machine learning techniques to identify predictive, non-linear relationships from data, while also modeling hierarchical structures. Mean squared error estimation is explored using a parametric bootstrap. From an applied perspective, we examine the impact of information loss due to converting continuous variables into binary variables on the performance of small area estimation methods. We evaluate the proposed point and uncertainty estimates in both model- and design-based simulations. Finally, we apply our method to a case study revealing spatial patterns of poverty in the Mexican state of Tlaxcala.
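To make the binary-response setting concrete, the following is a minimal sketch of how a continuous income variable might be converted into a binary poverty indicator and summarized as an unweighted poverty rate per area. It is not the generalized mixed effects random forest described in the abstract. The function name `poverty_rates`, the toy data, and the poverty line of 100 are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def poverty_rates(records, poverty_line):
    """Return the unweighted share of units below poverty_line per area.

    records: iterable of (area, income) pairs (hypothetical input layout).
    """
    counts = defaultdict(lambda: [0, 0])  # area -> [number poor, number of units]
    for area, income in records:
        # Binarization step: the continuous income becomes a 0/1 response.
        is_poor = 1 if income < poverty_line else 0
        counts[area][0] += is_poor
        counts[area][1] += 1
    return {area: poor / total for area, (poor, total) in counts.items()}


# Hypothetical toy sample of (area, income) pairs.
sample = [("A", 50.0), ("A", 150.0), ("A", 80.0), ("B", 200.0), ("B", 90.0)]
rates = poverty_rates(sample, poverty_line=100.0)
```

In this toy sample, incomes of 50 and 80 in area A are mapped to the same binary value, which is the kind of information loss from binarization that the abstract refers to. With only a few sampled units per area, direct rates like these are unstable, which is the situation that motivates model-based small area estimators.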