Deductible imputation in administrative medical claims datasets

Betsy Q. Cliff, Julia C. P. Eddelbuettel, Mark K. Meiselbach, Matthew D. Eisenberg
DOI: https://doi.org/10.1111/1475-6773.14278
2024-01-19
Health Services Research
Abstract:
Objective: To validate imputation methods used to infer plan‐level deductibles and determine which enrollees are in high‐deductible health plans (HDHPs) in administrative claims datasets.
Data Sources and Study Setting: 2017 medical and pharmaceutical claims from OptumLabs Data Warehouse for US individuals <65 continuously enrolled in an employer‐sponsored plan. Data include enrollee and plan characteristics, deductible spending, plan spending, and actual plan‐level deductibles.
Study Design: We impute plan deductibles using four methods: (1) parametric prediction using individual‐level spending; (2) parametric prediction with imputation and plan characteristics; (3) highest plan‐specific mode of individual annual deductible spending; and (4) deductible spending at the 80th percentile among individuals meeting their deductible. We compare deductibles' levels and categories for imputed versus actual deductibles.
Data Collection/Extraction Methods: Not applicable.
Principal Findings: All methods had a positive predictive value (PPV) for determining high‐ versus low‐deductible plans of ≥87%; negative predictive values (NPV) were lower. The method imputing plan‐specific deductible spending modes was most accurate and least computationally intensive (PPV: 95%; NPV: 91%). This method also best correlated with actual deductible levels; 69% of imputed deductibles were within $250 of the true deductible.
Conclusions: In the absence of plan structure data, imputing plan‐specific modes of individual annual deductible spending best correlates with true deductibles and best predicts enrollees in HDHPs.
Keywords: health care sciences & services, health policy & services
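
As an illustration of the third method named in the abstract (the highest plan‐specific mode of individual annual deductible spending), the following is a minimal sketch and not the authors' implementation. The function names, the toy records, the choice to ignore enrollees with zero deductible spending, the rounding to whole dollars, and the $1,300 high‐deductible cutoff are all assumptions made for this example; the abstract does not specify them. The second function shows one way PPV and NPV could be computed at the plan level when actual deductibles are available.

```python
from collections import Counter, defaultdict


def impute_plan_deductibles(records):
    """Impute each plan's deductible as the highest mode of enrollee spending.

    records: iterable of (plan_id, annual_deductible_spending) pairs.
    Assumption: enrollees with zero deductible spending are ignored, and
    spending is rounded to whole dollars before counting.
    """
    by_plan = defaultdict(list)
    for plan_id, spend in records:
        if spend > 0:
            by_plan[plan_id].append(round(spend))

    imputed = {}
    for plan_id, spends in by_plan.items():
        counts = Counter(spends)
        top_count = max(counts.values())
        # If several amounts tie for most frequent, keep the highest one.
        imputed[plan_id] = max(v for v, c in counts.items() if c == top_count)
    return imputed


def ppv_npv(imputed, actual, threshold):
    """Plan-level PPV and NPV for the high- versus low-deductible label."""
    tp = fp = tn = fn = 0
    for plan_id, imputed_value in imputed.items():
        predicted_high = imputed_value >= threshold
        truly_high = actual[plan_id] >= threshold
        if predicted_high and truly_high:
            tp += 1
        elif predicted_high:
            fp += 1
        elif truly_high:
            fn += 1
        else:
            tn += 1
    ppv = tp / (tp + fp) if (tp + fp) else None
    npv = tn / (tn + fn) if (tn + fn) else None
    return ppv, npv


# Hypothetical toy data: (plan_id, annual deductible spending in dollars).
records = [
    ("A", 1500), ("A", 1500), ("A", 320), ("A", 1500),
    ("B", 500), ("B", 500), ("B", 120), ("B", 0),
    ("C", 2000), ("C", 2000), ("C", 750), ("C", 750),
]
actual = {"A": 1500, "B": 500, "C": 2000}  # hypothetical true deductibles
HDHP_THRESHOLD = 1300  # illustrative cutoff, not taken from the abstract

imputed = impute_plan_deductibles(records)
ppv, npv = ppv_npv(imputed, actual, HDHP_THRESHOLD)
print(imputed, ppv, npv)
```

The mode works as an estimator because enrollees who meet their deductible all stop accruing deductible spending at the same amount, so that amount tends to be the most common nonzero value within a plan. Taking the highest mode on ties guards against a cluster at some lower common amount.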