Abstract: There are various applications in which companies need to decide to which individuals they should best allocate treatment. To support such decisions, uplift models are applied to predict treatment effects on an individual level. Based on the predicted treatment effects, individuals can be ranked and treatment allocation can be prioritized according to this ranking. An implicit assumption, which has not been questioned in the previous uplift modeling literature, is that this treatment prioritization approach tends to bring individuals with high treatment effects to the top and individuals with low treatment effects to the bottom of the ranking. In our research, we show that heteroskedasticity in the training data can bias the uplift model ranking: individuals with the highest treatment effects can accumulate in large numbers at the bottom of the ranking. We explain theoretically how heteroskedasticity can bias the ranking of uplift models and show this process in a simulation and on real-world data. We argue that this problem of ranking bias due to heteroskedasticity might occur in many real-world applications and requires modification of the treatment prioritization to achieve an efficient treatment allocation.
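The mechanism described in the abstract can be illustrated with a small simulation. The sketch below is not the paper's experiment: the segment sizes, effect sizes, noise levels, and the function name `simulate_ranking_bias` are all assumptions chosen for illustration. It uses a per-cell difference in means between treated and control outcomes as a very simple stand-in for a T-learner, and checks which segment ends up at the extremes of the ranking.

```python
import numpy as np

def simulate_ranking_bias(n_cells=2000, m=5, seed=0):
    """Toy simulation (hypothetical setup, not the paper's design).

    Half of the cells have a high true effect and noisy outcomes,
    the other half a low true effect and nearly noise-free outcomes.
    """
    rng = np.random.default_rng(seed)
    high = np.arange(n_cells) < n_cells // 2
    tau = np.where(high, 1.0, 0.2)      # true treatment effect per cell (assumed)
    sigma = np.where(high, 10.0, 0.1)   # outcome noise sd per cell (assumed)

    # m treated and m control outcomes per cell; the baseline outcome is 0.
    y_t = tau[:, None] + sigma[:, None] * rng.standard_normal((n_cells, m))
    y_c = sigma[:, None] * rng.standard_normal((n_cells, m))

    # Estimated uplift: difference of the two arm means within each cell.
    tau_hat = y_t.mean(axis=1) - y_c.mean(axis=1)

    # Rank cells by estimated uplift, highest first, and inspect the deciles.
    order = np.argsort(-tau_hat)
    k = n_cells // 10
    top, bottom = order[:k], order[-k:]
    return {
        "share_high_top": float(high[top].mean()),
        "share_high_bottom": float(high[bottom].mean()),
        "true_effect_bottom": float(tau[bottom].mean()),
        "true_effect_overall": float(tau.mean()),
    }

if __name__ == "__main__":
    for name, value in simulate_ranking_bias().items():
        print(f"{name}: {value:.3f}")
```

Under these assumed settings, the noisy high-effect cells dominate both the top and the bottom decile, so the bottom of the ranking has a higher true treatment effect than the population average, which is the kind of ranking bias the abstract describes.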

What problem does this paper attempt to address?

### Problems Addressed by the Paper

The paper explores the impact of heteroskedasticity on uplift modeling. Specifically, it points out that when heteroskedasticity is present in the training data, the ranking produced by the uplift model may be biased. This bias can lead to individuals with the highest treatment effects being heavily concentrated at the bottom of the ranking, rather than at the expected top. This issue has not been adequately addressed in previous uplift modeling literature.

### Background and Motivation

Uplift models are widely used in scenarios such as online marketing, telecom customer retention, and internet ad selection. These models support companies in deciding which individuals to allocate treatments to by predicting individual-level treatment effects. It is usually assumed that the ranking of the uplift model will place individuals with high treatment effects at the top and those with low treatment effects at the bottom. However, the paper points out that this assumption may not hold if heteroskedasticity is present in the training data.

### Key Findings

1. **Ranking Bias Caused by Heteroskedasticity**:
   - Heteroskedasticity refers to the situation where the level of noise in the data varies with the features. The paper shows through theoretical analysis and simulation experiments that heteroskedasticity can cause ranking bias in uplift models.
   - Specifically, individuals with high treatment effects may be scattered across both extremes of the ranking, with many landing at the bottom rather than at the top, due to higher uncertainty in their outcomes.
2. **Empirical Validation with Real Data**:
   - The paper conducts an empirical analysis using the Criteo online marketing dataset. The results show that the Qini curve of the uplift model (T-learner) exhibits a concave break near position 1, indicating the presence of ranking bias.
   - In contrast, the Qini curve of the outcome model is fully concave, indicating a more effective treatment prioritization.
3. **Evidence of Heteroskedasticity's Impact**:
   - By analyzing different buckets in the test dataset, the paper finds that individuals with high treatment effects also tend to have the highest unexplained outcome variance.
   - This finding is consistent with the theoretical analysis and supports the hypothesis that heteroskedasticity causes ranking bias.

### Conclusion

The paper reveals the potential impact of heteroskedasticity on the ranking of uplift models and provides empirical evidence. This finding poses a challenge to the application of uplift models, especially in scenarios requiring efficient treatment allocation. Future research can explore how to correct this bias to improve the effectiveness of uplift models.

### Significance

- **Theoretical Contribution**: The paper fills a research gap in the field of uplift modeling regarding heteroskedasticity, revealing an important statistical issue.
- **Practical Significance**: For companies relying on uplift models for treatment allocation, this finding alerts them to potential model biases, prompting them to take measures to improve model performance.

### Possible Solutions

- **Model Adjustments**: Develop new methods to handle heteroskedasticity, such as using weighted least squares or other robust statistical methods.
- **Evaluation Metrics**: Introduce new evaluation metrics to detect and quantify ranking bias, ensuring the reliability and effectiveness of the model.

Such measures could reduce the impact of heteroskedasticity on the ranking of uplift models, leading to more effective treatment allocation.
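The findings above are read off Qini curves, so a minimal sketch of how such a curve can be computed may help. It follows one common definition (cumulative treated outcomes minus cumulative control outcomes rescaled by the ratio of group sizes); the paper may use a different variant, and the function name `qini_curve` and its arguments are hypothetical.

```python
import numpy as np

def qini_curve(score, treatment, outcome):
    """Qini values after targeting the top-k individuals, for k = 0..n.

    score:     predicted uplift (higher means treated first)
    treatment: 1 for treated, 0 for control
    outcome:   observed outcome
    """
    order = np.argsort(-np.asarray(score, dtype=float), kind="stable")
    t = np.asarray(treatment, dtype=float)[order]
    y = np.asarray(outcome, dtype=float)[order]

    n_t = np.cumsum(t)              # treated individuals among the top k
    n_c = np.cumsum(1.0 - t)        # control individuals among the top k
    y_t = np.cumsum(y * t)          # summed outcomes of those treated
    y_c = np.cumsum(y * (1.0 - t))  # summed outcomes of those controls

    # Rescale control outcomes to the treated group size; where no control
    # individual has been seen yet, the correction term is set to zero.
    ratio = np.divide(n_t, n_c, out=np.zeros_like(n_t), where=n_c > 0)
    return np.concatenate([[0.0], y_t - y_c * ratio])

if __name__ == "__main__":
    curve = qini_curve(
        score=[4, 3, 2, 1],
        treatment=[1, 0, 1, 0],
        outcome=[1, 0, 0, 1],
    )
    print(curve)
```

Plotting the returned values against the fraction of individuals targeted gives the curve; a ranking that pushes high-effect individuals to the bottom would show gains arriving late, toward the right end of the horizontal axis.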

The impact of heteroskedasticity on uplift modeling
