Abstract: Intelligent computer systems aim to help humans make decisions. Many practical decision-making problems are classification problems by nature, but standard classification algorithms are often not applicable because they assume a balanced class distribution and constant misclassification costs. From this point of view, algorithms that consider the cost of decisions are essential, since they are more consistent with real-life requirements. These algorithms generate decisions that directly optimize parameters valuable for business, for example, cost savings. Despite the practical value of cost-sensitive algorithms, only a few works study this problem, and they concentrate mainly on the case where the cost of a classifier error is constant and does not depend on the specific example. However, many real-world classification tasks are example-dependent cost-sensitive (ECS): the costs of misclassification vary between examples and not only between classes. Existing methods of ECS learning are limited to modifications of the simplest machine learning models (naive Bayes, logistic regression, decision tree). These models produce promising results, but there is a need for further improvement in performance, which can be achieved by using gradient-based ensemble methods. To close this gap, we present an ECS generalization of AdaBoost. We study three models that differ in how cost is introduced into the loss function: inside the exponent, outside the exponent, and both inside and outside the exponent. The results of experiments on three synthetic and two real datasets (bank marketing and insurance fraud) show that example-dependent cost-sensitive modifications of AdaBoost outperform other known models.
Empirical results also show that the critical factors influencing the choice of model include not only the distribution of features, as is typical for cost-insensitive and class-dependent cost-sensitive problems, but also the distribution of costs. Next, since the outputs of AdaBoost are not well-calibrated posterior probabilities, we examine three approaches to calibrating classifier scores: Platt scaling, isotonic regression, and ROC modification. The results show that calibration not only significantly improves the performance of the specific ECS models but also improves the performance of the original AdaBoost. The obtained results provide new theoretical insight into the behavior of cost-sensitive models and demonstrate that the presented approach can significantly improve the practical design of intelligent systems.
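
The three ways of introducing cost into the exponential loss can be illustrated with a minimal sketch. The exact formulas are not given in the abstract, so the forms below are an assumed reading: labels y are in {-1, +1}, f is the real-valued ensemble score, and c is a positive per-example cost. The function name `ecs_exponential_loss` and the variant labels are hypothetical and chosen only for illustration.

```python
import math


def ecs_exponential_loss(labels, scores, costs, variant):
    """Sum an example-dependent cost-sensitive exponential loss.

    Assumed forms (an illustration, not the paper's confirmed definitions):
      "inside":  exp(-c * y * f)      cost scales the margin
      "outside": c * exp(-y * f)      cost scales the penalty
      "both":    c * exp(-c * y * f)  cost appears in both places
    """
    total = 0.0
    for y, f, c in zip(labels, scores, costs):
        margin = y * f  # positive when the sign of f agrees with y
        if variant == "inside":
            total += math.exp(-c * margin)
        elif variant == "outside":
            total += c * math.exp(-margin)
        elif variant == "both":
            total += c * math.exp(-c * margin)
        else:
            raise ValueError(f"unknown variant: {variant!r}")
    return total


# Toy data: the second example is misclassified and carries the highest cost.
labels = [1, -1, 1]
scores = [0.8, 0.3, -0.1]
costs = [1.0, 5.0, 0.5]

for variant in ("inside", "outside", "both"):
    print(variant, ecs_exponential_loss(labels, scores, costs, variant))
```

With all costs equal to 1, the three variants coincide with the standard AdaBoost exponential loss, which is one way to check that each is a generalization of the cost-insensitive case.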
