Abstract:
OBJECTIVE: To study the suitability of cost-sensitive ordinal artificial intelligence-machine learning (AI-ML) strategies in the prognosis of SARS-CoV-2 pneumonia severity.
MATERIALS & METHODS: Observational, retrospective, longitudinal cohort study in 4 hospitals in Spain. Information regarding demographic and clinical status was supplemented by socioeconomic data and air pollution exposures. We proposed AI-ML algorithms for ordinal classification via ordinal decomposition and for cost-sensitive learning via resampling techniques. For performance-based model selection, we defined a custom score including per-class sensitivities and asymmetric misprognosis costs. 260 distinct AI-ML models were evaluated via 10 repetitions of 5×5 nested cross-validation with hyperparameter tuning. Model selection was followed by the calibration of predicted probabilities. Final overall performance was compared against five well-established clinical severity scores and against a 'standard' (non-cost-sensitive, non-ordinal) AI-ML baseline. We also evaluated the explainability of our best model with respect to each of the input variables.
RESULTS: The study enrolled n = 1548 patients: 712 experienced low, 238 medium, and 598 high clinical severity. d = 131 variables were collected, becoming d' = 148 features after categorical encoding. Model selection resulted in our best-performing AI-ML pipeline having: a) no imputation of missing data, b) no feature selection (i.e. using the full set of d' features), c) 'Ordered Partitions' ordinal decomposition, d) cost-based reimbalance, and e) a Histogram-based Gradient Boosting classifier. This best model (calibrated) obtained a median accuracy of 68.1% [67.3%, 68.8%] (95% confidence interval), a balanced accuracy of 57.0% [55.6%, 57.9%], and an overall area under the curve (AUC) of 0.802 [0.795, 0.808]. In our dataset, it outperformed all five clinical severity scores and the 'standard' AI-ML baseline.
DISCUSSION & CONCLUSION: We conducted an exhaustive exploration of AI-ML methods designed for both ordinal and cost-sensitive classification, motivated by a real-world application domain (clinical severity prognosis) in which these topics arise naturally. Our model with the best classification performance successfully exploited the ordering information of ground truth classes, coping with imbalance and asymmetric costs. However, these ordinal and cost-sensitive aspects are seldom explored in the literature.
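
The abstract names three ingredients without giving formulas: an ordinal decomposition into ordered partitions, per-class sensitivities, and asymmetric misprognosis costs. The following is a minimal, dependency-free Python sketch of how such pieces are commonly computed; it is not the authors' implementation. It assumes that the ordered-partitions scheme trains one binary model per threshold to estimate P(y > k) and recovers class probabilities by differencing. The cost matrix `COST`, the toy labels, and all function names are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence


def ordered_partition_probs(p_greater: Sequence[float]) -> List[float]:
    """Combine K-1 binary estimates P(y > k) into K class probabilities."""
    bounds = [1.0] + list(p_greater) + [0.0]
    # P(y = k) = P(y > k-1) - P(y > k). Independently trained binary models
    # need not be monotone, so negative differences are clipped to zero.
    raw = [max(bounds[k] - bounds[k + 1], 0.0) for k in range(len(bounds) - 1)]
    total = sum(raw)  # at least 1.0 whenever every input lies in [0, 1]
    return [r / total for r in raw]


def confusion_matrix(y_true: Sequence[int], y_pred: Sequence[int],
                     n_classes: int) -> List[List[int]]:
    """Rows index the true class, columns the predicted class."""
    cm = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def per_class_sensitivity(cm: List[List[int]]) -> List[float]:
    """Fraction of each true class that was predicted correctly (recall)."""
    return [row[k] / sum(row) if sum(row) else 0.0 for k, row in enumerate(cm)]


def mean_cost(cm: List[List[int]], cost: List[List[float]]) -> float:
    """Average misprognosis cost per patient under a given cost matrix."""
    n = sum(sum(row) for row in cm)
    k = len(cm)
    return sum(cm[i][j] * cost[i][j] for i in range(k) for j in range(k)) / n


# Hypothetical asymmetric costs for classes 0=low, 1=medium, 2=high:
# underestimating severity (below the diagonal) costs more than overestimating.
COST = [[0, 1, 2],
        [2, 0, 1],
        [4, 2, 0]]

# Toy labels, not data from the study.
y_true = [0, 0, 0, 1, 1, 2, 2, 2]
y_pred = [0, 0, 1, 1, 2, 2, 2, 0]

cm = confusion_matrix(y_true, y_pred, 3)
sens = per_class_sensitivity(cm)
balanced_accuracy = sum(sens) / len(sens)  # mean of per-class sensitivities
avg_cost = mean_cost(cm, COST)
probs = ordered_partition_probs([0.7, 0.2])  # P(y > 0) = 0.7, P(y > 1) = 0.2
```

In this toy run the single high-severity patient predicted as low contributes 4 of the 6 total cost units, which illustrates why a cost-aware score can rank models differently from plain accuracy. Balanced accuracy, as reported in the abstract, is the unweighted mean of the per-class sensitivities.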

Survey of the loss function in classification models: Comparative study in healthcare and medicine

Influence of cost/loss functions on classification rate: A comparative study across diverse classifiers and domains

A Comprehensive Survey of Loss Functions in Machine Learning

Examining different cost ratio frameworks for decision rule machine learning algorithms in diagnostic application

Cost-sensitive performance metric for comparing multiple ordinal classifiers

A survey and taxonomy of loss functions in machine learning

Using random forest for reliable classification and cost-sensitive learning for medical diagnosis

On Loss Functions for Deep Neural Networks in Classification

Decision Curve Analysis: a Technical Note

To do or not to do: cost-sensitive causal decision-making

A study on cost behaviors of binary classification measures in class-imbalanced problems

Building a challenging medical dataset for comparative evaluation of classifier capabilities

Cost-sensitive ordinal classification methods to predict SARS-CoV-2 pneumonia severity

Evaluating Binary Outcome Classifiers Estimated from Survey Data

Optimization of Selective Ensemble for Cost-Sensitive Classification: an Empirical Study

On the Rates of Convergence from Surrogate Risk Minimizers to the Bayes Optimal Classifier

A comparative evaluation of the generalised predictive ability of eight machine learning algorithms across ten clinical metabolomics data sets for binary classification

Performance Evaluation of Regression Models in Predicting the Cost of Medical Insurance

A comparison of methods for model selection when estimating individual treatment effects

Optimal Credit Scorecard Model Selection Using Costs Arising from Both False Positives and False Negatives

Addressing Binary Classification over Class Imbalanced Clinical Datasets Using Computationally Intelligent Techniques