Abstract:

Objective: The prediction of sepsis, especially early diagnosis, has received significant attention in biomedical research. To improve current medical scoring systems and overcome the class imbalance and limited sample size of local EHR (electronic health records), we propose a novel knowledge-transfer-based approach that combines a medical scoring system with an ordinal logistic regression model.

Materials and methods: Medical scoring systems (i.e. NEWS, SIRS and QSOFA) are generally robust and useful for sepsis diagnosis. With local EHR, machine-learning-based methods have been widely used for building prediction models, but they are often affected by class imbalance and limited sample size. Knowledge distillation and knowledge transfer have recently been proposed as combination approaches for improving prediction performance and model generalization. In this study, we developed a novel knowledge-transfer-based method that combines a medical scoring system (after a proposed score transformation) with an ordinal logistic regression model. We showed mathematically that the method is equivalent to a specific form of weighted regression. Furthermore, we theoretically explored its effectiveness under class imbalance.

Results: For the local dataset and the MIMIC-IV dataset, the VUS (the volume under the multi-dimensional ROC surface, a generalization of AUC-ROC to ordinal categories) of the knowledge-transfer-based model built on the NEWS scoring system (ORNEWS) was 0.384 and 0.339, respectively, while the VUS of the traditional ordinal regression model (OR) was 0.352 and 0.322, respectively. Consistent results were also observed for the knowledge-transfer-based models built on the SIRS and QSOFA scoring systems in the ordinal scenarios. Additionally, the predicted probabilities and the binary-classification ROC curves of the knowledge-transfer-based models indicated that this approach raised the predicted probabilities for the minority classes while lowering those for the majority classes, which improved AUCs and VUSs on imbalanced data.

Discussion: Knowledge transfer, which combines a medical scoring system with a machine-learning-based model, improves prediction performance for the early diagnosis of sepsis, especially under class imbalance and limited sample size.
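To make the VUS figures above easier to interpret, the following is a minimal sketch of an empirical VUS computation, not the authors' implementation. It assumes three ordered classes and a single scalar score per sample; the function name `empirical_vus` and the example scores are hypothetical. Under these assumptions, VUS is the fraction of tuples (one sample drawn from each class, in class order) whose scores are strictly increasing, so a random scorer on three classes would be expected to reach about 1/6.

```python
from itertools import product


def empirical_vus(scores_by_class):
    """Empirical volume under the ROC surface for ordered classes.

    scores_by_class: list of score lists, one per class, ordered from the
    lowest to the highest category. Returns the fraction of tuples (one
    score per class) that are strictly increasing. Ties count as incorrect
    in this sketch.
    """
    total = 0
    ordered = 0
    for combo in product(*scores_by_class):
        total += 1
        if all(a < b for a, b in zip(combo, combo[1:])):
            ordered += 1
    return ordered / total


# Hypothetical scores for three ordered classes (illustration only).
low = [0.1, 0.2, 0.4]
mid = [0.3, 0.5]
high = [0.6, 0.9]

vus = empirical_vus([low, mid, high])
print(f"Empirical VUS: {vus:.3f}")
```

With two classes the same function reduces to the usual empirical AUC (ignoring ties), which is why VUS is described as its ordinal generalization. The brute-force enumeration grows with the product of the class sizes, so it is only practical for small samples.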

Semi-supervised Transfer Learning for Evaluation of Model Classification Performance

Semisupervised transfer learning for evaluation of model classification performance

Semi-supervised Optimal Transport with Self-paced Ensemble for Cross-hospital Sepsis Early Detection

Semi-Supervised Approaches to Efficient Evaluation of Model Prediction Performance

Doubly Robust Augmented Model Accuracy Transfer Inference with High Dimensional Features

Semi-Supervised Triply Robust Inductive Transfer Learning

Online Transfer Learning for RSV Case Detection

Transport-based transfer learning on Electronic Health Records: Application to detection of treatment disparities

Doubly Supervised Transfer Classifier for Computer-Aided Diagnosis with Imbalanced Modalities

Efficient Evaluation of Prediction Rules in Semi-Supervised Settings under Stratified Sampling

Efficient Estimation and Evaluation of Prediction Rules in Semi-Supervised Settings under Stratified Sampling

Semi-supervised ROC analysis for reliable and streamlined evaluation of phenotyping algorithms

A hybrid adaptive approach for instance transfer learning with dynamic and imbalanced data

Research on Osteoporosis Risk Assessment Based on Semi-supervised Machine Learning

A General Fine-tuned Transfer Learning Model for Predicting Clinical Task Acrossing Diverse EHRs Datasets

Transfer Learning under the Cox Model with Interval‐censored Data

Transfer or Self-Supervised? Bridging the Performance Gap in Medical Imaging

A knowledge-transfer-based approach for combining ordinal regression and medical scoring system in the early prediction of sepsis with electronic health records

Improving an Electronic Health Record–Based Clinical Prediction Model Under Label Deficiency: Network-Based Generative Adversarial Semisupervised Approach

A Semiparametric Approach for Robust and Efficient Learning with Biobank Data

A Comparison of Self-Supervised Pretraining Approaches for Predicting Disease Risk from Chest Radiograph Images