Uncertainty-Aware Deep Ensembles for Reliable and Explainable Predictions of Clinical Time Series

Kristoffer Wickstrøm, Karl Øyvind Mikalsen, Michael Kampffmeyer, Arthur Revhaug, Robert Jenssen
DOI: https://doi.org/10.1109/JBHI.2020.3042637
2020-10-16
Abstract: Deep learning-based support systems have demonstrated encouraging results in numerous clinical applications involving the processing of time series data. While such systems often are very accurate, they have no inherent mechanism for explaining what influenced the predictions, which is critical for clinical tasks. However, existing explainability techniques lack an important component for trustworthy and reliable decision support, namely a notion of uncertainty. In this paper, we address this lack of uncertainty by proposing a deep ensemble approach where a collection of DNNs are trained independently. A measure of uncertainty in the relevance scores is computed by taking the standard deviation across the relevance scores produced by each model in the ensemble, which in turn is used to make the explanations more reliable. The class activation mapping method is used to assign a relevance score for each time step in the time series. Results demonstrate that the proposed ensemble is more accurate in locating relevant time steps and is more consistent across random initializations, thus making the model more trustworthy. The proposed methodology paves the way for constructing trustworthy and dependable support systems for processing clinical time series for healthcare related tasks.
Machine Learning, Signal Processing
What problem does this paper attempt to address?
The paper addresses how to bring uncertainty estimation into deep learning predictions on clinical time series, so that the explanations a model gives become more reliable and interpretable. Specifically:

1. **Lack of interpretability**: Although deep neural networks (DNNs) perform well in many clinical applications, they are black-box models and cannot explain which factors influenced a prediction.
2. **Lack of uncertainty estimation**: Existing explanation methods usually carry no measure of uncertainty, which makes DNN-based support systems less trustworthy in medical decision-making.

To address these problems, the authors propose a deep ensemble approach: multiple DNNs are trained independently, and the standard deviation of the relevance scores produced by the ensemble members serves as a measure of uncertainty. The reported results indicate that the ensemble locates relevant time steps more accurately and is more consistent across random initializations, which makes the system more trustworthy and dependable.

### Specific problem description

- **Interpretability problem**: How can the predictions of DNNs be explained when processing clinical time series?
- **Uncertainty problem**: How can the uncertainty in these explanations be measured and expressed?

### Solutions

- **Deep ensemble method**: Train multiple independent DNNs; each model produces a prediction and a set of relevance scores.
- **Uncertainty estimation**: Measure uncertainty as the standard deviation of the relevance scores across the models in the ensemble.
- **Class activation mapping (CAM)**: Assigns a relevance score to each time step, which helps identify the important time steps.

### Verification and application

- **Synthetic data experiment**: Evaluates the model on data where the relevant time steps are known.
- **Clinical task application**: Applied to myocardial infarction detection in electrocardiograms (ECG) and to detection of surgical site infection (SSI) from blood measurements.

Through these methods, the authors aim to construct a more trustworthy deep learning system that supports medical decision-making.
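
As a rough illustration of the ensemble idea summarized above, the following NumPy sketch computes a CAM relevance score per time step for each model and then takes the mean and standard deviation across models. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, array shapes, and the use of random numbers in place of trained networks are all hypothetical. In particular, `mask_uncertain` shows only one conceivable way to use the standard deviation (zeroing time steps where the models disagree); the summary does not say how the paper applies it.

```python
import numpy as np


def cam_relevance(feature_maps: np.ndarray, class_weights: np.ndarray) -> np.ndarray:
    """Class activation map for one model on one time series.

    feature_maps:  (T, K) activations of the last convolutional layer.
    class_weights: (K,) output-layer weights for the class of interest.
    Returns a (T,) array with one relevance score per time step.
    """
    return feature_maps @ class_weights


def ensemble_relevance(per_model_scores):
    """Mean and standard deviation of relevance scores across ensemble members."""
    scores = np.stack(per_model_scores)  # shape (M, T)
    return scores.mean(axis=0), scores.std(axis=0)


def mask_uncertain(mean_scores, std_scores, threshold):
    """Hypothetical post-processing: zero out time steps with high disagreement."""
    return np.where(std_scores <= threshold, mean_scores, 0.0)


# Toy stand-in for M independently trained models (random numbers, not real DNNs).
rng = np.random.default_rng(0)
M, T, K = 5, 100, 16
per_model = [
    cam_relevance(rng.normal(size=(T, K)), rng.normal(size=K)) for _ in range(M)
]

mean_scores, std_scores = ensemble_relevance(per_model)
reliable = mask_uncertain(mean_scores, std_scores, threshold=np.median(std_scores))
print(mean_scores.shape, std_scores.shape, reliable.shape)
```

In a real setting, `feature_maps` and `class_weights` would come from each trained network (last convolutional layer followed by global average pooling and a linear output layer, which is the architecture CAM assumes), and the threshold would need to be chosen for the task at hand.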