Multimodal deep learning approach to predicting neurological recovery from coma after cardiac arrest

Felix H. Krones, Ben Walker, Guy Parsons, Terry Lyons, Adam Mahdi
2024-03-10
Abstract: This work showcases our team's (The BEEGees) contributions to the 2023 George B. Moody PhysioNet Challenge. The aim was to predict neurological recovery from coma following cardiac arrest using clinical data and time series such as multi-channel EEG and ECG signals. Our modelling approach is multimodal, based on two-dimensional spectrogram representations derived from numerous EEG channels, alongside the integration of clinical data and features extracted directly from EEG recordings. Our submitted model achieved a Challenge score of $0.53$ on the hidden test set for predictions made $72$ hours after return of spontaneous circulation. Our study shows the efficacy and limitations of employing transfer learning in medical classification. With regard to prospective implementation, our analysis reveals that the performance of the model is strongly linked to the selection of a decision threshold and exhibits strong variability across data splits.
Machine Learning, Signal Processing
What problem does this paper attempt to address?
This paper addresses how multimodal deep learning can be used to predict neurological recovery in patients who remain comatose after cardiac arrest. The research team (The BEEGees) took part in the 2023 George B. Moody PhysioNet Challenge, whose goal was to classify a patient's outcome as "poor" or "good" from clinical data and time series data such as multi-channel EEG and ECG signals. Their approach combines two kinds of input: 2D spectrograms generated from multi-channel EEG signals, and clinical data together with features extracted directly from the EEG recordings. Using 607 training cases, they built six models (M1 to M6), of which M5 achieved the highest Challenge score, 0.53, on the hidden test set. The paper stresses that model performance depends strongly on the choice of decision threshold and varies considerably across data splits. Although the multimodal models (M2 to M6) perform well under local cross-validation (CV AUC > 0.81), they generalize less well, and further research is needed. The paper also reviews prior work, which mainly relied on CNNs trained on a small number of cases, and argues that feature engineering and data preprocessing are important for improving model performance. Possible future directions include comparing different modelling methods, exploring self-supervised pre-training, and integrating other time series data, such as ECG.
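To make the two technical points above more concrete, the following is a minimal sketch, not the authors' pipeline. `eeg_to_spectrograms` turns a multi-channel EEG array into log-power spectrograms using a plain NumPy short-time Fourier transform; the window length, overlap, Hann window, and log scaling are illustrative assumptions. `tpr_at_fpr` illustrates threshold sensitivity by assuming a metric of the form "largest true-positive rate at a capped false-positive rate"; the text above does not define the Challenge score, so this form, the function names, and all parameter values are hypothetical.

```python
import numpy as np


def eeg_to_spectrograms(eeg, fs, win_sec=2.0, overlap=0.5):
    """Convert a (channels, samples) EEG array into log-power spectrograms.

    Returns an array of shape (channels, freq_bins, time_frames).
    Window length and overlap are illustrative defaults, not the paper's.
    """
    n_win = int(win_sec * fs)
    hop = max(1, int(n_win * (1.0 - overlap)))
    n_frames = 1 + (eeg.shape[1] - n_win) // hop
    if n_frames < 1:
        raise ValueError("recording is shorter than one analysis window")
    window = np.hanning(n_win)
    # Index matrix of shape (frames, n_win) selecting overlapping segments.
    idx = np.arange(n_win)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = eeg[:, idx] * window  # (channels, frames, n_win)
    power = np.abs(np.fft.rfft(frames, axis=-1)) ** 2
    # Log scaling compresses the dynamic range; the epsilon avoids log(0).
    return np.log10(power + 1e-12).transpose(0, 2, 1)


def tpr_at_fpr(labels, scores, max_fpr=0.05):
    """Largest true-positive rate over thresholds with FPR <= max_fpr.

    Assumed metric form for illustration; labels are 1 for the positive class.
    """
    labels = np.asarray(labels, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    n_pos = max(1, int(labels.sum()))
    n_neg = max(1, int((~labels).sum()))
    best = 0.0
    for threshold in np.unique(scores):
        pred = scores >= threshold
        fpr = np.sum(pred & ~labels) / n_neg
        tpr = np.sum(pred & labels) / n_pos
        if fpr <= max_fpr:
            best = max(best, float(tpr))
    return best


if __name__ == "__main__":
    fs = 100  # hypothetical sampling rate in Hz
    t = np.arange(10 * fs) / fs
    eeg = np.stack([np.sin(2 * np.pi * 10 * t), np.sin(2 * np.pi * 25 * t)])
    print(eeg_to_spectrograms(eeg, fs).shape)
```

The resulting spectrogram stack can be treated as a multi-channel image and passed to an image model, which is one way a transfer learning setup of the kind the abstract mentions could consume EEG. Sweeping `max_fpr` in `tpr_at_fpr` shows how much a threshold-dependent score can move for the same set of predictions.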