Deep Active Learning for Interictal Ictal Injury Continuum EEG Patterns.
Wendong Ge, Jin Jing, Sungtae An, Aline Herlopian, Marcus Ng, Aaron F. Struck, Brian Appavu, Emily L. Johnson, Gamaleldin Osman, Hiba A. Haider, Ioannis Karakis, Jennifer A. Kim, Jonathan J. Halford, Monica B. Dhakar, Rani A. Sarkis, Christa B. Swisher, Sarah Schmitt, Jong Woo Lee, Mohammad Tabaeizadeh, Andres Rodriguez, Nicolas Gaspard, Emily Gilmore, Susan T. Herman, Peter W. Kaplan, Jay Pathmanathan, Shenda Hong, Eric S. Rosenthal, Sahar Zafar, Jimeng Sun, M. Brandon Westover
DOI: https://doi.org/10.1016/j.jneumeth.2020.108966
IF: 2.987
2021-01-01
Journal of Neuroscience Methods
Abstract: Objectives: Seizures and seizure-like electroencephalography (EEG) patterns, collectively referred to as "ictal interictal injury continuum" (IIIC) patterns, are commonly encountered in critically ill patients. Automated detection is important for patient care and to enable research. However, training accurate detectors requires a large labeled dataset. Active Learning (AL) may help select informative examples to label, but the optimal AL approach remains unclear. Methods: We assembled >200,000 h of EEG from 1,454 hospitalized patients. From these, we collected 9,808 labeled and 120,000 unlabeled 10-second EEG segments. Labels included 6 IIIC patterns. In each AL iteration, a Dense-Net Convolutional Neural Network (CNN) learned vector representations for EEG segments using the available labels, and these representations were used to create a 2D embedding map. Nearest-neighbor label spreading within the embedding map was used to create additional pseudo-labeled data. A second Dense-Net was trained using both real and pseudo-labels. We evaluated several strategies for selecting candidate points for experts to label next. Finally, we compared two methods for class balancing within queries: standard balanced-based querying (SBBQ) and high confidence spread-based balanced querying (HCSBBQ). Results: Our results show: 1) Label spreading increased convergence speed for AL. 2) All query criteria produced results similar to random sampling. 3) HCSBBQ query balancing performed best. Using label spreading and HCSBBQ query balancing, we were able to train models approaching expert-level performance across all pattern categories after obtaining approximately 7,000 expert labels. Conclusion: Our results provide guidance regarding the use of AL to efficiently label large EEG datasets in critically ill patients.
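The label-spreading and balanced-query steps described in the Methods can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `spread_labels` and `balanced_query`, the neighbor count `k`, the `min_agreement` threshold, and the use of plain Euclidean distance on 2D points are all assumptions made for illustration. The abstract does not specify how HCSBBQ is defined, so the query function below is only a generic class-balanced selection driven by pseudo-labels.

```python
import math
import random
from collections import Counter


def spread_labels(points, labels, k=3, min_agreement=0.8):
    """Assign pseudo-labels to unlabeled points in a 2D embedding.

    points: list of (x, y) embedding coordinates.
    labels: list of class labels, with None for unlabeled points.
    A point gets a pseudo-label only if at least `min_agreement` of its
    k nearest labeled neighbors share one class (hypothetical threshold).
    """
    labeled = [i for i, y in enumerate(labels) if y is not None]
    pseudo = list(labels)
    if not labeled:
        return pseudo
    k = min(k, len(labeled))
    for i, y in enumerate(labels):
        if y is not None:
            continue  # expert labels are kept unchanged
        # k nearest labeled neighbors by Euclidean distance
        nearest = sorted(labeled, key=lambda j: math.dist(points[i], points[j]))[:k]
        top_label, top_votes = Counter(labels[j] for j in nearest).most_common(1)[0]
        if top_votes / k >= min_agreement:
            pseudo[i] = top_label
    return pseudo


def balanced_query(pseudo, labels, per_class, rng):
    """Pick up to `per_class` unlabeled points from each pseudo-class."""
    pools = {}
    for i, (p, y) in enumerate(zip(pseudo, labels)):
        if y is None and p is not None:
            pools.setdefault(p, []).append(i)
    chosen = []
    for c in sorted(pools):
        pool = pools[c]
        chosen.extend(rng.sample(pool, min(per_class, len(pool))))
    return chosen


# Toy embedding: two well-separated clusters, one unlabeled point in each.
points = [(0, 0), (0.1, 0), (0, 0.1), (5, 5), (5.1, 5), (5, 5.1),
          (0.05, 0.05), (5.05, 5.05)]
labels = [0, 0, 0, 1, 1, 1, None, None]
pseudo = spread_labels(points, labels)
query = balanced_query(pseudo, labels, per_class=1, rng=random.Random(0))
```

In this toy example each unlabeled point inherits the class of its cluster, and the query returns one candidate per class, so rare classes are not crowded out of the next labeling round.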