GEL-TTA Net: a Global ensemble learning network for the localization of small-scale and mixed intracranial hemorrhages through test time augmentations

DOI: https://doi.org/10.1007/s11042-024-19393-4
IF: 2.577
2024-06-08
Multimedia Tools and Applications
Abstract: State-of-the-art deep learning models can accurately perform multi-class classification of intracranial hemorrhages (ICH). However, two main challenges have not yet been addressed: the localization of multiple hemorrhages and the visualization of small-scale or subtle hemorrhages. This study proposes an optimal object detection framework to solve both issues. First, a YOLOv5l architecture was used as model 1 to localize multiple hemorrhages. Second, a vision transformer (ViT) module based on multi-head self-attention (MHSA) was used in YOLOv5x as model 2 to visualize small-scale ICH. The main advantage of the transformer module is that it performs dense prediction tasks using query, key, and value information. To achieve both objectives in a single network, the two proposed models were ensembled using a non-max-suppression (NMS) algorithm. Furthermore, test time augmentation (TTA) was used in the proposed model (GEL-TTA Net) to improve the test-time results. To improve the quality of predictions, feature maps were pooled at various scales in the YOLO backbone using a spatial pyramid pooling-faster (SPPF) module, whereas a path-aggregated network (PANet) was used as the neck to retain spatial information. The proposed model was trained and validated using the brain hemorrhage extended (BHX) dataset, and testing was conducted using separate segmentation data. The experimental results show that the proposed model outperformed the existing model (YOLOv4) in precision by 1.6%, recall by 12.4%, F1 score by 7.5%, and mean average precision (mAP@.5) by 16%. The proposed model achieved an overall precision, recall, F1 score, and mAP@.5 of 93.6%, 93.4%, 93.5%, and 95.6%, respectively, during training and 85.7%, 84.9%, 85.3%, and 90.3% during validation. GEL-TTA Net gave the best results in the validation and testing phases; its only limitation was an increase in prediction time.
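The abstract states that the two detectors are ensembled with NMS but gives no implementation detail. The following is a minimal sketch of one common way to do this: pool the boxes predicted by both models and run greedy, class-aware NMS over the pooled set. The box layout `(x1, y1, x2, y2, score, class_id)`, the function names `iou` and `nms_ensemble`, the IoU threshold of 0.5, and the class-aware matching are all assumptions made for illustration, not details taken from the paper.

```python
from typing import List, Tuple

# Hypothetical box layout: (x1, y1, x2, y2, score, class_id)
Box = Tuple[float, float, float, float, float, int]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms_ensemble(dets_a: List[Box], dets_b: List[Box],
                 iou_thr: float = 0.5) -> List[Box]:
    """Pool detections from two models and apply greedy class-aware NMS.

    iou_thr is an assumed value; the abstract does not report one.
    """
    # Visit boxes from highest to lowest confidence.
    pooled = sorted(dets_a + dets_b, key=lambda d: d[4], reverse=True)
    kept: List[Box] = []
    for det in pooled:
        # Keep a box unless it overlaps an already kept box of the same class.
        if all(det[5] != k[5] or iou(det, k) <= iou_thr for k in kept):
            kept.append(det)
    return kept


# Toy detections standing in for the outputs of model 1 and model 2.
model_1 = [(10.0, 10.0, 50.0, 50.0, 0.90, 0)]
model_2 = [(12.0, 12.0, 52.0, 52.0, 0.80, 0),
           (100.0, 100.0, 110.0, 110.0, 0.70, 1)]
merged = nms_ensemble(model_1, model_2)
```

In this toy case the two overlapping class-0 boxes collapse into the higher-scoring one, while the small class-1 box found only by the second model survives, which is the behaviour an ensemble of a multi-lesion detector and a small-lesion detector would rely on.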
computer science, information systems, theory & methods, engineering, electrical & electronic, software engineering
What problem does this paper attempt to address?