A CNN-LSTM Architecture for Detection of Intracranial Hemorrhage on CT scans

Nhan T. Nguyen, Dat Q. Tran, Nghia T. Nguyen, Ha Q. Nguyen
DOI: https://doi.org/10.48550/arXiv.2005.10992
2020-06-26
Abstract: We propose a novel method that combines a convolutional neural network (CNN) with a long short-term memory (LSTM) mechanism for accurate prediction of intracranial hemorrhage on computed tomography (CT) scans. The CNN plays the role of a slice-wise feature extractor while the LSTM is responsible for linking the features across slices. The whole architecture is trained end-to-end with input being an RGB-like image formed by stacking 3 different viewing windows of a single slice. We validate the method on the recent RSNA Intracranial Hemorrhage Detection challenge and on the CQ500 dataset. For the RSNA challenge, our best single model achieves a weighted log loss of 0.0522 on the leaderboard, which is comparable to the top 3% performances, almost all of which make use of ensemble learning. Importantly, our method generalizes very well: the model trained on the RSNA dataset significantly outperforms the 2D model, which does not take into account the relationship between slices, on CQ500. Our code and models are publicly available at https://github.com/VinBDI-MedicalImagingTeam/midl2020-cnnlstm-ich.
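The abstract describes the input as an RGB-like image formed by stacking 3 viewing windows of a single slice. A minimal sketch of that idea in NumPy follows. The window names and the (center, width) values in `WINDOWS` are assumptions based on common radiology presets and are not taken from the paper; `apply_window` and `stack_windows` are hypothetical helper names.

```python
import numpy as np

# Hypothetical (center, width) window settings in Hounsfield units (HU).
# These are common presets, NOT values confirmed by the paper.
WINDOWS = {
    "brain": (40, 80),
    "subdural": (80, 200),
    "bone": (600, 2800),
}


def apply_window(hu_slice, center, width):
    """Clip a slice to [center - width/2, center + width/2] and rescale to [0, 1]."""
    low, high = center - width / 2, center + width / 2
    clipped = np.clip(hu_slice, low, high)
    return (clipped - low) / (high - low)


def stack_windows(hu_slice, windows=WINDOWS):
    """Build one channel per window: (H, W) in HU -> (H, W, 3) in [0, 1]."""
    channels = [apply_window(hu_slice, c, w) for c, w in windows.values()]
    return np.stack(channels, axis=-1).astype(np.float32)


# Synthetic slice in HU, used only to show the shapes involved.
rng = np.random.default_rng(0)
fake_slice = rng.integers(-1000, 2000, size=(512, 512)).astype(np.float32)
rgb_like = stack_windows(fake_slice)
print(rgb_like.shape)
```

Each channel shows the same anatomy at a different contrast range, so a backbone that expects 3-channel natural images can consume the result without architectural changes.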
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
This paper addresses the accurate detection of intracranial hemorrhage (ICH) in head CT scans. Specifically, the authors propose a new method that combines a convolutional neural network (CNN) with a long short-term memory (LSTM) network to predict the presence of intracranial hemorrhage and its five subtypes from CT scan images. The problem matters clinically: rapid and accurate diagnosis of brain conditions can improve both the speed and the accuracy of clinical decision-making.

The main challenges mentioned in the paper include:
- **3D data processing**: CT scan data is essentially 3D, consisting of a series of 2D slices. This makes it difficult to directly apply deep-learning techniques developed for natural images, which are usually designed for 2D inputs.
- **Data scarcity**: Annotated medical imaging data is relatively scarce and difficult to obtain, which limits the training of large-scale models.

To overcome these challenges, the proposed method uses a pre-trained CNN to extract features from each slice and an LSTM to capture the spatial dependencies between slices, with the whole architecture trained end-to-end. This approach reuses models pre-trained on existing large-scale image datasets (such as ImageNet), addresses the 3D nature of the data, and improves performance on the intracranial hemorrhage detection task.
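The slice-wise CNN followed by an LSTM across slices can be sketched in PyTorch as below. This is an illustration under stated assumptions, not the authors' implementation: the class name `CnnLstmSketch`, the tiny two-layer CNN (a stand-in for a pre-trained backbone), the layer sizes, and the choice of a bidirectional LSTM are all hypothetical. The 6 outputs per slice correspond to the presence of ICH plus its five subtypes.

```python
import torch
import torch.nn as nn


class CnnLstmSketch(nn.Module):
    """Per-slice CNN features linked across slices by an LSTM (illustrative only)."""

    def __init__(self, feature_dim=64, hidden_dim=32, num_classes=6):
        super().__init__()
        # Stand-in for a pre-trained backbone; sizes are arbitrary assumptions.
        self.cnn = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, stride=2, padding=1),
            nn.ReLU(),
            nn.Conv2d(16, feature_dim, kernel_size=3, stride=2, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
        )
        # Bidirectional is an assumption: each slice sees neighbours on both sides.
        self.lstm = nn.LSTM(feature_dim, hidden_dim,
                            batch_first=True, bidirectional=True)
        self.head = nn.Linear(2 * hidden_dim, num_classes)

    def forward(self, scans):
        # scans: (batch, slices, 3, H, W), one RGB-like image per slice.
        b, s, c, h, w = scans.shape
        feats = self.cnn(scans.reshape(b * s, c, h, w))  # (b * s, feature_dim)
        feats = feats.reshape(b, s, -1)                  # (b, s, feature_dim)
        linked, _ = self.lstm(feats)                     # (b, s, 2 * hidden_dim)
        return self.head(linked)                         # logits: (b, s, num_classes)


model = CnnLstmSketch()
scans = torch.rand(2, 8, 3, 64, 64)  # 2 synthetic scans of 8 slices each
logits = model(scans)
probs = torch.sigmoid(logits)        # independent per-label probabilities
print(probs.shape)
```

Because the CNN and LSTM sit in one module, a single loss on the per-slice outputs back-propagates through both parts, which is what end-to-end training refers to here.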