Automatic Detection of Spine Region Using Multiple Pseudo 3D U-Net Models with Weighted Average Voting and Attention Mechanisms

Kai Yang, Masayuki Kikuchi
DOI: https://doi.org/10.18178/joig.12.2.152-157
2024-01-01
Journal of Image and Graphics
Abstract: The field of CT imaging has seen significant advances, but extracting precise information from complex image data remains challenging. This study focuses on automating the extraction of the spine region from CT images. We adopt the U-Net architecture and apply a multi-scale blurring technique to the data to obtain a multi-resolution representation. This representation is designed to capture information at several granularities, from fine detail to broader structures. After the multi-step blur, we compute the difference between adjacent blurred images to exploit how the image changes from one resolution to the next. Although feeding the blurred results directly into the U-Net model may yield satisfactory results, computing the differences between blurred images emphasizes the nuances of these changes. To further improve accuracy, we use ensemble learning, taking weights from the training processes of multiple models and using them to average the models' outputs during prediction. With this approach, we achieved a Dice score of 96.8% and improved the accuracy of CT image extraction.
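The preprocessing and ensembling steps described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not name the blur type, so a Gaussian blur is assumed; the function names (`gaussian_kernel`, `blur2d`, `blur_differences`, `weighted_average_vote`, `dice_score`), the sigma values, and the ensemble weights are all hypothetical; and the U-Net models themselves are omitted, with their outputs stood in for by plain probability maps.

```python
import numpy as np


def gaussian_kernel(sigma):
    """1D Gaussian kernel truncated at about 3 sigma, normalized to sum to 1."""
    radius = int(3 * sigma + 0.5)
    x = np.arange(-radius, radius + 1, dtype=np.float64)
    kernel = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    return kernel / kernel.sum()


def blur2d(image, sigma):
    """Separable Gaussian blur of a 2D slice (assumed blur type)."""
    kernel = gaussian_kernel(sigma)
    pad = len(kernel) // 2
    padded = np.pad(image.astype(np.float64), pad, mode="edge")
    # Convolve along rows, then along columns; "valid" restores the input size.
    rows = np.apply_along_axis(
        lambda r: np.convolve(r, kernel, mode="valid"), 1, padded)
    return np.apply_along_axis(
        lambda c: np.convolve(c, kernel, mode="valid"), 0, rows)


def blur_differences(image, sigmas):
    """Blur at each scale, then subtract adjacent blurred images.

    Returns an array of shape (len(sigmas) - 1, H, W) that could serve as
    multi-channel input to a segmentation model.
    """
    blurred = [blur2d(image, s) for s in sigmas]
    diffs = [blurred[i] - blurred[i + 1] for i in range(len(blurred) - 1)]
    return np.stack(diffs, axis=0)


def weighted_average_vote(prob_maps, weights, threshold=0.5):
    """Average per-model probability maps with normalized weights, then threshold."""
    w = np.asarray(weights, dtype=np.float64)
    w = w / w.sum()
    averaged = np.tensordot(w, np.stack(prob_maps, axis=0), axes=1)
    return averaged >= threshold


def dice_score(pred, target):
    """Dice coefficient 2|A and B| / (|A| + |B|) for two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    total = pred.sum() + target.sum()
    if total == 0:
        return 1.0  # both masks empty: treat as perfect agreement
    return 2.0 * np.logical_and(pred, target).sum() / total
```

In use, `blur_differences(ct_slice, [1.0, 2.0, 4.0])` would give two difference channels for one slice, and `weighted_average_vote` would combine the probability maps predicted by several trained models. How the weights are derived from training is not specified in the abstract; a per-model validation score is one plausible choice.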