FUSQA: Fetal Ultrasound Segmentation Quality Assessment

Sevim Cengiz, Ibrahim Almakky, Mohammad Yaqub
2023-08-15
Abstract: Deep learning models have been effective for various fetal ultrasound segmentation tasks. However, generalization to new unseen data has raised questions about their effectiveness for clinical adoption. Normally, a transition to new unseen data requires time-consuming and costly quality assurance processes to validate the segmentation performance post-transition. Segmentation quality assessment efforts have focused on natural images, where the problem has been typically formulated as a Dice score regression task. In this paper, we propose a simplified Fetal Ultrasound Segmentation Quality Assessment (FUSQA) model to tackle the segmentation quality assessment when no masks exist to compare with. We formulate the segmentation quality assessment process as an automated classification task to distinguish between good and poor-quality segmentation masks for more accurate gestational age estimation. We validate the performance of our proposed approach on two datasets we collect from two hospitals using different ultrasound machines. We compare different architectures, with our best-performing architecture achieving over 90% classification accuracy on distinguishing between good and poor-quality segmentation masks from an unseen dataset. Additionally, there was only a 1.45-day difference between the gestational age reported by doctors and estimated based on CRL measurements using well-segmented masks. On the other hand, this difference increased and reached up to 7.73 days when we calculated CRL from the poorly segmented masks. As a result, AI-based approaches can potentially aid fetal ultrasound segmentation quality assessment and might detect poor segmentation in real-time screening in the future.
Image and Video Processing, Computer Vision and Pattern Recognition, Machine Learning
What problem does this paper attempt to address?
The paper addresses the challenge of ensuring the reliability and accuracy of deep learning models used for fetal ultrasound segmentation, particularly when these models are applied to new, unseen data. The authors highlight that while deep learning models have shown promise in fetal ultrasound segmentation tasks, their effectiveness can be compromised when transitioning to new datasets, requiring costly and time-consuming quality assurance processes.

To tackle this issue, the paper proposes a Fetal Ultrasound Segmentation Quality Assessment (FUSQA) model. This model automates the classification of segmentation masks into good or poor quality, thereby enabling a more efficient and accurate assessment of fetal ultrasound segmentation quality. Specifically, the FUSQA model aims to:

1. **Automate Quality Assessment**: Develop an automated method to assess the quality of segmentation masks produced by deep learning models, without the need for manual annotation.
2. **Improve Clinical Outcomes**: Ensure that only high-quality segmentation masks are used for critical clinical tasks, such as gestational age estimation based on the crown-rump length (CRL).
3. **Simplify Transferability**: Create a model that can easily transfer its performance to new, unseen datasets, reducing the need for extensive retraining or manual checks.

The FUSQA model is evaluated on two datasets collected from different hospitals, demonstrating its effectiveness in distinguishing between good and poor-quality segmentation masks.
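
The abstract contrasts the usual Dice score regression formulation with the good/poor classification used by FUSQA. As a rough illustration of how the two relate, the sketch below computes the standard Dice coefficient between a predicted mask and a reference mask and then collapses it into a binary label. The 0.8 threshold and the function names `dice_score` and `quality_label` are assumptions made for this example; the source does not state how its good and poor labels were defined. A reference mask is only needed to build such labels for training, since the model is meant to judge quality when no mask exists to compare with.

```python
from typing import Sequence

# A binary mask as rows of 0/1 values (hypothetical representation for this sketch).
Mask = Sequence[Sequence[int]]


def dice_score(pred: Mask, ref: Mask) -> float:
    """Dice coefficient 2|A ∩ B| / (|A| + |B|) for two binary masks of equal shape."""
    if len(pred) != len(ref) or any(len(p) != len(r) for p, r in zip(pred, ref)):
        raise ValueError("masks must have the same shape")
    # Count pixels that are foreground in both masks.
    intersection = sum(
        1
        for p_row, r_row in zip(pred, ref)
        for p, r in zip(p_row, r_row)
        if p and r
    )
    # Count foreground pixels in each mask separately.
    total = sum(1 for row in pred for v in row if v) + sum(
        1 for row in ref for v in row if v
    )
    if total == 0:
        return 1.0  # convention chosen here: two empty masks agree perfectly
    return 2.0 * intersection / total


def quality_label(dice: float, threshold: float = 0.8) -> str:
    """Collapse a Dice score into a binary label; the threshold is an assumed value."""
    return "good" if dice >= threshold else "poor"


if __name__ == "__main__":
    ref = [[0, 1, 1], [0, 1, 1]]
    close = [[0, 1, 1], [0, 1, 0]]  # misses one foreground pixel
    off = [[1, 0, 0], [1, 0, 0]]    # no overlap with the reference
    for name, mask in (("close", close), ("off", off)):
        d = dice_score(mask, ref)
        print(name, round(d, 3), quality_label(d))
```

A classifier trained on labels of this kind would take the image and its predicted mask as input and output only the good/poor label, which is the framing the abstract describes.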