Echocardiographic Image Multi-Structure Segmentation Using Cardiac-SegNet.

Yang Lei, Yabo Fu, Justin Roper, Kristin Higgins, Jeffrey D. Bradley, Walter J. Curran, Tian Liu, Xiaofeng Yang
DOI: https://doi.org/10.1002/mp.14818
IF: 4.506
2021-01-01
Medical Physics
Abstract:
Purpose: Cardiac boundary segmentation of echocardiographic images is important for cardiac function assessment and disease diagnosis. However, it is challenging to segment cardiac ventricles due to the low contrast‐to‐noise ratio and speckle noise of echocardiographic images. Manual segmentation is subject to interobserver variability and is too slow for real‐time image‐guided interventions. We aim to develop a deep learning‐based method for automated multi‐structure segmentation of echocardiographic images.
Methods: We developed an anchor‐free mask convolutional neural network (CNN), termed Cardiac‐SegNet, which consists of three subnetworks: a backbone, a fully convolutional one‐stage object detector (FCOS) head, and a mask head. The backbone extracts multi‐level and multi‐scale features from the input echocardiographic image. The FCOS head uses these features to detect and label the regions of interest (ROIs) of the segmentation targets. Unlike the traditional mask regional CNN (Mask R‐CNN) method, the FCOS head is anchor‐free and can model the spatial relationship of the targets. The mask head uses a spatial attention strategy, which allows the network to highlight salient features when segmenting each detected ROI. We evaluated the method on 450 patient datasets using five‐fold cross‐validation and a hold‐out test. The left ventricular endocardium (LVEndo) and epicardium (LVEpi) and the left atrium (LA) were segmented and compared with manual contours using the Dice similarity coefficient (DSC), Hausdorff distance (HD), mean absolute distance (MAD), and center‐of‐mass distance (CMD).
Results: Compared to U‐Net and Mask R‐CNN, our method achieved higher segmentation accuracy and fewer erroneous speckles.
When our method was evaluated on a separate hold‐out dataset at the end‐diastole (ED) and end‐systole (ES) phases, the average DSCs were 0.952 and 0.939 at ED and ES for the LVEndo, 0.965 and 0.959 at ED and ES for the LVEpi, and 0.924 and 0.926 at ED and ES for the LA. For patients with a typical image size of 549 × 788 pixels, the proposed method can perform the segmentation within 0.5 s.
Conclusion: We proposed a fast and accurate method to segment echocardiographic images using an anchor‐free mask CNN.
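The evaluation metrics named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`dice`, `center_of_mass_distance`, `hausdorff`) and the toy masks are hypothetical, the distances are computed on filled binary masks rather than on extracted contours, and isotropic unit pixel spacing is assumed unless `spacing` is given. MAD is omitted because it depends on how contours are extracted.

```python
import numpy as np


def dice(pred, ref):
    """Dice similarity coefficient (DSC) between two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    ref = np.asarray(ref, dtype=bool)
    total = pred.sum() + ref.sum()
    if total == 0:
        return 1.0  # assumption: two empty masks count as perfect agreement
    return float(2.0 * np.logical_and(pred, ref).sum() / total)


def center_of_mass_distance(pred, ref, spacing=(1.0, 1.0)):
    """Euclidean distance between the centroids of two non-empty masks."""
    scale = np.asarray(spacing, dtype=float)
    c_pred = np.argwhere(np.asarray(pred, dtype=bool)).mean(axis=0)
    c_ref = np.argwhere(np.asarray(ref, dtype=bool)).mean(axis=0)
    return float(np.linalg.norm((c_pred - c_ref) * scale))


def hausdorff(pred, ref, spacing=(1.0, 1.0)):
    """Symmetric Hausdorff distance between the foreground pixel sets.

    Brute force (all pairwise distances), so only suitable for small masks.
    """
    scale = np.asarray(spacing, dtype=float)
    p = np.argwhere(np.asarray(pred, dtype=bool)) * scale
    r = np.argwhere(np.asarray(ref, dtype=bool)) * scale
    d = np.linalg.norm(p[:, None, :] - r[None, :, :], axis=-1)
    return float(max(d.min(axis=1).max(), d.min(axis=0).max()))


# Toy example: a 4 x 4 reference square and a prediction shifted by one column.
ref = np.zeros((8, 8), dtype=bool)
ref[2:6, 2:6] = True
pred = np.zeros((8, 8), dtype=bool)
pred[2:6, 3:7] = True

print(dice(pred, ref))                     # 0.75
print(center_of_mass_distance(pred, ref))  # 1.0
print(hausdorff(pred, ref))                # 1.0
```

In the toy example the two squares overlap in 12 of their 16 pixels each, so DSC = 2 × 12 / (16 + 16) = 0.75, and the one-column shift gives a centroid distance and Hausdorff distance of one pixel. Passing the physical pixel size as `spacing` would convert the distances to millimetres.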