K-Net: Integrate Left Ventricle Segmentation and Direct Quantification of Paired Echo Sequence

Rongjun Ge,Guanyu Yang,Yang Chen,Limin Luo,Cheng Feng,Hong Ma,Junyi Ren,Shuo Li
DOI: https://doi.org/10.1109/tmi.2019.2955436
IF: 10.6
2020-01-01
IEEE Transactions on Medical Imaging
Abstract:The integration of segmentation and direct quantification on the left ventricle (LV) from the paired apical views(i.e., apical 4-chamber and 2-chamber together) echo sequence clinically achieves the comprehensive cardiac assessment: multiview segmentation for anatomical morphology, and multidimensional quantification for contractile function. Direct quantification of LV, i.e., to automatically quantify multiple LV indices directly from the image via task-aware feature representation and regression, avoids accumulative error from the inter-step target. This integration sequentially makes a stereoscopical reflection of cardiac activity jointly from the paired orthogonal cross views sequences, overcoming limited observation with a single plane. We propose a K-shaped Unified Network (K-Net), the first end-to-end framework to simultaneously segment LV from apical 4-chamber and 2-chamber views, and directly quantify LV from major- and minor-axis dimensions (1D), area (2D), and volume (3D), in sequence. It works via four components: 1) the K-Net architecture with the Attention Junction enables heterogeneous tasks learning of segmentation task of pixel-wise classification, and direct quantification task of image-wise regression, by interactively introducing the information from segmentation to jointly promote spatial attention map to guide quantification focusing on LV-related region, and transferring quantification feedback to make global constraint on segmentation; 2) the Bi-ResLSTMs distributed in K-Net layer-by-layer hierarchically extract spatial-temporal information in echo sequence, with bidirectional recurrent and short-cut connection to model spatial-temporal information among all frames; 3) the Information Valve tailing the Bi-ResLSTMs selectively exchanges information among multiple views, by stimulating complementary information and suppressing redundant information to make the efficient cross-flow for each view; 4) the Evolution Loss comprehensively guides sequential data learning, with static constraint for frame values, and dynamic constraint for inter-frame value changes. The experiments show that our K-Net gains high performance with a Dice coefficient up to 91.44% and a mean absolute error of the major-axis dimension down to 2.74mm, which reveal its clinical potential.
What problem does this paper attempt to address?