Unsupervised Convolutional Neural Network for Motion Estimation in Ultrasound Elastography

Xingyue Wei, Yuanyuan Wang, Lin Ge, Bo Peng, Qiong He, Rui Wang, Lijie Huang, Yan Xu, Jianwen Luo
DOI: https://doi.org/10.1109/TUFFC.2022.3171676
2022-01-01
Abstract: High-quality motion estimation is essential for ultrasound elastography (USE). Traditional motion estimation algorithms, whether based on speckle tracking, such as normalized cross correlation (NCC), or on regularization, such as global ultrasound elastography (GLUE), are time-consuming. To reduce the computational cost while preserving the accuracy of motion estimation, many convolutional neural networks have been introduced into USE. Most of these networks, such as the radio-frequency modified pyramid, warping and cost volume network (RFMPWC-Net), are supervised and need many ground truths as labels for network training. However, ground truths are laborious to collect for USE. In this study, we propose a MaskFlownet-based unsupervised convolutional neural network (MF-UCNN) for fast and high-quality motion estimation in USE. The inputs to MF-UCNN are the concatenation of RF, envelope, and B-mode images before and after deformation, and the outputs are the axial and lateral displacement fields. The loss function incorporates the similarity between the predeformed image and the warped image (i.e., the postdeformed image compensated by the estimated displacement fields) and the smoothness of the estimated displacement fields. The network was compared with modified pyramid, warping and cost volume network (MPWC-Net)++, RFMPWC-Net, GLUE, and NCC. Results of simulations, breast phantom, and in vivo experiments show that MF-UCNN obtains higher signal-to-noise ratio (SNR) and higher contrast-to-noise ratio (CNR). MF-UCNN achieves high-quality motion estimation with significantly reduced computation time. It is unsupervised, does not need any ground truths as labels during training, and thus has great potential for motion estimation in USE.
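The unsupervised loss described in the abstract combines two terms: a similarity term between the predeformed image and the warped postdeformed image, and a smoothness term on the estimated displacement fields. The following NumPy sketch illustrates that structure only. The abstract does not state the exact similarity metric, smoothness penalty, or weighting, so the mean absolute difference, the first-order finite differences, and the `smooth_weight` parameter are all assumptions made here for illustration, as are the function names. It evaluates the loss for given displacement fields and is not the MF-UCNN network or its training code.

```python
import numpy as np


def warp(post, axial, lateral):
    """Bilinearly sample `post` at (row + axial, col + lateral).

    Sample positions falling outside the image are clamped to the border
    (an assumption; other boundary handling is possible).
    """
    rows, cols = post.shape
    r, c = np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij")
    rr = np.clip(r + axial, 0, rows - 1)
    cc = np.clip(c + lateral, 0, cols - 1)
    r0 = np.floor(rr).astype(int)
    c0 = np.floor(cc).astype(int)
    r1 = np.minimum(r0 + 1, rows - 1)
    c1 = np.minimum(c0 + 1, cols - 1)
    wr = rr - r0  # fractional part along the axial (row) direction
    wc = cc - c0  # fractional part along the lateral (column) direction
    top = (1 - wc) * post[r0, c0] + wc * post[r0, c1]
    bottom = (1 - wc) * post[r1, c0] + wc * post[r1, c1]
    return (1 - wr) * top + wr * bottom


def smoothness(field):
    """Mean absolute first-order difference along both image axes (assumed penalty)."""
    d_axial = np.abs(np.diff(field, axis=0)).mean()
    d_lateral = np.abs(np.diff(field, axis=1)).mean()
    return d_axial + d_lateral


def unsupervised_loss(pre, post, axial, lateral, smooth_weight=0.1):
    """Similarity term plus weighted smoothness term; needs no ground-truth labels.

    `smooth_weight` is a hypothetical value, not taken from the paper.
    """
    warped = warp(post, axial, lateral)
    similarity = np.abs(pre - warped).mean()  # assumed metric: mean absolute difference
    regularizer = smoothness(axial) + smoothness(lateral)
    return similarity + smooth_weight * regularizer
```

Because the loss depends only on the two images and the estimated fields, a displacement estimate that aligns the warped image with the predeformed image lowers the similarity term without any labeled displacement being available, which is what allows training without ground truths.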