Efficient and Stable Cephalometric Landmark Localization Using Two-Stage Heatmaps’ Regression

Xianglong Wang, Eric Rigall, Qianmin Chen, Shu Zhang, Junyu Dong
DOI: https://doi.org/10.1109/tim.2022.3206762
IF: 5.6
2022-01-01
IEEE Transactions on Instrumentation and Measurement
Abstract: Cephalometry is commonly used as an integral component in orthodontic treatment and visual treatment objects (VTOs). Accurate and stable localization of tissue landmarks is an essential precondition for orthodontic diagnosis and treatment. In practice, expert manual annotation is often time-consuming and subjective. Recent deep learning works take one of two approaches to automatic cephalometric assessment, namely, multimodel-based and end-to-end model-based methods, which differ substantially in landmark localization accuracy. The multimodel-based methods predict each landmark with higher localization accuracy, but they are complex, and their parameters have to be adjusted iteratively. The end-to-end models are lightweight and efficient and can predict all landmarks at once, but their localization accuracy is significantly lower than that of the multimodel-based methods. This article rethinks the differences between these kinds of models and presents an effective solution to the low accuracy of the end-to-end model. On the one hand, we design a two-stage network architecture to improve the model's representation ability. We found that a high-resolution input image and a multistage strategy can effectively improve the localization accuracy of the model. Therefore, the former stage employs an HR-Net to output multiple-scale feature maps in parallel from high-resolution radiography images. The latter stage further refines the landmark locations from the generated features. On the other hand, thanks to its spatial representation of landmark location, network-based heatmap prediction has become a mainstream method for regressing the landmarks’ coordinates. However, the decoding method of heatmaps has not been studied in cephalometry. Thus, we propose an improved heatmap decoding method that effectively reduces the landmarks’ localization error during training. Furthermore, experimental results show that our approach outperforms previous related works on a public dataset and yields state-of-the-art (SoTA) accuracy in cephalometric landmark detection.
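To illustrate what "heatmap decoding" means in general, the following is a minimal sketch, not the decoding method proposed in the paper, whose details are not given in the abstract. It compares plain argmax decoding with a generic sub-pixel refinement that fits a parabola to the log-responses around the peak. The function names (`gaussian_heatmap`, `decode_argmax`, `decode_subpixel`) and the choice of a synthetic isotropic Gaussian heatmap are assumptions made for this example.

```python
import math


def gaussian_heatmap(width, height, cx, cy, sigma=2.0):
    """Render a Gaussian heatmap whose peak sits at the (fractional) point (cx, cy)."""
    return [
        [math.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2.0 * sigma ** 2))
         for x in range(width)]
        for y in range(height)
    ]


def decode_argmax(heatmap):
    """Return the integer pixel (x, y) with the highest response."""
    best_x, best_y, best_val = 0, 0, float("-inf")
    for y, row in enumerate(heatmap):
        for x, val in enumerate(row):
            if val > best_val:
                best_x, best_y, best_val = x, y, val
    return best_x, best_y


def _offset(prev, mid, nxt):
    """Peak offset of a parabola fitted to three neighbouring log-responses."""
    if min(prev, mid, nxt) <= 0.0:
        return 0.0  # log undefined; fall back to the integer location
    log_prev, log_mid, log_next = math.log(prev), math.log(mid), math.log(nxt)
    curvature = log_prev - 2.0 * log_mid + log_next
    if curvature >= 0.0:
        return 0.0  # not a proper maximum along this axis
    return -0.5 * (log_next - log_prev) / curvature


def decode_subpixel(heatmap):
    """Refine the argmax location independently along x and y."""
    x, y = decode_argmax(heatmap)
    height, width = len(heatmap), len(heatmap[0])
    dx = dy = 0.0
    if 0 < x < width - 1:
        dx = _offset(heatmap[y][x - 1], heatmap[y][x], heatmap[y][x + 1])
    if 0 < y < height - 1:
        dy = _offset(heatmap[y - 1][x], heatmap[y][x], heatmap[y + 1][x])
    return x + dx, y + dy


if __name__ == "__main__":
    hm = gaussian_heatmap(64, 64, cx=20.3, cy=41.6)
    print("argmax:   ", decode_argmax(hm))
    print("sub-pixel:", decode_subpixel(hm))
```

Argmax decoding can only return integer pixel positions, so its error depends on how far the true landmark lies from the nearest pixel centre. On an ideal Gaussian heatmap the log-response is exactly quadratic, so the parabola fit recovers the fractional peak; on real network outputs the refinement is only approximate.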