A$^3$DSegNet: Anatomy-aware artifact disentanglement and segmentation network for unpaired segmentation, artifact reduction, and modality translation

Yuanyuan Lyu, Haofu Liao, Heqin Zhu, S. Kevin Zhou
DOI: https://doi.org/10.48550/arXiv.2001.00339
2021-03-09
Abstract: Spinal surgery planning necessitates automatic segmentation of vertebrae in cone-beam computed tomography (CBCT), an intraoperative imaging modality that is widely used in intervention. However, CBCT images are of low quality and artifact-laden due to noise, poor tissue contrast, and the presence of metallic objects, making vertebra segmentation, even when done manually, a demanding task. In contrast, there exists a wealth of artifact-free, high-quality CT images with vertebra annotations. This motivates us to build a CBCT vertebra segmentation model using unpaired CT images with annotations. To overcome the domain and artifact gaps between CBCT and CT, it is necessary to address the three heterogeneous tasks of vertebra segmentation, artifact reduction, and modality translation together. To this end, we propose a novel anatomy-aware artifact disentanglement and segmentation network (A$^3$DSegNet) that intensively leverages knowledge sharing among these three tasks to promote learning. Specifically, it takes a random pair of CBCT and CT images as the input and manipulates the synthesis and segmentation via different decoding combinations from the disentangled latent layers. Then, by proposing various forms of consistency among the synthesized images and among segmented vertebrae, the learning is achieved without paired (i.e., anatomically identical) data. Finally, we stack 2D slices together and build 3D networks on top to obtain the final 3D segmentation result. Extensive experiments on a large number of clinical CBCT (21,364) and CT (17,089) images show that the proposed A$^3$DSegNet performs significantly better than state-of-the-art competing methods trained independently for each task and, remarkably, it achieves an average Dice coefficient of 0.926 for unpaired 3D CBCT vertebra segmentation.
Image and Video Processing, Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
This paper addresses the automatic segmentation of vertebrae in cone-beam computed tomography (CBCT) images. Because CBCT images are of low quality and laden with artifacts, even manual vertebra segmentation is a challenging task. In contrast, a large number of artifact-free, high-quality CT images with vertebra annotations are available. This motivates the authors to build a CBCT vertebra segmentation model from unpaired, annotated CT images. To overcome the domain gap and the artifact gap between CBCT and CT, three heterogeneous tasks must be solved together: vertebra segmentation, artifact reduction, and modality translation. To this end, the authors propose an anatomy-aware artifact disentanglement and segmentation network (A$^3$DSegNet), which promotes learning by sharing knowledge among these three tasks. Specifically, the network takes a random pair of CBCT and CT images as input and manipulates synthesis and segmentation through different decoding combinations from the disentangled latent layers. By imposing various forms of consistency among the synthesized images and among the segmented vertebrae, it learns without paired (i.e., anatomically identical) data. Finally, the 2D slices are stacked and 3D networks are built on top to obtain the final 3D segmentation result. Extensive experiments on clinical CBCT (21,364) and CT (17,089) images show that the proposed A$^3$DSegNet performs significantly better than state-of-the-art competing methods trained independently for each task, and that it achieves an average Dice coefficient of 0.926 for unpaired 3D CBCT vertebra segmentation.
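The Dice coefficient reported above measures the overlap between a predicted mask and a reference mask. The abstract does not spell out the formula, so the following is only a minimal sketch of the standard definition, Dice = 2|A ∩ B| / (|A| + |B|), applied to a 3D volume represented as a stack of 2D binary slices. The function names, the toy volumes, and the convention of returning 1.0 for two empty masks are all assumptions made for illustration; this is not the authors' evaluation code.

```python
from itertools import chain


def flatten(volume):
    """Flatten a 3D volume given as a list of 2D slices (each a list of rows)."""
    return list(chain.from_iterable(chain.from_iterable(volume)))


def dice_coefficient(pred, target):
    """Standard Dice overlap, 2|A ∩ B| / (|A| + |B|), for two binary volumes."""
    p = flatten(pred)
    t = flatten(target)
    if len(p) != len(t):
        raise ValueError("volumes must contain the same number of voxels")
    intersection = sum(1 for a, b in zip(p, t) if a and b)
    total = sum(1 for a in p if a) + sum(1 for b in t if b)
    if total == 0:
        # Assumption: two empty masks are treated as perfect agreement.
        return 1.0
    return 2.0 * intersection / total


# Hypothetical toy data: two 2x3 slices stacked into a small volume.
pred_volume = [
    [[1, 1, 0], [0, 0, 0]],
    [[0, 1, 1], [0, 0, 0]],
]
true_volume = [
    [[1, 1, 0], [0, 0, 0]],
    [[0, 0, 1], [0, 1, 0]],
]

print(dice_coefficient(pred_volume, true_volume))  # 3 shared voxels, 4 + 4 foreground voxels
```

On this toy example the two masks share 3 foreground voxels out of 4 each, giving 2 * 3 / 8 = 0.75. A value of 0.926, as reported for A$^3$DSegNet, therefore indicates a much closer overlap between predicted and annotated vertebrae.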