Abstract:

Purpose/Objective(s) To develop an automated lung cancer segmentation method using dual-modality imaging and deep learning, and to evaluate the method clinically.

Materials/Methods A 3D neural network with dual inputs from diagnostic PET and simulation CT was constructed based on U-Net. The architecture consisted of two parallel convolution paths, which extracted features independently from PET and CT at multiple resolution levels, and a single deconvolution path. At each resolution level, the features extracted by the two convolution arms were concatenated and fed into the deconvolution path through skip connections. The network was trained, validated, and tested with a 3:1:1 split on a dataset of 290 pairs of PET and CT images from lung cancer patients treated at our institution, with manual physician contours as the ground truth. The performance of the 3D dual-modality network was compared against that of a CT-only network. Multiple physician observers evaluated both the manual and the network-produced tumor contours of a randomly selected subset of 20 cases (10 large and 10 small) in a blinded fashion.

Results The mean Dice similarity coefficient (DSC), Hausdorff distance (HD), and bi-directional local distance (BLD) between the automatic contours and the ground truth were 0.77 ± 0.12, 7.6 ± 4.7 mm, and 2.9 ± 1.4 mm for CT-only input, and 0.79 ± 0.10, 5.8 ± 3.2 mm, and 2.8 ± 1.5 mm for dual-modality input, respectively. With stratification by gross tumor volume (GTV), the best results were obtained when the model for the large GTV subset (> 25 ml) was trained with GTVs of all sizes (DSC, HD, and BLD of 0.85 ± 0.05, 9.5 ± 3.8 mm, and 3.8 ± 1.8 mm), and the model for the small GTV subset (< 25 ml) was trained with small GTVs only (DSC, HD, and BLD of 0.82 ± 0.08, 3.7 ± 1.7 mm, and 2.2 ± 1.1 mm). From the stratified results, the best combined overall DSC, HD, and BLD were 0.83 ± 0.07, 5.9 ± 2.5 mm, and 2.8 ± 1.4 mm, respectively. In the multi-observer review, on average 91.25% of manual vs. 88.75% of automatic contours were Accepted or Accepted with Modifications (50% Accepted, 41.25% Accepted with Modifications, and 6.25% Rejected for manual vs. 18.75%, 70%, and 8.75% for automatic). The modifications to the automatic contours were relatively minor, with a mean DSC of 0.92 ± 0.04 between the original and modified contours, comparable to the mean DSC of 0.90 ± 0.04 between all the modified contours and their manual ground truths.

Conclusion By utilizing an expansive clinical PET and CT image database and a dual-modality architecture, the proposed 3D network with a novel GTV volume-based stratification strategy was able to generate clinically useful lung cancer contours that were quantitatively similar to the ground truth and highly acceptable in physician review.
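
For readers unfamiliar with the overlap and distance metrics reported above, the following is a minimal Python sketch of how DSC and HD can be computed for two contours represented as sets of voxel indices. It is an illustration under stated assumptions, not the evaluation code used in this work: the function names (`dice`, `hausdorff`, `to_mm`), the voxel-set representation, the toy cubes, and the voxel spacing are all hypothetical, and the abstract does not state whether HD was computed on surfaces, on all voxels, or as a percentile variant. BLD is not covered.

```python
import math

def dice(a, b):
    """Dice similarity coefficient: 2|A ∩ B| / (|A| + |B|)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # convention chosen here: two empty masks agree perfectly
    return 2.0 * len(a & b) / (len(a) + len(b))

def to_mm(voxels, spacing):
    """Scale integer voxel indices by the voxel spacing (mm per axis)."""
    return [tuple(i * s for i, s in zip(v, spacing)) for v in voxels]

def directed_hausdorff(p, q):
    """Largest distance from any point in p to its nearest point in q."""
    return max(min(math.dist(x, y) for y in q) for x in p)

def hausdorff(p, q):
    """Symmetric Hausdorff distance (brute force, O(|p|*|q|))."""
    return max(directed_hausdorff(p, q), directed_hausdorff(q, p))

# Toy example (assumed data): a 4x4x4 "manual" cube and an "automatic"
# cube of the same size shifted by one voxel along the first axis.
manual = {(x, y, z) for x in range(4) for y in range(4) for z in range(4)}
auto = {(x + 1, y, z) for (x, y, z) in manual}

spacing = (2.0, 1.0, 1.0)  # hypothetical voxel spacing in mm
dsc = dice(manual, auto)
hd_mm = hausdorff(to_mm(manual, spacing), to_mm(auto, spacing))
print(f"DSC = {dsc:.2f}, HD = {hd_mm:.1f} mm")
```

In this toy case the two cubes share 48 of their 64 voxels each, so the DSC is 2·48/128 = 0.75, and the one-voxel shift along an axis with 2 mm spacing gives an HD of 2 mm. The brute-force distance search is only practical for small point sets; a real evaluation would typically restrict the computation to surface voxels or use a spatial index.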

Automated Lung Cancer Segmentation Using a Dual-Modality Deep Learning Network with PET and CT Images