Abstract:
Objectives: Radiation therapy for lung cancer requires a skilled radiation oncologist (RO) to carefully outline the gross tumour volume (GTV), so that a high radiation dose can be targeted precisely at the malignant mass while minimizing radiation damage to adjacent normal tissues. Manual delineation is labour-intensive and tedious; however, it is feasible to train a deep learning (DL) neural network to assist ROs in delineating the GTV. DL models trained on large openly accessible data sets might nevertheless perform poorly when applied to a superficially similar task in a different clinical setting. In this work, we tested the performance of a DL automatic lung GTV segmentation model trained on open-access Dutch data when used on Indian patients from a large public tertiary hospital, and hypothesized that generic DL performance could be improved for a specific local clinical context by means of modest transfer learning on a small representative local subset.
Methods: X-ray computed tomography (CT) series from a public data set called "NSCLC-Radiomics" in The Cancer Imaging Archive were first used to train a DL-based lung GTV segmentation model (Model 1). Its performance was assessed using a different open-access data set of Dutch subjects (Interobserver1) and a private Indian data set from a local tertiary hospital (Test Set 2). Another Indian data set (Retrain Set 1) was used to fine-tune Model 1 by transfer learning. The Indian data sets were taken from the CT of a hybrid scanner based in nuclear medicine, but the GTVs were drawn by skilled Indian ROs. The fine-tuned model (Model 2) was then re-evaluated on "Interobserver1" and "Test Set 2." Dice similarity coefficient (DSC), precision, and recall were used as geometric segmentation performance metrics.
Results: Model 1, trained exclusively on Dutch scans, showed a significant fall in performance when tested on "Test Set 2." However, the DSC of Model 2 recovered by 14 percentage points when evaluated on the same test set. Precision and recall showed a similar rebound after transfer learning, in spite of the comparatively small sample size. Fine-tuning did not significantly change segmentation performance on "Interobserver1."
Conclusions: A large public open-access data set was used to train a generic DL model for lung GTV segmentation, but this model did not initially perform well in the Indian clinical context. Using transfer learning, it was feasible to fine-tune the generic model efficiently and easily with only a small number of local examples from the Indian hospital. This led to a recovery of some of the geometric segmentation performance, and the tuning did not appear to affect the performance of the model on another open-access data set.
Advances in knowledge: Caution is needed when using models trained on large volumes of international data in a local clinical setting, even when the training data set is of good quality. Minor differences in scan acquisition and clinician delineation preferences may result in an apparent drop in performance. However, DL models have the advantage that they can be efficiently "adapted" from a generic to a locally specific context with only a small amount of fine-tuning, by means of transfer learning on a small local institutional data set.
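The three geometric metrics named in the abstract (DSC, precision, and recall) can be illustrated with a minimal sketch. It is not the authors' evaluation code: it assumes the predicted and reference GTV masks have been flattened to equal-length sequences of 0/1 voxel labels, and the function name `overlap_metrics`, the helper `_ratio`, and the choice to return 0.0 for an undefined ratio are all hypothetical conventions chosen for this example.

```python
def _ratio(numerator, denominator):
    # Assumed convention: an undefined ratio (zero denominator) is reported as 0.0.
    return numerator / denominator if denominator else 0.0


def overlap_metrics(pred, truth):
    """Return (dsc, precision, recall) for two flattened binary masks."""
    if len(pred) != len(truth):
        raise ValueError("masks must contain the same number of voxels")
    # Count voxel-wise agreement and disagreement between the two masks.
    tp = sum(1 for p, t in zip(pred, truth) if p and t)       # true positives
    fp = sum(1 for p, t in zip(pred, truth) if p and not t)   # false positives
    fn = sum(1 for p, t in zip(pred, truth) if not p and t)   # false negatives
    dsc = _ratio(2 * tp, 2 * tp + fp + fn)   # Dice similarity coefficient
    precision = _ratio(tp, tp + fp)          # share of predicted voxels that are correct
    recall = _ratio(tp, tp + fn)             # share of reference voxels that are found
    return dsc, precision, recall


# Toy example: six voxels, invented purely for illustration.
predicted = [0, 1, 1, 1, 0, 0]
reference = [0, 0, 1, 1, 1, 0]
dsc, precision, recall = overlap_metrics(predicted, reference)
print(f"DSC={dsc:.3f} precision={precision:.3f} recall={recall:.3f}")
```

In this toy case two voxels overlap, one is predicted but not in the reference, and one reference voxel is missed, so all three metrics equal 2/3. A change of "14 percentage points" in DSC corresponds to a difference of 0.14 on this 0 to 1 scale.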

Transfer learning for auto‐segmentation of 17 organs‐at‐risk in the head and neck: Bridging the gap between institutional and public datasets

Comprehensive and Clinically Accurate Head and Neck Organs at Risk Delineation Via Stratified Deep Learning: A Large-scale Multi-Institutional Study

Automatic segmentation of Organs at Risk in Head and Neck cancer patients from CT and MRI scans

Multi-organ segmentation of organ-at-risk (OAR's) of head and neck site using ensemble learning technique

Deep Learning-Augmented Head and Neck Organs at Risk Segmentation From CT Volumes

Investigating transfer learning to improve the deep-learning-based segmentation of organs at risk among different medical centers for nasopharyngeal carcinoma

Clinically Applicable Deep Learning Framework for Organs at Risk Delineation in CT Images

A deep learning-based auto-segmentation system for organs-at-risk on whole-body computed tomography images for radiation therapy

Transfer learning from a sparsely annotated dataset of 3D medical images

Comparing the performance of a deep learning-based lung gross tumour volume segmentation algorithm before and after transfer learning in a new hospital

Deep Learning-Based Segmentation of Head and Neck Organs-at-Risk with Clinical Partially Labeled Data

Deep Learning Auto-Segmentation Network for Paediatric CT Datasets: Can We Extrapolate from Adults?

Auto-segmentation of Adult-Type Diffuse Gliomas: Comparison of Transfer Learning-Based Convolutional Neural Network Model Vs. Radiologists

Improved accuracy of auto-segmentation of organs at risk in radiotherapy planning for nasopharyngeal carcinoma based on fully convolutional neural network deep learning

Deep learning to achieve clinically applicable segmentation of head and neck anatomy for radiotherapy

The impact of training dataset size and ensemble inference strategies on head and neck auto-segmentation

FocusNet: Imbalanced Large and Small Organ Segmentation with an End-to-End Deep Neural Network for Head and Neck CT Images

AttentionAnatomy: A Unified Framework for Whole-Body Organs at Risk Segmentation Using Multiple Partially Annotated Datasets.

Clinically Acceptable Segmentation of Organs at Risk in Cervical Cancer Radiation Treatment from Clinically Available Annotations

Comparative Clinical Evaluation of Deep-Learning-Based Algorithms in Auto-Segmentation of Organs-At-Risk for Head and Neck Cancers

FocusNetv2: Imbalanced Large and Small Organ Segmentation with Adversarial Shape Constraint for Head and Neck CT Images