Deep learning generation of preclinical positron emission tomography (PET) images from low‐count PET with task‐based performance assessment

Kaushik Dutta, Richard Laforest, Jingqin Luo, Abhinav K. Jha, Kooresh I. Shoghi
DOI: https://doi.org/10.1002/mp.17105
IF: 4.506
2024-05-07
Medical Physics
Abstract:
Background: Preclinical low‐count positron emission tomography (LC‐PET) imaging offers numerous advantages, such as facilitating imaging logistics, enabling longitudinal studies of long‐ and short‐lived isotopes, and increasing scanner throughput. However, LC‐PET is characterized by reduced photon‐count levels, resulting in low signal‐to‐noise ratio (SNR), segmentation difficulties, and quantification uncertainties.
Purpose: We developed and evaluated a novel deep‐learning (DL) architecture, the Attention‐based Residual‐Dilated Net (ARD‐Net), to generate standard‐count PET (SC‐PET) images from LC‐PET images. The performance of the ARD‐Net framework was evaluated for numerous low‐count realizations using fidelity‐based image quality metrics, task‐based segmentation, and quantitative metrics.
Methods: Patient‐derived tumor xenografts (PDX), with tumors implanted in the mammary fat pad, were subjected to preclinical [18F]‐Fluorodeoxyglucose (FDG)‐PET/CT imaging. SC‐PET images were derived from a 10 min static FDG‐PET acquisition, 50 min post administration of FDG, and were resampled to generate four distinct LC‐PET realizations corresponding to 10%, 5%, 1.6%, and 0.8% of the SC‐PET count level. ARD‐Net was trained and optimized using 48 preclinical FDG‐PET datasets, while 16 datasets were used to assess performance. Further, the performance of ARD‐Net was benchmarked against two leading DL‐based methods (Residual UNet, RU‐Net; and Dilated Network, D‐Net) and two non‐DL methods (Non‐Local Means, NLM; and Block Matching 3D Filtering, BM3D). The framework was evaluated using traditional fidelity‐based image quality metrics, namely Structural Similarity Index Metric (SSIM) and Normalized Root Mean Square Error (NRMSE), as well as human observer‐based tumor segmentation performance (Dice score and volume bias) and quantitative analysis of Standardized Uptake Value (SUV) measurements. Additionally, radiomics‐derived features were used as a measure of quality assurance (QA) in comparison to true SC‐PET. Finally, an ensemble performance score (EPS) was developed by integrating fidelity‐based and task‐based metrics. The Concordance Correlation Coefficient (CCC) was used to determine concordance between measures. The non‐parametric Friedman test with Bonferroni correction was used to compare the performance of ARD‐Net against the benchmarked methods, with significance at adjusted p‐value ≤ 0.01.
Results: ARD‐Net‐generated SC‐PET images exhibited significantly better (p ≤ 0.01 post Bonferroni correction) overall image fidelity scores in terms of SSIM and NRMSE at the majority of photon‐count levels compared to the benchmarked DL and non‐DL methods. In terms of task‐based quantitative accuracy evaluated by SUVMean and SUVPeak, ARD‐Net exhibited less than 5% median absolute bias for SUVMean compared to true SC‐PET and a lower degree of variability than the benchmarked DL and non‐DL methods in generating SC‐PET. Additionally, ARD‐Net‐generated SC‐PET images displayed a higher degree of concordance with true SC‐PET images in terms of radiomics features than the non‐DL and other DL approaches. Finally, the ensemble score suggested that ARD‐Net exhibited significantly superior performance compared to the benchmarked algorithms (p ≤ 0.01 post Bonferroni correction).
Conclusion: ARD‐Net provides a robust framework to generate SC‐PET from LC‐PET images. ARD‐Net‐generated SC‐PET images exhibited superior performance compared to other DL and non‐DL approaches in terms of image‐fidelity‐based metrics and task‐based segmentation metrics, with minimal bias in task‐based quantification performance for preclinical PET imaging.
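Several of the evaluation measures named in the abstract (NRMSE, Dice score, percent bias, and CCC) have compact standard definitions. The following is a minimal Python sketch of those textbook formulas, not the authors' implementation. The function names are hypothetical, inputs are assumed to be flattened voxel or feature lists, and the NRMSE normalization (by the reference intensity range) is an assumption, since the abstract does not state which convention was used. SSIM and the Friedman test are omitted because they are better taken from an established library.

```python
import math


def nrmse(reference, estimate):
    """Root mean square error divided by the reference intensity range.

    Assumption: range normalization; other conventions divide by the mean.
    """
    if len(reference) != len(estimate):
        raise ValueError("inputs must have the same length")
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    value_range = max(reference) - min(reference)
    if value_range == 0:
        raise ValueError("reference has zero intensity range")
    return math.sqrt(mse) / value_range


def dice_score(mask_a, mask_b):
    """Dice overlap 2|A and B| / (|A| + |B|) for two binary masks."""
    if len(mask_a) != len(mask_b):
        raise ValueError("masks must have the same length")
    size_a = sum(1 for a in mask_a if a)
    size_b = sum(1 for b in mask_b if b)
    overlap = sum(1 for a, b in zip(mask_a, mask_b) if a and b)
    if size_a + size_b == 0:
        return 1.0  # both masks empty: treated here as perfect agreement
    return 2.0 * overlap / (size_a + size_b)


def percent_bias(reference_value, estimate_value):
    """Signed bias of an estimate (e.g. tumor volume or SUVMean) in percent."""
    return 100.0 * (estimate_value - reference_value) / reference_value


def concordance_cc(x, y):
    """Lin's concordance correlation coefficient with 1/n moments."""
    if len(x) != len(y):
        raise ValueError("inputs must have the same length")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    var_x = sum((v - mean_x) ** 2 for v in x) / n
    var_y = sum((v - mean_y) ** 2 for v in y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    return 2.0 * cov_xy / (var_x + var_y + (mean_x - mean_y) ** 2)
```

Unlike Pearson correlation, CCC penalizes a constant offset between the two measurements: `concordance_cc([1, 2, 3], [2, 3, 4])` evaluates to 4/7 even though the two series are perfectly linearly correlated. That property is what makes it suitable for checking agreement between generated and true SC‐PET features.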
radiology, nuclear medicine & medical imaging