PET-CT for assessing mediastinal lymph node involvement in patients with suspected resectable non-small cell lung cancer
Mia Schmidt-Hansen, David R Baldwin, Elise Hasler, Javier Zamora, Víctor Abraira, Marta Roqué I Figuls
DOI: https://doi.org/10.1002/14651858.CD009519.pub2
2014-11-13
Abstract:
Background: A major determinant of treatment offered to patients with non-small cell lung cancer (NSCLC) is their intrathoracic (mediastinal) nodal status. If the disease has not spread to the ipsilateral mediastinal nodes, subcarinal (N2) nodes, or both, and the patient is otherwise considered fit for surgery, resection is often the treatment of choice. Planning the optimal treatment is therefore critically dependent on accurate staging of the disease. PET-CT (positron emission tomography-computed tomography) is a non-invasive method for staging the mediastinum that is increasingly available and used by lung cancer multidisciplinary teams. Although its non-invasive nature is one of its major advantages, PET-CT may be suboptimal in detecting malignancy in normal-sized lymph nodes and in ruling out malignancy in patients with coexisting inflammatory or infectious diseases.
Objectives: To determine the diagnostic accuracy of integrated PET-CT for mediastinal staging of patients with suspected or confirmed NSCLC that is potentially suitable for treatment with curative intent.
Search methods: We searched the following databases up to 30 April 2013: The Cochrane Library, MEDLINE via OvidSP (from 1946), Embase via OvidSP (from 1974), PreMEDLINE via OvidSP, OpenGrey, ProQuest Dissertations & Theses, and the trials register www.clinicaltrials.gov. There were no language or publication status restrictions on the search. We also contacted researchers in the field, checked reference lists, and conducted citation searches (with an end-date of 9 July 2013) of relevant studies.
Selection criteria: Prospective or retrospective cross-sectional studies that assessed the diagnostic accuracy of integrated PET-CT for diagnosing N2 disease in patients with suspected resectable NSCLC. The studies must have used pathology as the reference standard and reported participants as the unit of analysis.
Data collection and analysis: Two authors independently extracted data pertaining to the study characteristics and the number of true and false positives and true and false negatives for the index test, and they independently assessed the quality of the included studies using QUADAS-2. We calculated sensitivity and specificity with 95% confidence intervals (CI) for each study and performed two main analyses based on the criteria for test positivity employed: Activity > background or SUVmax ≥ 2.5 (SUVmax = maximum standardised uptake value). For each subset of studies, we fitted a summary receiver operating characteristic (SROC) curve using a hierarchical summary ROC (HSROC) model. We identified the average operating point on the SROC curve and computed the average sensitivities and specificities. We checked for heterogeneity and examined the robustness of the meta-analyses through sensitivity analyses.
Main results: We included 45 studies, and based on the criteria for PET-CT positivity, we categorised the included studies into three groups: Activity > background (18 studies, N = 2823, prevalence of N2 and N3 nodes = 679/2328), SUVmax ≥ 2.5 (12 studies, N = 1656, prevalence of N2 and N3 nodes = 465/1656), and Other/mixed (15 studies, N = 1616, prevalence of N2 to N3 nodes = 400/1616). None of the studies reported any adverse events. Under-reporting generally hampered the quality assessment of the studies, and in 30/45 studies, the applicability of the study populations was of high or unclear concern. The summary sensitivity and specificity estimates for the 'Activity > background' PET-CT positivity criterion were 77.4% (95% CI 65.3 to 86.1) and 90.1% (95% CI 85.3 to 93.5), respectively, but the accuracy estimates of these studies in ROC space showed a wide prediction region. This indicated high between-study heterogeneity and a relatively large 95% confidence region around the summary value of sensitivity and specificity, denoting a lack of precision.
Sensitivity analyses suggested that the overall estimate of sensitivity was especially susceptible to selection bias; reference standard bias; clear definition of test positivity; and to a lesser extent, index test bias and commercial funding bias, with lower combined estimates of sensitivity observed for all the low 'Risk of bias' studies compared with the full analysis. The summary sensitivity and specificity estimates for the SUVmax ≥ 2.5 PET-CT positivity criterion were 81.3% (95% CI 70.2 to 88.9) and 79.4% (95% CI 70 to 86.5), respectively. In this group, the accuracy estimates of these studies in ROC space also showed a very wide prediction region. This indicated very high between-study heterogeneity, and there was a relatively large 95% confidence region around the summary value of sensitivity and specificity, denoting a clear lack of precision. Sensitivity analyses suggested that both overall accuracy estimates were marginally sensitive to flow and timing bias and commercial funding bias, which both lead to slightly lower estimates of sensitivity and specificity. Heterogeneity analyses showed that the accuracy estimates were significantly influenced by country of study origin, percentage of participants with adenocarcinoma, (¹⁸F)-2-fluoro-deoxy-D-glucose (FDG) dose, type of PET-CT scanner, and study size, but not by study design, consecutive recruitment, attenuation correction, year of publication, or tuberculosis incidence rate per 100,000 population.
Authors' conclusions: This review has shown that the accuracy of PET-CT is insufficient to allow management based on PET-CT alone. The findings therefore support National Institute for Health and Care (formerly 'clinical') Excellence (NICE) guidance on this topic, where PET-CT is used to guide clinicians in the next step: either a biopsy or, where PET-CT is negative and nodes are small, proceeding directly to surgery.
The apparent difference between the two main makes of PET-CT scanner is important and may influence the treatment decision in some circumstances. The differences in PET-CT accuracy estimates between scanner makes, NSCLC subtypes, FDG dose, and country of study origin, along with the general variability of results, suggest that all large centres should actively monitor their accuracy. This is so that they can make reliable decisions based on their own results and identify the populations in which PET-CT is of most use or potentially little value.
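As a rough illustration of the arithmetic behind the figures above, the following Python sketch computes per-study sensitivity and specificity with an approximate 95% CI from a 2x2 table, and then the predictive values implied by a given sensitivity, specificity, and prevalence. Several things here are assumptions and not taken from the review: the abstract does not say which interval method was used (the Wilson score interval is used here as one common choice), the function names are hypothetical, and the 2x2 counts are invented for illustration. The sketch does not attempt the HSROC meta-analysis itself.

```python
from math import sqrt


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a proportion (z=1.96 gives roughly 95%).

    Assumption: the review does not state its CI method; Wilson is one
    common choice for proportions such as sensitivity and specificity.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z ** 2 / total
    centre = (p + z ** 2 / (2 * total)) / denom
    half = z * sqrt(p * (1 - p) / total + z ** 2 / (4 * total ** 2)) / denom
    return centre - half, centre + half


def sensitivity_specificity(tp, fp, fn, tn):
    """Point estimates and intervals from one study's 2x2 table."""
    sens = tp / (tp + fn)  # diseased participants correctly test-positive
    spec = tn / (tn + fp)  # non-diseased participants correctly test-negative
    return {
        "sensitivity": (sens, wilson_interval(tp, tp + fn)),
        "specificity": (spec, wilson_interval(tn, tn + fp)),
    }


def predictive_values(sens, spec, prevalence):
    """Positive and negative predictive values via Bayes' rule."""
    ppv = sens * prevalence / (
        sens * prevalence + (1 - spec) * (1 - prevalence)
    )
    npv = spec * (1 - prevalence) / (
        spec * (1 - prevalence) + (1 - sens) * prevalence
    )
    return ppv, npv


# Hypothetical single study (counts invented for illustration only).
study = sensitivity_specificity(tp=40, fp=10, fn=10, tn=140)
print(study)

# Summary estimates and prevalence reported for the SUVmax >= 2.5 group.
ppv, npv = predictive_values(sens=0.813, spec=0.794, prevalence=465 / 1656)
print(f"PPV = {ppv:.3f}, NPV = {npv:.3f}")
```

Applied to the SUVmax ≥ 2.5 summary estimates at the pooled prevalence of 465/1656, this back-of-envelope calculation gives a positive predictive value of about 61% and a negative predictive value of about 92%. These figures are illustrative only: they treat the summary point estimates as if they applied uniformly at the pooled prevalence, which the high between-study heterogeneity reported above cautions against.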