Abstract: Sherry et al 1 considered the methodological quality of subgroup analyses reported in 379 oncology trials. The authors explored a number of fundamental problems with these analyses: in forest plots, use of linear rather than logarithmic scales and failure to present overall pooled estimates; failure to conduct and highlight tests of interaction; and readiness to make unwarranted inferences regarding the credibility of postulated subgroup effects. They reported a distressingly high frequency of each of these problems. 1

The study thus adds to an already large body of subgroup literature finding that authors' presentations of forest plots are often suboptimal. It also adds indirect evidence for the often-suspected abuse of subgroup analyses for post hoc data dredging in search of interesting findings 2: trials that failed to show a significant effect reported more subgroup analyses. 1 The primary message of the study is, therefore, that the methodological limitations of subgroup analyses and the misleading inferences drawn from them in oncology trials remain the same as those that the methods community has been discussing for more than 3 decades. 3,4 We will briefly discuss these key limitations in subgroup analyses.

In only 10% of the subgroup claims did the primary study authors consider prior evidence or prespecify hypotheses. 1 In failing to do so, they fail to inform their audience whether the claim is consistent with prior knowledge. This information is critical for evaluating the credibility of the authors' subgroup claim.

The test of interaction is the single most crucial statistic for subgroup analysis: it tells us the extent to which chance can explain the apparent difference in effect sizes across subgroups. Sherry et al 1 found that only 17% of the trials reported a P value or estimate of interaction, leaving evidence users in the dark regarding the extent to which subgroup differences were compatible with random error (and they mostly are 5).
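To make the role of the interaction test concrete, the following is a minimal sketch of one common way to compare two subgroup estimates: take the difference of the log ratios and divide by its standard error. It is an illustration only, not the method used by Sherry et al. It assumes two independent subgroups, ratio measures (eg, hazard ratios) reported with 95% CIs, and a normal approximation on the log scale. The function names and the example numbers are hypothetical.

```python
import math

Z_95 = 1.959964  # normal quantile for a two-sided 95% CI


def se_log_ratio(ci_lower, ci_upper):
    """Standard error of a log ratio, recovered from its 95% CI."""
    return (math.log(ci_upper) - math.log(ci_lower)) / (2 * Z_95)


def interaction_test(est_a, ci_a, est_b, ci_b):
    """Compare two independent subgroup ratio estimates on the log scale.

    Returns (ratio of ratios, z statistic, two-sided P value).
    """
    diff = math.log(est_a) - math.log(est_b)
    se_diff = math.sqrt(se_log_ratio(*ci_a) ** 2 + se_log_ratio(*ci_b) ** 2)
    z = diff / se_diff
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail area
    return math.exp(diff), z, p_value


# Hypothetical numbers, not taken from any trial.
ratio, z, p = interaction_test(0.60, (0.40, 0.90), 0.90, (0.65, 1.25))
print(f"ratio of hazard ratios = {ratio:.2f}, z = {z:.2f}, interaction P = {p:.2f}")
```

In this made-up example the first subgroup's interval excludes 1 and the second's does not, yet the interaction P value is well above .05, so chance remains a plausible explanation for the apparent difference.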
The more hypotheses one tests, the more likely one will capitalize on a chance finding and then claim a spurious subgroup effect. With an average of 9 subgroup analyses per oncology trial, 1 investigators run a high risk of being misled, and of misleading their audience, by the play of chance.

For continuous variables such as age, trial authors set thresholds and reported effects on patients above and below their chosen threshold. A superior alternative would be to examine whether effects differ across the range of the continuous variable. Choosing a single threshold further increases the risk of capitalizing on the play of chance, especially when the threshold is chosen to maximize the apparent difference between groups, and weakens the analysis by discarding the extra information that the continuum provides.

And finally, as Sherry et al 1 confirmed, none of the included trials applied credibility criteria for subgroup effects that have been available since the 1990s. 3,4,6 We recently refined these criteria in the first formal, rigorously developed instrument for judging the credibility of subgroup effects. 7

What can we do to improve the methodological quality of subgroup analyses? Learning from the past, publishing more commentaries, meta-studies, simulation studies, and guidance papers about the challenges and solutions in subgroup analysis, even if they are largely repetitive, seems unlikely to help. One strategy, a systematically greater focus on methods implementation, has thus far failed to attract the attention it might deserve. A small number of pioneering studies have started to identify and better understand barriers to methods implementation, 8-10 identify and test strategies for better methods implementation, 11 and raise the issue of whether the principles of implementation science could work in the methods context. 12

One simple application of implementation science would be to make it easier for investigators to identify methods guidance relevant to their studies. 13 Another would be to improve methods guidance by encouraging those providing such guidance to involve their target audience in ensuring the accessibility of the guidance they provide. 14

Reporting guidelines represent another approach that has demonstrated appreciable, although still perhaps somewhat disappointing, improvement in study design and methods implementation. We might utilize the full potential of reporting guidelines by including more methodological details (eg, items addressing the typical limitations of subgroup analyses we have highlighted) in the Standard Protocol Items: Recommendations for Interventional Trials (SPIRIT) guideline and corresponding items in the Consolidated Standards of Reporting Trials (CONSORT -Abstract Truncated-
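As a rough illustration of the multiplicity concern raised in the abstract (an average of 9 subgroup analyses per trial), the sketch below computes the chance of at least one nominally significant result when no true subgroup effect exists. It rests on simplifying assumptions that are not from the source: every test is independent and uses the same significance level. Real subgroup analyses within a trial are usually correlated, so the figures are indicative only. The function name is hypothetical.

```python
def chance_of_false_positive(n_tests, alpha=0.05):
    """Probability of one or more false-positive results among n_tests
    independent tests, each run at significance level alpha, when no
    true effect exists."""
    return 1 - (1 - alpha) ** n_tests


# Show how the risk grows with the number of subgroup analyses.
for n in (1, 3, 9, 20):
    print(f"{n:>2} tests: {chance_of_false_positive(n):.0%}")
```

Under these assumptions, 9 tests at the .05 level give somewhat more than a one-in-three chance of at least one spurious "significant" subgroup finding.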

Challenges in Subgroup Analysis—Should We Do More About Implementation?

Challenges and Solutions to Pre- and Post-Randomization Subgroup Analyses

Subgroup Analyses in Reporting of Phase III Clinical Trials in Solid Tumors

Reporting Quality of Social and Psychological Intervention Trials: A Systematic Review of Reporting Guidelines and Trial Publications

Subgroup identification in clinical trials: an overview of available methods and their implementations with R

Description of subgroup reporting in clinical trials of chronic diseases: a meta-epidemiological study

Prespecification of subgroup analyses and examination of treatment-subgroup interactions in cancer individual participant data meta-analyses are suboptimal

Comparing Approaches to Treatment Effect Estimation for Subgroups in Clinical Trials

The CONSORT statement: revised recommendations for improving the quality of reports of parallel group randomized trials

What facilitators and barriers might researchers encounter when using reporting guidelines? Part 1: A thematic synthesis

Improving the Validity of Non-Interventional Comparative Effectiveness Research by Basing Study Design on a Specified Existing Randomised Controlled Trial

Trial Forge Guidance 4: a guideline for reporting the results of randomised Studies Within A Trial (SWATs)

Detecting critical treatment effect bias in small subgroups

Reporting quality of randomised controlled trial abstracts among high-impact general medical journals: a review and analysis

The quality of reports of randomised trials in 2000 and 2006: comparative study of articles indexed in PubMed

Editorial Commentary: Clinical Trial Subgroup Analyses and Investigation of Secondary Outcome Measures Should Be Limited in Number to Avoid False Findings

Quality of reporting of modern randomized controlled trials in medical oncology: a systematic review

Clinical trials with nested subgroups: Analysis, sample size determination and internal pilot studies

Influential methods reports for group-randomized trials and related designs

Clinician's Approach to Advanced Statistical Methods: Win Ratios, Restricted Mean Survival Time, Responder Analyses, and Standardized Mean Differences