Challenges in Subgroup Analysis—Should We Do More About Implementation?

Stefan Schandelmaier, Gordon Guyatt
DOI: https://doi.org/10.1001/jamanetworkopen.2024.3339
2024-03-29
JAMA Network Open
Abstract: Sherry et al 1 considered the methodological quality of subgroup analyses reported in 379 oncology trials. The authors explored a number of fundamental problems with these analyses: in forest plots, use of linear rather than logarithmic scales and failure to present overall pooled estimates; failure to conduct and highlight tests of interaction; and readiness to make unwarranted inferences regarding the credibility of postulated subgroup effects. They reported a distressingly high frequency of each of these problems. 1 The study thus adds to an already large body of subgroup literature finding that authors' presentations of forest plots are often suboptimal. It also adds indirect evidence for the often-suspected abuse of subgroup analyses for post hoc data dredging in search of interesting findings 2: trials that failed to show a significant effect reported more subgroup analyses. 1 The primary message of the study is, therefore, that the methodological limitations of subgroup analyses and misleading inferences in oncology trials remain the same as those that the methods community has been discussing for more than 3 decades. 3,4 We will briefly discuss these key limitations in subgroup analyses. In only 10% of the subgroup claims did the primary study authors consider prior evidence or prespecify hypotheses. 1 In failing to do so, they fail to inform their audience whether the claim is consistent with prior knowledge. This information is critical for evaluating the credibility of the authors' subgroup claim. The test of interaction is the single most crucial statistic for subgroup analysis: it tells us the extent to which chance can explain the apparent difference in effect sizes across subgroups. Sherry et al 1 found that only 17% of the trials reported a P value or estimate of interaction, leaving evidence users in the dark regarding the extent to which subgroup differences were compatible with random error (and they mostly are 5).
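To make the role of the test of interaction concrete, the following minimal sketch shows one common way a reader might approximate it from a forest plot when a trial reports only subgroup-specific hazard ratios with 95% CIs. It compares the 2 estimates on the log scale using a normal approximation and assumes the subgroups are independent. This is an illustration, not the method used by Sherry et al; the function names and the numbers in the example are hypothetical.

```python
import math
from statistics import NormalDist


def log_se_from_ci(lower, upper, level=0.95):
    """Approximate the standard error of a log ratio from its confidence interval."""
    z_crit = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% CI
    return (math.log(upper) - math.log(lower)) / (2 * z_crit)


def interaction_test(hr_a, ci_a, hr_b, ci_b):
    """Compare two independent subgroup hazard ratios on the log scale.

    Returns (ratio of hazard ratios, z statistic, two-sided P value).
    Assumes non-overlapping subgroups and approximately normal log estimates.
    """
    se_a = log_se_from_ci(*ci_a)
    se_b = log_se_from_ci(*ci_b)
    diff = math.log(hr_a) - math.log(hr_b)       # difference in log hazard ratios
    se_diff = math.sqrt(se_a ** 2 + se_b ** 2)   # SE of the difference
    z = diff / se_diff
    p = 2 * (1 - NormalDist().cdf(abs(z)))       # two-sided P value for interaction
    return math.exp(diff), z, p


if __name__ == "__main__":
    # Hypothetical subgroup results, not taken from any trial.
    ratio, z, p = interaction_test(0.60, (0.40, 0.90), 0.90, (0.65, 1.25))
    print(f"ratio of HRs = {ratio:.2f}, z = {z:.2f}, P for interaction = {p:.3f}")
```

In this hypothetical example, one subgroup's CI excludes 1 and the other's does not, yet the interaction P value is well above .05. That pattern is the one most often misread as a subgroup effect when no interaction test is reported.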
The more hypotheses one tests, the more likely one will capitalize on a chance finding and then claim a spurious subgroup effect. With an average number of 9 subgroup analyses per oncology trial, 1 these subgroup analyses run a high risk of being misled, and misleading their audience, by the play of chance. For continuous variables such as age, trial authors set thresholds and reported effects on patients above and below their chosen threshold. A superior alternative would be to examine whether effects differ across the range of the continuous variable. The choice of a single threshold results in a high risk of further capitalizing on the play of chance, especially when choosing a threshold that maximizes the apparent difference between groups, and weakens the analysis by discarding the extra information that the continuum provides. And finally, as Sherry et al 1 confirmed, none of the included trials applied credibility criteria for subgroup effects that have been available since the 1990s. 3,4,6 We recently refined these criteria in the first formal, rigorously developed instrument for judging the credibility of subgroup effects. 7 What can we do to improve the methodological quality of subgroup analyses? Learning from the past, publishing more commentaries, meta-studies, simulation studies, and guidance papers about the challenges and solutions in subgroup analysis, even if they are largely repetitive, seems unlikely to help. One strategy, a systematic greater focus on methods implementation, has thus far failed to attract the attention it might deserve. A small number of pioneering studies have started to identify and better understand barriers to methods implementation, 8-10 identify and test strategies for better methods implementation, 11 and raise the issue of whether the principles of implementation science could work in the methods context. 12
One simple application of implementation science would be to make it easier for investigators to identify methods guidance relevant to their studies. 13 Another would be to improve methods guidance by encouraging those providing such guidance to involve their target audience in ensuring the accessibility of the guidance they provide. 14 Reporting guidelines represent another approach that has demonstrated appreciable (although still perhaps somewhat disappointing) improvement in study design and methods implementation. We might utilize the full potential of reporting guidelines by including more methodological details (eg, items addressing the typical limitations of subgroup analyses we have highlighted) in the Standard Protocol Items: Recommendations for Interventional Trials (SPIRIT) guideline and corresponding items in the Consolidated Standards of Reporting Trials (CONSORT -Abstract Truncated-
medicine, general & internal