Unsound interpretation of findings on a measurement bias in antidepressant drug research

H. Baumeister
DOI: https://doi.org/10.1111/j.1600-0447.2012.01843.x
2012-06-01
Acta Psychiatrica Scandinavica
Abstract: Isacsson and Adler (1) noted that the Hamilton Depression Rating Scale (HDRS) has limited reliability in assessing low levels of depression. The authors draw two conclusions based on this assumption: (i) the precision of ratings decreases as the patient improves and (ii) improvements starting at lower levels of depression will be systematically underestimated compared with improvements starting at higher levels of depression severity. It is not a novel fact that the HDRS is an imperfect measure of depression severity. As the HDRS is considered to be the gold standard for the assessment of depression severity, it is, nevertheless, important to underline the deficits in its reliability; thus, I agree with the authors' call for better instruments to assess the severity of depression. Where I got lost, however, is how these methodological restrictions of the HDRS lead to the authors' conclusion that randomized controlled clinical trials (RCTs) underestimate the efficacy of antidepressants in less severe depression. The results only highlight that reliability decreases with decreasing levels of depression severity. This could mean that RCTs either underestimate or overestimate the efficacy of antidepressant drugs in less severe depression. Similarly, the fact that improvements starting at lower levels of depression will be less reliable than improvements starting at higher levels simply indicates that we have a measurement problem in RCTs using the HDRS to assess subthreshold to mild depression [NICE nomenclature of HDRS cut-off scores (2)], which corresponds to mild to moderate depression in the American Psychiatric Association (APA) nomenclature (2).
As both patient groups (verum and placebo) have low levels of depression at baseline and both patient groups might improve over time, the measurement bias tells us nothing about whether the effects of antidepressant drugs are underestimated or overestimated, or whether trials using the HDRS for the assessment of lower levels of depression are simply faced with a large variance of results because of the measurement bias. Thus, the only finding of the article by Isacsson and Adler (1) is that the HDRS is a rather imperfect gold standard for the assessment of depression severity. How is it, then, that the authors conclude that the efficacy of antidepressants is underestimated? There are three possible explanations I can think of: (i) a lack of statistical competency leading to the misinterpretation of results. However, this seems rather unlikely, as the article is at a high methodological level; (ii) the authors believe in the efficacy of antidepressant drugs and therefore tend to interpret results in favor of their beliefs. Beliefs prior to a study are known to have an impact on posterior interpretations of the findings (3); (iii) M. Adler stated that he has received...fees for speaking from Eli Lilly, Bristol Meyers Squibb, Sanofi-Aventis and for speaking and preparing manuscripts from AstraZeneca, which, too, could explain the unsound interpretation of results in favor of antidepressant drugs. To date, there is no evidence that antidepressant drugs are significantly superior to placebo in patients with subthreshold to mild depression (= APA mild to moderate depression) (4, 5). Consequently, antidepressant drugs should not routinely be used as first-line treatment for (persistent) subthreshold depressive symptoms or mild depression because of the poor risk–benefit ratio (4, 5).
Forthcoming RCTs on antidepressant drug treatments based on psychometrically better assessment instruments might change this recommendation, whereas the study of Isacsson and Adler (1) is not suited to do so, despite its full-bodied title.
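The statistical point made above, that lower reliability by itself widens the spread of trial results without pushing them in a particular direction, can be illustrated with a small simulation. This is only a sketch under assumptions that are not taken from Isacsson and Adler (1): rating error is modelled as symmetric, zero-mean Gaussian noise that affects verum and placebo equally, and the true changes in score (`true_drug_change`, `true_placebo_change`), the sample size and the noise levels are hypothetical values chosen for illustration. Asymmetric error or a floor effect on the scale would be a different mechanism and is not modelled here.

```python
import random
import statistics


def simulate_trial(rng, noise_sd, n_per_arm=100,
                   true_drug_change=-6.0, true_placebo_change=-4.0):
    """Return the estimated drug-minus-placebo difference in mean score change.

    All numeric defaults are hypothetical illustration values.
    """
    def observed_change(true_change):
        # Baseline and endpoint ratings each carry independent zero-mean error
        # (assumption: the error has no preferred direction).
        baseline_error = rng.gauss(0.0, noise_sd)
        endpoint_error = rng.gauss(0.0, noise_sd)
        return true_change + endpoint_error - baseline_error

    drug = [observed_change(true_drug_change) for _ in range(n_per_arm)]
    placebo = [observed_change(true_placebo_change) for _ in range(n_per_arm)]
    return statistics.fmean(drug) - statistics.fmean(placebo)


def summarize(noise_sd, n_trials=1000, seed=0):
    """Mean and standard deviation of the estimated effect over many trials."""
    rng = random.Random(seed)
    estimates = [simulate_trial(rng, noise_sd) for _ in range(n_trials)]
    return statistics.fmean(estimates), statistics.stdev(estimates)


if __name__ == "__main__":
    # A small noise_sd stands in for reliable ratings, a large one for
    # unreliable ratings such as those at low depression severity.
    for label, noise_sd in [("reliable ratings", 1.0), ("unreliable ratings", 4.0)]:
        mean_effect, spread = summarize(noise_sd)
        print(f"{label}: mean estimated effect {mean_effect:.2f}, SD {spread:.2f}")
```

Under these assumptions, the average estimated effect stays close to the true difference of -2 points in both scenarios, while the trial-to-trial spread grows with the rating error. In this sketch, unreliable ratings produce individual trials that overestimate the effect about as often as trials that underestimate it.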