By Ed Silverman // February 11th, 2013 // 12:41 pm
Several years ago, the Black Box warnings that were added to antidepressants over suicidal thoughts and behaviors for youngsters caused a backlash, as some suggested the language had pushed physicians and parents to avoid usage when the medications could have done some good. The debate may have slipped from view, but never really ended. A pair of papers published last year, in fact, renewed the controversy, and Glen Spielmans, an associate professor of psychology at Metropolitan State University, recounts why the issue remains fraught with challenges and a recent spat that erupted when an effort was made to critique the papers.
Antidepressants can cause suicidality – suicidal thoughts and behaviors – in children and adolescents. This message has been widely disseminated since October 2004, when the FDA placed a Black Box warning on such medications. The warning was based on findings from placebo-controlled trials, in which kids taking antidepressants had an elevated rate of suicidal thoughts and behaviors (see this). But research led by Dr. Robert Gibbons, professor of biostatistics at the University of Chicago, suggests that this warning is counterproductive, scaring parents and kids away from getting safe and effective antidepressant treatment.
Gibbons was the lead author on two papers published in 2012 in psychiatry’s premier journal, Archives of General Psychiatry (renamed JAMA Psychiatry last month). One paper examined the potential association between antidepressants and suicidality and the other focused on the efficacy of antidepressants (read here and here). They analyzed data from placebo-controlled trials in youth, adults and the elderly. Gibbons and team drew two important conclusions: antidepressants do not actually increase suicidality in youth, and they are more effective for kids than previously believed. Their findings are based on a re-analysis of data from selected placebo-controlled trials that date back several years. Gibbons and others explained the importance of these findings in various mainstream and medical media.
For instance, in a Medscape article, an academic psychiatrist opined in support of the findings: “The black box warning might have prevented the medically appropriate use of antidepressants in teenagers. I am very happy to see this study come out.” The same psychiatrist added in the Lancet that the FDA warning “may have done more harm than good” (look here).
Making sure to weigh in on the Black Box in one of their 2012 papers, Gibbons and team wrote that “these findings should also favor reconsideration of the risk-benefit equation that led to the black box warning for suicidal thinking and antidepressants in children.”
Omitting The Bad News
Yet the two papers analyzed only a small slice of the relevant data. In fact, the youth data came from only four studies, all of which used Prozac. But there are at least 15 placebo-controlled studies of eight antidepressants for the treatment of depression in youth (see this).
Gibbons et al. claimed to use data from “all sponsor-conducted randomized controlled trials of fluoxetine (Prozac) and venlafaxine (Effexor).” However, an inspection of the abstracts of their two 2012 papers reveals that no data were included from Effexor studies in youth. Yet a controlled study of Effexor in youth found no efficacy and an increased risk of suicidality (look here). Though this post is focused on children and adolescents, it’s worth mentioning that Gibbons et al. also did not include data from a geriatric study that showed no efficacy for either Prozac or Effexor (read this). Omitting negative studies leads to an overly rosy picture of antidepressant efficacy.
In an analysis that combines data across several studies, researchers should provide a list of included studies for the sake of transparency; anyone with a critical eye would want to know where the data came from. Yet the papers’ list of included studies does not cite studies one can find in the published, peer-reviewed literature. Rather, proprietary names such as LYAQ and 016 are used. Good luck tracking down data from these studies; a few colleagues and I spent hours searching online and could not find relevant information for most of these trials.
I eventually learned more about Prozac youth study LYAQ (here it is). Turns out that 1) 99 percent of participants had an ADHD diagnosis, 2) less than half of participants had a diagnosis of major depressive disorder, and 3) the duration of treatment with Prozac vs. placebo was about three weeks – only half the six-week minimum stipulated by Gibbons. None of this essential information is provided in the papers. A small bit of inappropriate data might not impact the overall findings, but LYAQ included 127 kids taking Prozac – 32 percent of the 393 youth across all four studies who took the drug. How can we trust the conclusions when data from nearly a third of Prozac participants should have been excluded?
Also excluded was a small adolescent trial of Prozac which appears to have found minimal benefit for the drug over placebo. This exclusion was more understandable since depression was measured differently than in other studies, but it still biases their findings. Moreover, there was no mention that the studies included found the lowest placebo response rate among antidepressant studies in youth, making Prozac shine all the brighter (see here). While the placebo response rate was low, it is higher than the 5.7 percent rate claimed by Gibbons et al. My colleagues and I carefully examined these studies and found nothing suggesting a single-digit response rate.
In sum, only four antidepressant trials in kids were included, all of which used Prozac – one of which (LYAQ) was obviously inappropriate – while a Prozac study that showed little benefit was excluded. Further, negative data regarding Effexor were not included. Data from other antidepressant trials in kids, the vast majority of which found little to no benefit, were also not included (look here). To top it off, the studies selected had the lowest placebo response rate, and then, somehow, the authors calculated an even lower placebo response rate than described in the previously published versions of the studies. No wonder there were positive results.
Gibbons et al. defined suicidality as a score on a single item from a depression rating scale rather than actual suicide-related events (e.g., cutting wrists, telling study personnel of serious intent to kill oneself). The accuracy of these items in detecting actual suicidality is unproven. Prior research (read here) using actual reports of suicide-related events in all antidepressant trials (not just Prozac) led to the FDA Black Box warning. Yet Gibbons et al. decided that rating scale items not intended as in-depth measures of suicide risk in studies not designed to assess suicidality made for a reasonable assessment of what they call “suicide risk.” They also collected information on suicide-related adverse events reported to study personnel but did not report how the rate of these events compared for drug versus placebo. Two senior FDA officials have said that relying on rating scales to assess suicidality is ineffective. More details can be seen here.
Response To Critics
Along with three colleagues (Drs. Jon Jureidini, David Healy and Rob Purssey), I wrote a letter to the editor expressing the above concerns along with a few more. Our letter, and another critical letter from Dr. Bernard Carroll, appeared in the January 2013 issue of JAMA Psychiatry along with Gibbons and team’s responses. Suffice to say that the concerns raised above were not addressed sufficiently. Their response was a non-response.
Below are just a couple of the many points we made and the counterarguments provided:
1. Given that less than half of participants had a diagnosis of major depressive disorder, LYAQ should not have been included in their analysis.
Counterpoint from Gibbons et al.: 81 percent of participants had a depressive disorder.
Rejoinder: Actually, 45.7 percent of participants had a diagnosis of major depressive disorder. The 2012 papers clearly state that they were an examination of antidepressant trials in “major depressive disorder.” The authors must also be including participants with diagnoses such as dysthymia and mood disorder not otherwise specified – these are not the same thing as “major depressive disorder.”
2. Effexor has no evidence of efficacy among depressed youth. Failing to include Effexor data paints an unduly optimistic picture of antidepressant efficacy among youth.
Counterpoint: Spielmans and colleagues cite an effect size of 0.14 for youth venlafaxine studies (minimal and non-statistically significant), based on a meta-analysis of endpoints without the benefit of complete longitudinal data as we used.
Rejoinder: The authors claim that their method of analyzing data is better than what has been done in the past. But there is no reason to suspect that by re-analyzing existing data, Effexor (or other antidepressants such as Paxil) would somehow become resoundingly effective for depression in children/adolescents.
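To put that 0.14 figure in context, here is a quick back-of-the-envelope sketch (my own illustration, not a calculation from any of the papers) of what a standardized mean difference of 0.14 implies, assuming normally distributed outcomes:

```python
from math import erf, sqrt

def normal_cdf(x):
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

d = 0.14  # effect size cited for the youth venlafaxine (Effexor) studies

# Cohen's U3: the proportion of the placebo group that the average
# drug-treated patient outscores, under normality assumptions
u3 = normal_cdf(d)

# Probability of superiority: the chance that a randomly chosen
# drug-treated patient improves more than a randomly chosen
# placebo patient
ps = normal_cdf(d / sqrt(2.0))

print(f"U3 = {u3:.1%}, probability of superiority = {ps:.1%}")
```

In other words, under these assumptions the average Effexor-treated youth would score better than only about 56 percent of placebo-treated youth – barely above the 50 percent expected by chance, which is why an effect size of 0.14 is considered minimal.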
A rundown of a) all the points we made along with b) the counterpoints from Gibbons et al and c) a response to their responses is available here.
Back to the Black Box
Gibbons has been discussing the evils of the Black Box warning for years (see The Los Angeles Times and WebMD, for instance) and he’s certainly entitled to his opinion. Yet in their response to our letter, Gibbons and team acknowledged their findings regarding Prozac do not necessarily generalize to other antidepressants; in other words, they are not commenting on the Black Box warning. But in one of the 2012 papers they wrote: “These findings should also favor reconsideration of the risk-benefit equation that led to the black box warning for suicidal thinking and antidepressants in children.” There is no logical way to reconcile those two statements.
My team, Carroll, and Dr. Mickey Nardo all sent letters to the editor regarding the Gibbons papers. All letters cleared the peer review process in April 2012. Despite no concerns being raised by reviewers, the editor offered to publish our letters in a place best described as Scientific Purgatory. The letters would have been published in an online repository hidden behind the journal’s paywall and not indexed in any scientific database such as Medline. In other words, they would not be part of the official scientific record.
Science is meant to be self-correcting; flaws identified by readers should be noted in the official scientific record, not swept under a rug. None of us accepted the editor’s offer, and he did not change his decision. Nardo posted his letter to the journal on his own blog (see here).
However, two additional research teams (Sonia Swanson et al. and Sparks &amp; Duncan) accepted the offer, with their comments appearing online in June and July 2012. One concern noted a trend toward increased suicide rating scale scores for kids taking antidepressants vs. placebo. Gibbons and team responded in October 2012, without a good explanation of how a trend toward increased suicide rating scale scores in youth reconciles with their claim of no increased risk. A back and forth between Swanson’s team and Gibbons appeared in Medscape (story here).
Imagine my surprise when I received a voice mail in October 2012 indicating that my letter to the editor was being prepared for print publication after all. Dr. Carroll also was pleasantly surprised to receive a curt e-mail, though it offered no explanation for the editor’s reversal (you can read the letters from Spielmans et al. and Carroll and the replies from Gibbons et al. here and here). While we commend the journal for eventually publishing our critiques, this episode speaks to arbitrary and capricious editorial decision making. To this day, the additional critiques offered by Swanson et al. and Sparks &amp; Duncan still reside in Scientific Purgatory.
There are lessons to learn from this episode. Major scientific and medical journals like JAMA Psychiatry have a bully pulpit in policy debates like the Black Box warning on antidepressants. When these journals publish such articles, they have a clear responsibility to correct any errors on the record. Peer review should not end when an article is accepted for publication (read this).
This is more than an academic debate. Thousands of psychiatrists and other prescribers have read the 2012 papers. Relying on the reputation of the journal, many will have been convinced that the FDA Black Box warning was based on thin data whereas the authors had the real answers. After all, they used “complete longitudinal person-level data from a large set of published and unpublished studies.” Sounds impressive until you examine what actually happened. This is the type of study that can misinform physicians, patients and policymakers alike.
Let’s conclude with a simple thought experiment: Suppose a paper with similarly obvious issues reached opposite conclusions (antidepressants are ineffective and linked to increased suicidality risk in youth). Would JAMA Psychiatry have published the paper and/or tried to stifle its critics? Just wondering.