
How much weight should health news give to statistically nonsignificant results?


Photo: Justin Sullivan/Getty Images

The big story on our health news radar yesterday was about a gene test that identifies women with early-stage breast cancer who may be able to safely skip chemotherapy. In a large randomized study published in the New England Journal of Medicine, women who skipped chemotherapy based on results from the MammaPrint genomic test had a low rate of recurrence, and their survival rate at 5 years was about the same as that of women who did receive chemotherapy.

“About the same” leaves plenty of wiggle room, however, and it’s interesting to see how different news outlets chose to frame the small difference between the groups.

The study itself reported that patients who received chemotherapy had a 5-year survival rate without metastatic disease of 95.9%, compared with 94.4% for the no-chemotherapy group. "The absolute difference in this survival rate between [no-chemotherapy] patients and those who received chemotherapy was 1.5 percentage points, with the rate being lower without chemotherapy," according to the study abstract.

Interestingly, there's no mention in the abstract of the fact that this result was not statistically significant, nor is there any presentation of confidence intervals or p-values that would allow savvy readers to ascertain the lack of significance. One has to wade far into the discussion section to find this statement: "The study was not powered to assess the statistical significance of these differences."

If the study wasn’t powered to assess the statistical significance of these differences, should the researchers have called out those differences in the abstract without any qualification? Especially when that difference is small enough, potentially, to represent little more than statistical noise?

I’m asking an earnest question — not a rhetorical one. The concept of statistical significance is widely misunderstood even by experts, such that the American Statistical Association issued a statement earlier this year clarifying the term and one of its main measures — the p-value. And the effort to develop that statement was apparently quite acrimonious.

I don’t pretend to be expert enough to understand how much weight should be given to this result by women making life and death decisions about their health care. But I do know that among smart, evidence-based health care professionals, highlighting a non-significant result in a study abstract — without at least qualifying the result as non-significant — can sometimes be viewed as misleading. And as a journalist, I’m certain that the emphasis placed on this result in the abstract likely led to its receiving more coverage in news stories about the study than otherwise would have been the case.

I wonder if that emphasis was appropriate and how best to capture the complexities involved.

All four of the stories about this study that I looked at called attention to the uncertainty surrounding this figure, but with varying degrees of emphasis.

The Washington Post most directly calls out the issue by paraphrasing comments from one of the study co-authors: “She said the 1.5 percentage-point difference in survival rates between women who got chemo and the ones who didn’t was not statistically significant, especially considering the side effects of chemo, which can include fatigue, cognitive impairment and a prolonged disruption in schedule.”

But the Post also rebuts this statement with comments from a Duke oncologist who suggests that the result, while not statistically significant, might well be “significant” to the patient. “But Paul Kelly Marcom, a breast cancer oncologist at Duke Cancer Institute, said that whether that survival-rate difference is significant is a personal decision by a woman and her doctor.”

The New York Times addresses the issue more gently, noting that editorialists “said that the study was not large enough to be sure that the 1.5-percentage-point difference would hold up statistically.”

Ditto for NPR, which suggested that the 1.5% figure might not be quite as precise as it sounds. It quotes an American Society of Clinical Oncology official who says, "It's possible that the benefit is zero, and it's possible that's 2 percent or maybe even a little more, you can't be sure."

STAT, meanwhile, sidesteps any direct mention of statistical significance in its coverage. It quoted an oncologist who claimed there "was, indeed, 'a small advantage from chemotherapy'" even when the genetic test suggested low risk and therefore no need for chemo. It then hedges on that statement a bit with a quote from the accompanying editorial, but it never explains outright that the result was not significant: "A difference of 1.5 percentage points, if real, might mean more to one patient than to another," oncologists Dr. Clifford Hudis and Dr. Maura Dickler of Memorial Sloan Kettering Cancer Center wrote in an editorial accompanying the study in the New England Journal of Medicine. A benefit of "only" 1.5 percentage points is something "that clinicians and patients might find meaningful."

Here’s what the experts I contacted had to say about this result and how it was covered.

Susan Wei, PhD, an assistant professor in the Division of Biostatistics at the University of Minnesota, said she was disappointed that none of the news outlets discussed the magnitude of the p-value for the 1.5% difference. "We know it's insignificant, but the more important question is 'how insignificant?'" she said.

We tend to think of statistical analysis as giving results that are either significant or insignificant. We often forget that there is a continuum that we, rather arbitrarily, choose to dichotomize into a binary category. If the p-value associated with the 1.5% effect was, say, 0.08, the NEJM protocol would deem this insignificant. But in reality, the difference between a 0.08 p-value and a "significant" 0.05 p-value is really a matter of personal taste. However, in this case, the reality is that the p-value associated with the 1.5% effect is 0.27. That is pretty highly insignificant.
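The p-value Wei describes comes from comparing the two groups' survival proportions. A minimal sketch of how such a p-value is computed, using a standard two-proportion z-test: note that the group sizes below (1,000 per arm) are hypothetical placeholders, not the study's actual enrollment, so this does not reproduce the study's p-value of 0.27. The point is only that the p-value is a continuous quantity, not a yes/no verdict.

```python
import math

def two_proportion_p_value(p1, n1, p2, n2):
    """Two-sided p-value for a two-proportion z-test with a pooled standard error."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = abs(p1 - p2) / se
    # Standard normal tail probability via the error function.
    return 2 * (1 - 0.5 * (1 + math.erf(z / math.sqrt(2))))

# Survival rates from the study (95.9% vs. 94.4%); the group sizes are
# hypothetical, chosen only to illustrate the calculation.
p = two_proportion_p_value(0.959, 1000, 0.944, 1000)
print(round(p, 3))  # lands somewhere above the 0.05 cutoff
```

With these made-up sample sizes the same 1.5-point difference yields a p-value around 0.12, while the study reported 0.27: the p-value moves continuously with sample size, which is exactly why treating 0.05 as a bright line can mislead.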

Rebecca Goldin, PhD, a professor of mathematical sciences at George Mason University and the director of STATS.org, agreed that it was unusual for NEJM to highlight a nonsignificant result in the study abstract without explicitly noting the lack of statistical significance. She also thought the emphasis on this result likely caused journalists to take a greater interest in it than they otherwise would have. But she wouldn’t go as far as to say that journalists should have ignored the result because it wasn’t statistically significant. And she thought that the outlets overall did a reasonably good job of describing the uncertainty around the finding.

Although the data suggest it's unlikely that there's a large difference in survival between the groups, it's possible that a larger sample would provide evidence as to whether the difference is real. We can't say either way based on this data. And so if I were a clinician advising a patient about this, I'd consider the findings relevant to someone who was very aggressively in favor of chemotherapy in order to maximize their chances of survival. I wouldn't have a basis to say, 'No, you're wrong to do this' if they insisted on chemotherapy. Then again, for what's probably the larger group of patients who might be concerned about the adverse effects of chemo, it's actually pretty neat — the findings suggest that there's no reason to recommend chemo because there's no solid evidence pointing to a survival difference. I think the minor difference in survival rates was the point of the research, and I do think several news outlets got that.

Deanna Attai, MD, assistant clinical professor of surgery at the David Geffen School of Medicine at the University of California Los Angeles and past president of the American Society of Breast Surgeons, called the news coverage overall reasonable. "I think STAT did the best job with the explanations and breaking out the percentages, followed by NYT. I wouldn't categorize any of them as misleading or poor coverage. Very different than some of the local news clips that I glanced at." She added:

I don’t think it’s misleading to focus on the 1.5% benefit – it does bring home the point that the benefit from chemotherapy is very small.

I don’t think it was really stressed that the 1.5% was survival without metastatic disease. Overall survival was 1.4% better. However, disease free survival (which includes local recurrence) was 2.8% better in the treated group. We generally think of chemotherapy as impacting the likelihood of metastatic disease, but it certainly plays a role in reducing local recurrence rates as well. Is an ~3% disease free survival difference significant for a patient? The answer will be different for each patient. We don’t really have more than that to guide us as we can’t say whether or not any of these numbers are statistically significant.

Regarding 5 year follow up – yes, longer study is needed. Many of the tumors were higher grade and some were node positive, and you would expect a relatively early recurrence. However, for the most part, these were ER/PR+, Her2- tumors that may have late relapse. I am hopeful that the researchers will report on follow up before 10 years, especially if the survival and the disease free survival rates in the 2 groups separate more.

I think take home points for patients include the following:

We have the ability now to obtain more detailed information about an individual patient’s tumor biology, which can help guide treatment decisions. Our traditional methods of assigning risk such as tumor size, grade, and node status, do not always tell the whole story.

It is very clear that not all patients benefit from chemotherapy. Our old habit of recommending chemotherapy “just to be sure” is not appropriate. The use of genomic tests in appropriate patients is very important to avoid the real dangers of over treatment.

A 1.5% benefit from chemotherapy is very low. But as the study was not powered to determine if this was statistically significant or not, it is up to the patient and physician to determine what is significant to them. Some patients will opt for chemotherapy knowing there is a 1.5% increased likelihood of survival without metastatic disease. Some patients will decline chemotherapy as they might feel the potential benefit is too low considering the side effects, both short and long term.

Studies like these often raise more questions and prompt discussion. It is important to note that we do not yet have the "crystal ball" that will tell an individual patient with absolute certainty that their cancer will recur or not. We are closer than we've ever been, but we do not have a test that will predict the outcome with 100% certainty.

The test results should be presented as part of a discussion which includes the patient’s preferences, values and concerns – concerns regarding cancer recurrence as well as potential side effects of therapy.

Mandy Stahre, PhD, an epidemiologist at the Washington State Department of Health as well as a breast cancer survivor who has served as a consumer reviewer for the Department of Defense Breast Cancer Research Program, thought that NPR did the best job trying to describe the results because it discussed the issue related to the type of medical tools being used. She pointed to this passage from the NPR piece:

“The genomic test, as precise as it is, offers only probabilities, not absolute guidance. And that’s a lesson that applies to the whole new realm of precision medicine, which is billed as potentially transformative for medical care.”

She added:

People get really excited when a new test comes on the market, but many people don't understand that new tests don't always provide a smoking gun. In this case, the new test gives some probabilities. For many people, it's difficult to put those probabilities into context when you are only thinking about yourself and what you should do. I remember being faced with the same decision of whether or not to do chemotherapy. I had to make the decision myself based on all the evidence I had at the time. I was given several options from various oncologists, almost always presented as probabilities — "If you do chemotherapy it will cut your recurrence rate by 50%." My question was always "50% of what?" for which I usually received various percents. After several weeks of searching the literature and reading different guides, I chose not to go through with chemotherapy (a decision, by the way, that my oncologist agreed with). My reasoning was that any potential benefits were outweighed by the risks associated with chemotherapy. I'm not sure if I had that new test it would have made my decision any easier. I still would have wanted to do my own search and then take all the information together before forming an opinion.

Readers are welcome to weigh in with a comment or to email us with additional feedback.

This article was originally posted on HealthNewsReview.org and is republished here with permission. Kevin Lomangino is HealthNewsReview.org's managing editor. He tweets at @Klomangino.

Comments


It seems to me that the hypothesis in this study is not our typical set-up, where the null hypothesis is that there is no difference between the survival rate of the chemo group and the no-chemo group and we want to reject the null hypothesis to show that there is indeed a significant difference in survival rates. Rather, this seems more similar to a non-inferiority trial, where we want to show that a new treatment (for instance, a less expensive drug) does not have a much lower benefit than an existing treatment. If the new drug is "not inferior" to the existing drug, we would want it to be on the market to help bring down the total cost of treatment. Similarly here, we might want to avoid chemotherapy (for many reasons such as side effects, cost, etc.) if not using chemotherapy is "not inferior" to using chemotherapy. This is a fairly well established statistical set-up, but it is often assessed with confidence intervals rather than p-values. It appears from the NEJM article that a non-inferiority boundary was indeed used. So, perhaps, part of the issue is a lack of understanding of this type of statistical set-up.
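The non-inferiority logic this commenter describes can be sketched numerically: build a confidence interval for the survival difference and check whether its upper bound stays below a pre-specified margin. The group sizes (1,000 per arm) and the 5-percentage-point margin below are hypothetical illustrations, not the trial's actual design parameters.

```python
import math

def diff_ci(p1, n1, p2, n2, z=1.96):
    """95% confidence interval for the difference p1 - p2 (unpooled standard error)."""
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return diff - z * se, diff + z * se

# Survival rates from the study; group sizes and the non-inferiority
# margin are hypothetical, chosen only to illustrate the logic.
lo, hi = diff_ci(0.959, 1000, 0.944, 1000)
margin = 0.05
non_inferior = hi < margin  # the worst plausible loss from skipping chemo is below the margin
print((round(lo, 3), round(hi, 3)), non_inferior)
```

Under these assumed numbers the interval spans zero (consistent with "no significant difference") yet its upper bound sits below the margin, so the no-chemotherapy arm would be declared non-inferior — which is how a trial can report a non-zero point estimate and still support skipping treatment.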
