Posted by: Jeremy Fox | February 16, 2012

Must-read paper: how to make ANY statistical test come out “significant”

Just make all the usual judgment calls and run all the usual “exploratory” analyses that scientists perform all the time!

The linked paper is the best paper I’ve read in a long time. It’s essential reading for everyone who does science, from undergraduates on up. It’s about experimental psychology, but it applies just as much to ecology, perhaps even more so. It says something I’ve long believed, but says it far better than I ever could have.

One partial solution to the problems identified in this paper is for all of us to adhere a lot more strictly to the rules of good frequentist statistical practice that we all teach, or should teach, our undergraduates. Rules like “decide the experimental design, sampling procedure, and statistical analyses in advance”, “don’t chuck outliers just because they’re ‘outliers’”, “separate exploratory and confirmatory analyses, for instance by dividing the data set in half”, “correct for multiple comparisons”, etc. Those rules exist for a very good reason: to keep us from fooling ourselves. This is not to say that judgment calls can ever be eliminated from statistics–indeed, another one of my favorite statistical papers makes precisely this point. But those judgments need to be grounded in a strong appreciation of the rules of good practice, so that the investigator can decide when or how to violate the rules without compromising the severity of the statistical test.
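To see why rules like “decide the sample size in advance” matter, here is a minimal simulation of my own devising (it is a sketch, not an analysis from the linked paper). The data are pure noise, with no true effect at all, yet allowing yourself one common researcher degree of freedom, peeking at the result and “running a few more subjects” if it isn’t significant, pushes a nominal 5% test noticeably above 5%:

```python
# Monte Carlo sketch (illustrative, not from the linked paper): how one
# common "researcher degree of freedom" -- peeking at the data and adding
# more subjects if the first test isn't significant -- inflates the false
# positive rate of a nominal 5% test. The data are pure noise (true effect = 0).
import math
import random

random.seed(1)

def z_significant(sample):
    """Two-sided z-test of mean 0 with known sigma = 1 at alpha = .05."""
    n = len(sample)
    z = (sum(sample) / n) * math.sqrt(n)
    return abs(z) > 1.96

def one_study(peek):
    # Collect 20 "subjects" of pure noise and test.
    data = [random.gauss(0, 1) for _ in range(20)]
    if z_significant(data):
        return True
    if peek:  # optional stopping: "just run 10 more subjects and test again"
        data += [random.gauss(0, 1) for _ in range(10)]
        return z_significant(data)
    return False

trials = 20_000
honest = sum(one_study(peek=False) for _ in range(trials)) / trials
peeking = sum(one_study(peek=True) for _ in range(trials)) / trials
print(f"fixed-n false positive rate:      {honest:.3f}")   # close to 0.05
print(f"peek-and-add false positive rate: {peeking:.3f}")  # noticeably above 0.05
```

The sample sizes and the single peek here are arbitrary choices; adding more peeks, more outcome variables to choose among, or post hoc covariates inflates the rate further, which is precisely the linked paper’s point.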

Basically, what I’m suggesting is that, collectively, our standards about when it’s ok to violate the statistical “rules” may well be far too lax. Of course, if they were less lax, doing science would get a lot harder. Or rather, it would seem to get a lot harder. In fact, doing science that leads to correct, replicable conclusions would remain just as hard as it always has been. It would only seem to get harder because we’d stop taking the easy path of cutting statistical corners. And then justifying the corner cutting by making excuses to ourselves about the messiness of the real world and the impracticality of idealized textbook statistical practice.

The linked paper discusses another solution: to report all judgment calls and exploratory analyses, so that reviewers can evaluate their effects on the conclusions. Sounds like a great idea to me. They also note, correctly, that simply doing Bayesian stats is no solution at all. The paper is emphatically not a demonstration of inherent flaws in frequentist statistics.

Further commentary from Andrew Gelman here.



  1. In trying to escape from a false positive, you can go straight to a false negative. One of my favorite articles about statistics from Oikos:

    Escaping the Bonferroni iron claw in ecological studies

  2. Thanks for the article! Reminds me of another great article on statistics, but this time focusing on the information theoretic approach. A must read for any ecologist! I particularly enjoyed the discussion of multiple competing models as a fundamental tenet of adaptive management.

    Hobbs, N.T. & Hilborn, R. 2006. Alternatives to statistical hypothesis testing in ecology: A guide to self teaching. Ecological Applications 16:5–19.

  3. Very important paper. Most researchers have the moral constitution not to falsify data, but “massaging” the analysis has not really been seen in as harsh a light. It’s a shame that “p<0.05” has become equated with “important, well-executed science”, which tempts people (especially those who have yet to land a job or tenure) to play around with analyses until they get something that looks exciting.
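On the Bonferroni “iron claw” raised in the first comment: the trade-off is easy to put numbers on. The sketch below (my own illustrative numbers, not taken from any of the papers above) shows how the power of each individual test collapses as the Bonferroni-corrected threshold tightens with the number of comparisons, which is exactly the escape from false positives into false negatives:

```python
# Numeric sketch of the Bonferroni trade-off: testing each of m hypotheses
# at alpha/m controls the family-wise false positive rate, but for a modest
# real effect the power of each test (1 - false negative rate) collapses as
# m grows. The effect size delta is an illustrative choice.
import math

def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2)))

def z_crit(alpha):
    """Two-sided critical value, found by bisection on the normal CDF."""
    lo, hi = 0.0, 10.0
    for _ in range(100):
        mid = (lo + hi) / 2
        if 2 * (1 - norm_cdf(mid)) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

delta = 2.8  # shift of the test statistic under a real effect: ~80% power at alpha = .05
for m in (1, 10, 100):
    c = z_crit(0.05 / m)
    power = 1 - norm_cdf(c - delta) + norm_cdf(-c - delta)
    print(f"m = {m:3d}: per-test alpha = {0.05 / m:.4f}, power = {power:.2f}")
```

With these illustrative numbers, power falls from roughly 80% for a single test to roughly 25% when 100 comparisons are Bonferroni-corrected, which is the false-negative cost the Oikos article is complaining about.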
