Posted by: Jeremy Fox | October 11, 2011

Frequentist vs. Bayesian statistics: resources to help you choose (UPDATED)

There are two dominant approaches to statistics. Here, I explain why you need to choose one or the other, and link to resources to help you make your choice.

Most ecologists use the frequentist approach. This approach focuses on P(D|H), the probability of the data, given the hypothesis. That is, this approach treats data as random (if you repeated the study, the data might come out differently), and hypotheses as fixed (the hypothesis is either true or false, and so has a probability of either 1 or 0; you just don’t know for sure which it is). This approach is called frequentist because it’s concerned with the frequency with which one expects to observe the data, given some hypothesis about the world. The P values you see in the “Results” sections of most empirical ecology papers are, loosely speaking, values of P(D|H), where H is usually some “null” hypothesis (strictly, a P value is the probability of data at least as extreme as those observed, given H).

Bayesian statistical approaches are increasingly common in ecology. Bayesian statistics focuses on P(H|D), the probability of the hypothesis, given the data. That is, this approach treats the data as fixed (these are the only data you have) and hypotheses as random (the hypothesis might be true or false, with some probability between 0 and 1). This approach is called Bayesian because you need to use Bayes’ Theorem to calculate P(H|D).
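For reference, the theorem can be written in the notation used above. The symbols P(H) (the prior probability of the hypothesis) and P(D) (the overall probability of the data) do not appear elsewhere in this post and are introduced here only for illustration:

```latex
P(H \mid D) = \frac{P(D \mid H)\, P(H)}{P(D)}
```

The frequentist quantity P(D|H) appears on the right-hand side, but turning it into P(H|D) requires a prior P(H), and that is where the interpretive disagreements discussed below come in.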

At a broad-brush verbal level, both these approaches sound eminently reasonable, to the point that differences between them sound subtle to the point of unimportance. A frequentist basically says, “The world is a certain way, but I don’t know how it is. Further, I can’t necessarily tell how the world is just by collecting data, because data are always finite and noisy. So I’ll use statistics to line up the alternative possibilities, and see which ones the data more or less rule out.” A Bayesian basically says, “I don’t know how the world is. All I have to go on is finite data. So I’ll use statistics to infer something from those data about how probable different possible states of the world are.” And indeed, there are contexts in which Bayesian and frequentist statistics easily coexist.

But there are many contexts in which they don’t; frequentist and Bayesian approaches represent deeply conflicting approaches with deeply conflicting goals. Perhaps the deepest and most important conflict has to do with alternative interpretations of what “probability” means. These alternative interpretations arise because it often doesn’t make sense to talk about possible states of the world. For instance, there’s either life on Mars, or there’s not. We don’t know for sure which it is, but we do know for sure that it’s one or the other. So if you insist on trying to put a number on the probability of life on Mars (i.e. the probability that the hypothesis “There is life on Mars” is true), you are forced to drop the frequentist interpretation of probability. A frequentist interprets the word “probability” as meaning “the frequency with which something would happen, in a lengthy series of trials”. The most common alternative interpretation of “probability” (though not the only one) is as “subjective degree of belief”: the probability that you (personally) attach to a hypothesis is a measure of how strongly you (personally) believe that hypothesis. So a frequentist would never say “There’s probably not life on Mars”, unless she was just speaking loosely and using that phrase as shorthand for “The data are inconsistent with the hypothesis of life on Mars”. But the most common sort of Bayesian would say “There’s probably not life on Mars”, not as a loose way of speaking about Mars, but as a literal and precise way of speaking about his beliefs about Mars. A lot of the choice between frequentist and Bayesian statistics comes down to whether you think science should comprise statements about the world, or statements about our beliefs.

I’m a frequentist. But lots of very smart people aren’t. This post isn’t an argument for or against either philosophy. It’s just to alert you that this philosophical conflict exists, that it is very deep, and that you, as a working scientist, need to be familiar with it in order to make an informed choice of statistical approach. One thing frequentists and Bayesians agree on is that it’s a bad idea to do “cookbook statistics”, where you just mindlessly choose and follow some statistical “recipe” without worrying about why the recipe works–or even about what it’s trying to cook! I agree with Ellison and Dennis (2010) that ecologists should be “statistically fluent”, although I disagree with them that taking calculus-based technical courses in statistics is the only way to achieve fluency. Note that “fluency” is not at all the same thing as “technical proficiency”. If anything, I think one unfortunate side effect of the increasing popularity of technically-sophisticated, computationally-intensive statistical approaches in ecology has been to make ecologists even more reluctant to engage with philosophical issues–i.e. less fluent, or else less likely to care about fluency. It seems like there’s a “shut up and calculate the numbers” ethos developing, as if technical proficiency with programming could substitute for thinking about what the numbers mean. Lee Smolin noted a similar trend in fundamental physics.

Unfortunately, even advanced stats textbooks aimed at ecologists mostly don’t bother with more than the most cursory philosophical remarks. For instance, Clark (2007) spends only two pages on philosophy of statistics. And he uses those two pages to argue for the irrelevance of statistical philosophy to the real world scientist, because longstanding philosophical debates show no sign of definitive resolution! As I’ve noted elsewhere, this is a terrible argument for “pragmatism”, analogous to arguing that debates between liberal and conservative political philosophies are longstanding, and therefore irrelevant to the real world voter. Bolker (2008) is an admirable exception to this general reluctance of ecological statistics textbooks to grapple with conceptual issues.

So below is some food for thought, a compilation of some interesting and provocative writings I’ve found really helpful in developing my own philosophy of statistics. I encourage you to dip into them.

Note that most of the items I’ve listed assume some basic familiarity with different statistical philosophies, beyond the very brief sketch I gave above. Unfortunately, I have yet to find a really good, freely available, non-technical introduction to alternative philosophies of statistics, pitched at a level suitable for any professional ecologist or grad student. The discussion in Bolker (2008) is the sort of thing I’m thinking of, but it’s part of a book that costs money. Anyone know of anything good?

Books:

Error and the Growth of Experimental Knowledge by Deborah Mayo. Great defense of frequentist statistics as part of a broader philosophy of science, and a great compilation and debunking of the (often jaw-dropping) criticisms of frequentist statistics by Bayesians. I suspect a lot of scientists who consider themselves Bayesians, or who use Bayesian methods without really worrying about the philosophy most commonly used to justify those methods, may be rather shocked to discover just what sort of philosophy they’ve gotten into bed with. Even if you don’t buy Mayo’s argument for frequentist statistics and against Bayesianism, you ought to engage with her broader argument that one’s choice of statistical philosophy should be dictated by one’s philosophy of science.

The Nature of Scientific Evidence, Mark Taper & Subhash Lele (eds). Great series of chapters (many followed by critiques and rejoinders) covering a range of issues to do with philosophy of statistics and learning from evidence more generally. The author list is an all-star collection of ecologists, statisticians, and philosophers.

Articles:

Population ecologist Brian Dennis’ (1996) polemic on why ecologists shouldn’t be Bayesians is essential reading, and a lot of fun. Think philosophy of statistics is abstruse, technical, or dry? Read this, then think again. Think choice of statistical philosophy has no real-world consequences? Read this, then think again.

My fellow Canadian Subhash Lele has done a lot of very original work in statistical methods, much of it showing how to use frequentist methods to do things that frequentists supposedly can’t do, such as fit complex hierarchical models or incorporate prior knowledge and expert opinion.

Statistician Brad Efron invented bootstrapping. So, he’s a smart guy. So you should probably be interested in what he has to say about things like the challenges modern-day “Big Data” raises for both Bayesian and frequentist approaches, and about some points of commonality between frequentists and Bayesians. He’s one of those people whose off-the-cuff remarks are more incisive and interesting than most people’s most rigorous efforts.

Blogs:

Statistical Modeling, Causal Inference, and Social Science. Statistician and social scientist Andrew Gelman’s blog. He’s an entertaining, breezy writer, and he writes about all kinds of stuff, from highly technical statistical problems sent to him by readers, to plagiarism in science, to US politics, to inferring causality from non-experimental data, to philosophy of statistics. He’s an unusually thoughtful pragmatic “Bayesian” (I put that in quotes because he’s so unorthodox and frequentist in his Bayesianism that one can probably question whether he’s really Bayesian at all). If you insist on being a pragmatist, then you at least ought to be the sort of pragmatist Andrew Gelman is. Be a pragmatist because you have a deep understanding of and appreciation for alternative philosophies, not out of ignorance of alternative philosophies, or because you think philosophy doesn’t matter. For a sampling of his voluminous work on the philosophy of Bayesian statistics, see here, here, here, here, here, here, here, and here (yes, that’s only a sampling!)

Error Statistics Philosophy. Philosopher Deborah Mayo (see above) just started a blog! Unfortunately, it is the ugliest blog in the world (P<0.001), but don’t let that stop you. Mayo pulls no punches, and she grounds her philosophical discussions in practical, real-world examples. Her blog has links to many of her articles, and it helps to read one or two of them before diving into her blog, as she tends to assume more familiarity with the issues than does Gelman.

UPDATE: Deborah Mayo herself has added some comments, in particular on the importance of seeing frequentism as an approach to “error statistics” (briefly, the view that it’s the job of statistics to help us root out, and rule out, sources of error).


Responses

  1. Thanks for the links and thoughts. Bolker’s website for his book (http://www.math.mcmaster.ca/~bolker/emdbook/index.html) has some good free resources that I have been reading.

    I like your distinction between fluency and proficiency. Few ecologists will be better than some statisticians at what they do, but we need to be able to communicate to make advances. Luckily, more and more ecologists are creating courses on these topics (http://emdbolker.wikidot.com/emd-courses)!

    • Thanks Aaron!

      Re: the proliferation of advanced stats courses for ecologists and “conceptual or philosophical fluency” vs. “technical proficiency”, I think much depends on exactly what’s taught in those courses. Frankly, I suspect most of these courses focus on technical proficiency. For instance, some of the courses you linked to are taught by Jim Clark. If Jim’s courses are anything like Jim’s textbook, they won’t bother with the conceptual issues I’ve raised (not that Jim’s uninterested in conceptual issues, but his pet issues are things like model dimensionality; my pet issues are more fundamental).

      To be clear, my post is not meant to be an argument against the value of technical proficiency. I just don’t want to see technical proficiency implicitly or explicitly treated as a substitute for conceptual fluency. All of the conceptual issues between frequentists and Bayesians can be stated very non-technically.

      Deborah Mayo has a new edited volume that just came out (my copy is on order), comprising a series of debates or exchanges on these issues. I imagine that this book could be used as the basis of an interesting graduate seminar course or reading group. Same for the Taper & Lele book.

    • I just discovered this blog for the first time, looking at stats of my own. I’m sorry I didn’t find it earlier. I’ve a lot to react to, starting from the definitions of “frequentist statistics”. I think you’ll find a very different notion in my work and blog. It further argues for a new term like “error statistical”. The post above says: “This approach is called frequentist because it’s concerned with the frequency with which one expects to observe the data, given some hypothesis about the world.” But that’s not the key thing: everyone uses likelihoods. It’s rather that (a) probability is used in inference by quantifying the relative frequency with which methods would detect discrepancies and flaws of various sorts and (b) these error probabilistic properties may (if correctly applied) provide the means to evaluate how stringently or severely different hypotheses have been tested. There’s a lot more I’d want to say…it will have to wait, but thanks! Are there many philosophers involved with your journal?

      • Wow, I’m sincerely flattered that *the* Deborah Mayo found her way here! I read your book Error and the Growth of Experimental Knowledge in grad school at the suggestion of my external examiner, and it really opened my eyes and had a big influence on me, as I’ve (briefly) described in another post.

        This post was intended as the barest of sketches of (a few of) the issues in philosophy of statistics, purely to encourage my readers (many of whom will be unfamiliar with these issues) to click through to the far more thorough sources I linked to. In sketching the frequentist approach in the way I did, I may not have sketched in the best way, and I certainly didn’t mean to imply that frequentists don’t care about likelihoods (I certainly do care about them!). I was mostly trying to highlight the contrast with the personal subjectivism of the leading Bayesian school, and so didn’t get into how the probabilities frequentists care about are best thought of as error probabilities.

        Given that our journal is an ecology journal, I don’t think there are any philosophers involved in it! I certainly wouldn’t call myself a philosopher, although I’ve probably read a bit more widely in philosophy than the average ecologist, and have perhaps had a bit more interaction with proper philosophers. On occasion, I’ve used this blog to think out loud about conceptual issues in ecology which may shade into the philosophical (for instance, see here, here, and here). The Oikos journal has something of a tradition of offering an outlet within ecology for papers addressing these sorts of conceptual issues. But I admit I’m a little scared of what a proper philosopher would think of my modest efforts on these lines, or those of most other ecologists! One thing my limited reading of the philosophical literature has convinced me of is that many ‘philosophical’ papers written by ecologists, including my own blog posts, have serious shortcomings (including, but not limited to, very weak links to the often-extensive philosophical literature on the topic being addressed). But if this blog prompts some ecologists to read and think a bit more philosophically than they otherwise would have, I’ll consider that a success.

      • p.s. Sorry for the snarky aside on the appearance of the Error Statistics Blog. When writing on the Oikos blog, I’m snarky about various things I perhaps shouldn’t be. Esp. since I’ve never actually designed the layout of a blog myself (this blog’s layout is just one of the standardized WordPress options).

  2. I taught “p value” statistics for many years, until I finally decided it was a disservice to my students and to science. As readers of your blog know, any fixed set of data has many different p values, depending on things like why the researcher decided to stop data collection, or how many other groups the researcher intends to eventually compare with, even though those intentions have no influence on the data. Two labs can get identical data but have very different p values and very different conclusions. One person with a single fixed set of data can have different p values on different days of analysis. For me, it’s not a matter of deep philosophy, it’s simply that p values make no sense. To base scientific inference on p values is embarrassing. I have yet to encounter an argument that makes me prefer frequentist to Bayesian data analysis. Bayesian data analysis gives me richer information and with greater flexibility. I’m never going back to “p value” statistics.
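The stopping-rule point in the comment above can be made concrete with a standard textbook illustration. The sketch below is not taken from the comment itself: the data (9 heads and 3 tails from a coin assumed fair under the null hypothesis) and the function names are hypothetical, chosen only to show how the same data yield different one-sided p-values under two different stopping rules.

```python
from math import comb


def p_fixed_n(heads, n):
    """One-sided p-value if the plan was to flip exactly n times:
    P(at least `heads` heads in n fair flips)."""
    return sum(comb(n, k) for k in range(heads, n + 1)) / 2 ** n


def p_stop_at_tails(heads, tails):
    """One-sided p-value if the plan was to flip until `tails` tails appeared:
    P(at least `heads` heads before the final tail).

    That event is the same as seeing at most tails - 1 tails
    in the first heads + tails - 1 flips.
    """
    m = heads + tails - 1
    return sum(comb(m, k) for k in range(tails)) / 2 ** m


# Hypothetical data: 9 heads and 3 tails, the last flip being a tail.
print(round(p_fixed_n(9, 12), 4))       # plan: flip 12 times
print(round(p_stop_at_tails(9, 3), 4))  # plan: flip until the 3rd tail
```

Under these assumptions the first p-value is above 0.05 and the second is below it, even though the observed flips are identical. Whether that dependence on the sampling plan is a defect or a feature is exactly what the commenters in this thread disagree about.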

    • Always good to hear from commenters with whom I *deeply* disagree! (seriously) The point of the post was to raise the issues; the stark contrast between your views and my own dramatically illustrates that there are real issues here that are worth thinking about. And yes, your views do contrast sharply with my own–contrary to your remark about what readers of this blog know, I don’t think I’ve ever given readers of this blog the impression that I think p-values are just arbitrary–at least I hope I haven’t! 😉

      I suspect you’re already familiar with the arguments I might make to defend my frequentist stance, plus that would make for a really long reply, so I won’t attempt an argument for the merits of my position. Suffice to say I’m familiar with the points you briefly raise, and that I disagree with all of them.

      I will note, with all due respect and mainly for the benefit of readers who are new to this debate, that you couch your comments in very strong rhetoric. You state that it’s not a matter of “deep philosophy”, that “p values simply make no sense”, and that inference based on p values is “embarrassing”. Such rhetoric raises the question of how it could be that so many scientists, apparently very smart, competent, well-trained, and well-meaning, faced with strong incentives to use the best methods available, and familiar with arguments for the Bayesian approach, could nevertheless be so stupid as to make such an elementary methodological mistake. Such rhetoric is not unique to you, or to Bayesians, of course–Deborah Mayo and Brian Dennis, to name just two, make a lot of sharply-worded comments about subjective Bayesian statistics, so sharp that one wonders how those who subscribe to subjective Bayesianism could possibly be such morons. I’m on record as finding rough and tumble debate both fun and valuable, but even I find that the rhetoric in the Bayesians vs. frequentists debate is probably rather too strong, rather too much of the time. Overly strong rhetoric makes it very difficult for even well-meaning disputants to distinguish between attacks on their views, and personal attacks, and makes it difficult to appreciate hybrid approaches like Andrew Gelman’s, or points of commonality like those discussed by Brad Efron. Whenever I post a really strong attack on some idea–as in my attack on the “zombie idea” of the IDH–I try to talk about not just how wrong the idea is, but also why it’s so appealing and how it could be that a lot of very smart people would believe it. Hopefully that has the effect of keeping the debate productive, by encouraging those who hold the view I’m attacking to take seriously what I have to say, and signalling to them that I respect them and take seriously what they have to say.
You and I seriously disagree–but I’m not a moron, and neither are you. In fact, we’re both very smart and thoughtful–and not just smart and thoughtful in general, but smart and thoughtful about the philosophy of statistics. That doesn’t mean we can’t seriously disagree, and say so. But no matter how seriously mistaken you think I am about statistics, my mistakes are not embarrassingly elementary. And no matter how seriously mistaken I think you are about statistics, your mistakes are not embarrassingly elementary. We should probably both avoid saying or implying otherwise.

      • The remarks in EGEK concerned subjective Bayesian philosophers, not statisticians. The thing that many people don’t always realize is that the attacks from the subjective Bayesian side (Savage, Lindley on) have been so over-the-top, so unfair, so mean-spirited and relentless and threatening, that it immediately raises the level of responses. Look for example at the criticisms I raise in my blog–they are howlers that no sensible user of the methods would commit. And I guarantee that that is so for the complaints raised above about p-values. What alters error probabilities alters p-values; I’ve written on this for years and have received no earnest response (e.g., to why stopping rules and selection effects matter to a test’s error probabilities). By and large, frequentist statisticians have run the other way when it comes to foundational debates and issues, in order to avoid the kind of cutting challenges of the Savage-style subjective Bayesian.

      • Thanks for your further comments, Deborah, it’s good to have a participant in these intense philosophical debates provide some perspective on why those debates have taken the tone that they have.

        Although as to whether any sensible user of statistical methods would commit to the howlers of subjectivist Bayesian philosophers, a commenter above is a user (an experimental psychologist), and does seem to subscribe to some howlers regarding p-values…

    • @Jeremy, once again a really interesting, useful post (thanks for the links, some great reading material while my computer thinks slow and hard about maths). I have a vague knowledge of the differences, and boning up on Bayesian things is definitely on my list of ‘to do when I get a month to think carefully about stuff’.

      @John, I’d say that pointing to the way some researchers use and abuse p-values (which probably is embarrassing) is a more reasonable way of describing what’s going on. As Bayesian approaches start to become more commonly taught in undergrad ecology/science courses (as commented below), I expect to see similar abuses creeping into the published literature after some lag time – if they’re not already there.

      I’d recommend Anderson’s (2008) Model based inference in the life sciences. Although it doesn’t offer particular insight into the F-B debate, it offers some great advice on statistical philosophy. To paraphrase:

      “Think long and hard about what, how and why you want to test (a) certain hypothesis(es). Now think a bit longer and harder before you start.”

  3. Hello,

    Coming more from a “naturalist” background (vegetation science), working in the field of restoration ecology and on questions of land management in general (in a European/Alpine context, working in Austria), I see my work(flow) as bringing data “home” to have something to test models against (ok, that sounds a little bit naive…). Therefore, the discussion “frequentist vs. bayesian” is for me deeply connected with the specific topic one works with, similar to the also already mentioned “naturalist vs. modeller” approach.
    Not being able to have a clear null hypothesis is not a sign of bad science; it is a sign of much exciting work ahead, of an “undiscovered continent”. To tame the dragons, different approaches are necessary than when there is at least a rough map already available… A little more realistically: I think there is no definite answer as to which one is the one true methodology; it depends on the questions you are working with, which is shown in my opinion very well by this long discussion without any definite result.
    A book which is already a little bit older (1997) but gives good food for thought, and which I think is missing from the list in your post, is “The Ecological Detective: Confronting Models with Data” by Ray Hilborn & Marc Mangel.
    They also confront philosophical approaches with statistical ones in the introductory chapters, showing that our statistical view of the world frames our scientific attitude…

    http://press.princeton.edu/titles/5987.html
    http://books.google.com/books/about/The_ecological_detective.html?id=katmvQDi8PMC

    Albin

    • Yes, I was remiss not to mention Hilborn and Mangel; just slipped my mind.

  4. First, thanks for a nicely written post on stats philosophy.

    I disagree with your very first statement that “you need to choose one or the other,” which I interpret to mean “you need to become a frequentist or a Bayesian.” As you demonstrate later in your post, which approach to take may well depend on the question being asked. I might well take a frequentist approach to analyzing a beautifully crafted null-hypothesis-rejecting experiment, but a Bayesian approach to a messy real-world conservation modeling problem. In fact, my dissertation will likely include both types of statistics, *with* the understanding of the underlying philosophical issues. (I wouldn’t necessarily say I’m fully fluent right now, but I would say I’m conversational.)

    Two other points to make:
    1) I think one major reason fluency is so low is that if one takes a frequentist stats class, there is little to no mention of Bayesian approaches. And if one takes a Bayesian stats class, all one hears about is the superiority of Bayesian stats to frequentist stats. How is the average student supposed to sort any of this out?

    2) While the philosophy of stats is important, I think it’s also important to point out that changes in technology may influence the use of different types of statistics and that this is not bad. Because of the increase in computational power, it is now possible to do some Bayesian statistics that couldn’t have been done 10 or 20 years ago. Ecologists tend to be frequentists because of the legacy of having done frequentist stats in the past. So even if a particular question is better addressed with a Bayesian approach, there is still a tendency in Ecology to try a frequentist approach first, because it feels more familiar. I think we’re going to have this frequentist legacy for a long time to come.

    • Thanks Margaret. Re: the need to choose, you’ve detected a tension in the post that reflects a tension in my own thought. Until recently, I would’ve insisted (like Brian Dennis) that you do need to choose, full stop, and that “pragmatism” was just another word for “unprincipled and ad hoc”. It’s only in the last little while, primarily due to reading Gelman, that I’ve started to wonder if there might be such a thing as pragmatism that is in some sense principled, and that is grounded in “fluency” rather than lack of fluency.

      A related point, which I only touched on briefly in passing, is that, when it makes sense to talk about probabilistic states of the world, you can be a Bayesian while retaining a frequentist interpretation of “probability”. In such cases, I have no problem with Bayesian stats, and indeed don’t see any really fundamental difference between frequentism and Bayesianism in such cases. Mayo gives the hypothetical example of a two-stage game of chance (I may be misremembering the details, but this will give you the gist of it). In the first stage, you guess whether a flipped coin will come up heads or tails. If you guess right, in the second stage you get to guess whether a card pulled at random from a deck will be red or black. If you guess the coin flip wrong, in the second stage you must guess the suit of the card. Now, if I tell you that I played this game, and that I won, and ask you to calculate the probability that I guessed the coin flip correctly, how do you do it? The answer is to use Bayes’ Theorem. The prior probability that I guessed the coin flip correctly is 0.5; this is (by assumption) a perfectly objective, frequentist probability. Given the information that I won the game, you can use Bayes’ Theorem to update that prior and calculate the posterior probability that I guessed the coin flip correctly. This posterior too can be interpreted as a perfectly objective, frequentist probability–it’s the expected proportion of wins in which I guess the coin flip correctly. A lot of applications of Bayesian stats in ecology are actually analogous to this two-stage game, although it’s often not obvious.
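The two-stage game can be worked through numerically. The sketch below rests on assumptions the reply does not spell out: “winning” means guessing the card correctly in the second stage, a blind red/black guess succeeds with probability 1/2, and a blind suit guess succeeds with probability 1/4. All names in the code are hypothetical. It computes the posterior with Bayes’ Theorem and then checks it against the long-run frequency in simulated games, which is the frequentist reading of the same number.

```python
from fractions import Fraction
import random

# Assumed probabilities (illustrative; not stated explicitly in the reply).
P_COIN_RIGHT = Fraction(1, 2)    # prior: coin flip guessed correctly
P_WIN_IF_RIGHT = Fraction(1, 2)  # then guess red vs. black
P_WIN_IF_WRONG = Fraction(1, 4)  # otherwise guess one of four suits


def posterior_coin_right():
    """P(coin guessed right | game won), via Bayes' Theorem."""
    joint_right = P_COIN_RIGHT * P_WIN_IF_RIGHT
    joint_wrong = (1 - P_COIN_RIGHT) * P_WIN_IF_WRONG
    return joint_right / (joint_right + joint_wrong)


def simulate(n_games=200_000, seed=1):
    """Play many games; return the fraction of wins with a correct coin guess."""
    rng = random.Random(seed)
    wins = 0
    wins_with_coin_right = 0
    for _ in range(n_games):
        coin_right = rng.random() < 0.5
        p_card = 0.5 if coin_right else 0.25
        if rng.random() < p_card:
            wins += 1
            wins_with_coin_right += coin_right
    return wins_with_coin_right / wins


print(posterior_coin_right())  # exact posterior as a fraction
print(simulate())              # long-run frequency among simulated wins
```

Under these assumptions the exact posterior is 2/3, and the simulated proportion settles close to it. That agreement is what it means to say the posterior here can be read as an ordinary long-run frequency.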

      I’m sure you’re right that fluency with alternative approaches is low in large part because our stats courses teach one philosophy or the other, often without even noting the existence of the other. That was certainly the case for my stats classes. One way for the student to deal with this is to read my blog post, and then dip into some of the writings I linked to! 😉 That’s what I did–my fluency with philosophy of statistics is self-taught.

      Yes, changes in technology do drive changes in statistics, and no, that’s not necessarily bad. And changes in technology don’t necessarily favor Bayesian stats. Computers don’t just let us do (say) MCMC to fit Bayesian models, they also let us do bootstrapping, which is frequentist.

  5. Very nice post in all respects, including the comments and your replies. Thanks for taking the time to write this.

    Was it Dennis who said something like “Bayesianism means never having to say you were wrong”? I think I read that in Bolker’s book, which by the way, I would highly recommend, cost or not. It’s not just about statistics by any means–it’s a thoroughgoing discussion of ecological modeling in general. Idiots such as myself can learn a lot from that book. Potentially 🙂

    After reading your description, I’m starting to wonder if I’m actually Bayesian in outlook, when all along I’ve considered myself to be frequentist. That is, I’m, apparently, frequently a Bayesian (that joke was stolen!). But more to the point, there’s a (large) part of me that doesn’t really see the big difference between the two, and thinks this is largely a contrived dichotomy, almost purely semantically driven.

    It seems to me that Bayesians are fine when they think in terms of the likelihoods of competing hypotheses being true, no problem there that I can see. But when they then interpret the evidence for these as “degrees of belief” this makes no sense to me. Or else it’s tautological: yes they are degrees of belief, but they are derived from the empirical data: the beliefs have an empirical basis, likelihoods derived from evaluations of frequencies! The Bayesian calls it “degrees of belief”, while I call it “likelihood of veracity of explanation” or something like that. What really is the difference exactly? I fail to see it.

    • Thanks Jim.

      Yes, you’ve quoted Brian Dennis correctly, or very close to correctly, and yes, he is quoted in Ben Bolker’s book. I believe the quote is from the paper of Brian’s that I linked to.

  6. To say “Bayesianism means never having to say you were wrong” is as nonsensical, rhetorical and provocative as simply stating that “it’s not a matter of deep philosophy, it’s simply that p values make no sense”. Indeed, it is wrong. In reality, Bayesianism means not only saying how you can be wrong, but also estimating the amount by which you can be wrong (whatever “wrong” means here, anyway…).

    The problem, as I see it, is that many debates on Bayesianism-vs-frequentist, even within the statistical journals, seem to have a very low profile. For example, the “Bayesian school” purportedly being under attack is never made explicit (Savage-de Finetti?, Bayes-Laplace?, frequentist, quasi-bayes…). This leads to the impression that there is ONE Bayesian school, which is a sad position.

    This debate is full of subtleties that are never made explicit, and they are relevant. Here more than anywhere, the devil is in the details. Simply because most ecologists using “Bayesian statistics” (read “the stuff provided in WinBUGS”) are not even aware of a thing called “Bayes Factors”, for example, does not mean that Bayesianism cannot handle things such as “effect sizes”, “statistical evidence in support of an effect”, “model averaging”, etc. These are things invented by Bayesians! And the fact that you will be able to find just a few ecological papers performing a power analysis, estimating the size of effects and this kind of thing, does not mean that frequentism is completely subjective (Why P<0.05? Why one-tailed test? Why gather data until you find the desired effect?…).

    • The Dennis quote has a context. It is indeed provocative rhetoric, and one with which disagreement is possible (Gelman’s Bayesianism, for instance, certainly allows for saying you were wrong, although then again Gelman himself argues that most Bayesians don’t do frequentist-style error probing the way he does). But it is not a non-sensical quote, as the context makes clear.

      Yes, there are many sorts of Bayesians, including those who claim to be “objective Bayesians”, which my post glossed over in order to remain at a reasonable length. Having said that, I disagree that it’s “never” made explicit what sort of Bayesianism is under attack. You’ve apparently been reading different stuff than I have. In pretty much all of the material I link to it’s clear precisely what sort of Bayesianism is being promoted or attacked, and for what reasons. Even Dennis is really pretty clear on this (he’s attacking the same views attacked by Mayo, who in her book quotes at great length from many named Bayesians; Mayo is a philosopher and so is nothing if not explicit and precise).

      Yes, I agree with you and the other commentators that many ecologists approach both Bayesian and frequentist statistics in a sort of rote or cookbook fashion, and so aren’t fluent with the philosophical issues raised in the post. I hope that this post changes that, at least for the few ecologists who read it (just trying to do what I can).

      A quibble: In passing, you identify frequentism with “gather data until you find the desired effect”. I’m confused: it’s usually Bayesians who argue that this is legitimate (since for them, data you might have gathered, but did not, is irrelevant). Frequentists argue that “gather data until you find the desired effect” is illegitimate, or else that your “stopping rule” must be accounted for in your statistics because, far from being “locked up inside your head” (as many famous Bayesians have claimed), it’s a feature of your sampling procedure that affects your probability of observing certain data.
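The frequentist side of the stopping-rule point can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not something taken from the thread: the data are normal with known unit variance, the test is a two-sided z-test at the 0.05 level, and a hypothetical researcher starts testing at n = 10 and stops as soon as the result is significant or n reaches 100. The function names and all numeric settings are illustrative.

```python
import math
import random


def significant(total, n, z_crit=1.96):
    """Two-sided z-test of mean 0, assuming known unit variance."""
    return abs(total / math.sqrt(n)) > z_crit


def false_positive_rates(n_sims=2000, n_min=10, n_max=100, seed=1):
    """Simulate data under a true null and compare two sampling plans.

    Returns (rate_fixed, rate_peeking):
      rate_fixed   - test once, at n_max
      rate_peeking - test after every observation from n_min onward and
                     count the run as a "discovery" if any test is significant
    """
    rng = random.Random(seed)
    fixed_hits = 0
    peeking_hits = 0
    for _ in range(n_sims):
        total = 0.0
        peeked = False
        for n in range(1, n_max + 1):
            total += rng.gauss(0.0, 1.0)  # the null is true: the mean is 0
            if n >= n_min and not peeked and significant(total, n):
                peeked = True
        if significant(total, n_max):
            fixed_hits += 1
        if peeked:
            peeking_hits += 1
    return fixed_hits / n_sims, peeking_hits / n_sims


if __name__ == "__main__":
    fixed, peeking = false_positive_rates()
    print(f"fixed n: {fixed:.3f}   stop when significant: {peeking:.3f}")
```

Under these assumptions the fixed-n plan should reject a true null roughly 5% of the time, while the stop-when-significant plan rejects it noticeably more often. That gap is why frequentists insist the stopping rule be part of the analysis.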

      • Jeremy: Everything we say has a context. And I disagree, Dennis’ quote is indeed nonsensical because it is (always) false, whatever the context “supporting” the quote. You cannot say “Bayesianism means never having to say you were wrong”, since Bayesianism never supports this. Ok, you can give the impression you’re always right using Bayesian statistics, and you can do that equally well using frequentist statistics. But it is you who’s lying, not the statistics. In my opinion, this is the whole point. These sorts of quotes seem constructed to be placed at the introduction to a paper/book chapter/thesis… by someone who wants to say “Bayesians are morons because someone much wiser than me already said something along those lines, so then I am a frequentist because I learned to use R and I feel very limited with WinBUGS” [Yeap, a caricature, but take into account the context… :)]. Dennis, probably the cleverest statistical biologist nowadays, did the community a very poor service by saying this.

        We probably read the same stuff; well, I’m an applied statistician, so maybe my scope is broader (which does not mean I am in a better position). “Never” is perhaps too strong a word (recall Dennis’ quote) but, in my view, in this debate it is seldom made explicit who/what you are “attacking”. Yes, Debbie Mayo is usually explicit about the philosophical underpinnings of some of the differing views and their purported problems, but she really stops there (little real-world statistical practice, really). And Brian Dennis does deal with real-world statistical practice, but he’s basically concerned with the “contamination” of the inference by the information in the prior, and in this respect he is neither original nor completely accurate, nor is he referring to a specific Bayesian school. It is straightforward to show by what amount the prior is contaminating the posterior. I don’t care that most people don’t do this. It is not a problem with statistics; it’s a problem with people behaving incorrectly. We have the tools, even if we don’t use them.
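One simple way to see how much a prior “contaminates” a posterior is the conjugate beta-binomial case, where the posterior mean is an explicit weighted average of the prior mean and the sample proportion. The sketch below is only an illustration: the data (7 successes in 20 trials) and the four priors are hypothetical, and the function names are not from any package.

```python
def posterior_mean(a, b, successes, trials):
    """Posterior mean of a proportion under a Beta(a, b) prior."""
    return (a + successes) / (a + b + trials)


def prior_weight(a, b, trials):
    """Share of the posterior mean that comes from the prior mean.

    posterior_mean = w * prior_mean + (1 - w) * sample_proportion,
    with w = (a + b) / (a + b + trials).
    """
    return (a + b) / (a + b + trials)


if __name__ == "__main__":
    successes, trials = 7, 20  # hypothetical data
    priors = {
        "flat Beta(1, 1)": (1.0, 1.0),
        "Jeffreys Beta(0.5, 0.5)": (0.5, 0.5),
        "informative Beta(10, 10)": (10.0, 10.0),
        "informative Beta(2, 8)": (2.0, 8.0),
    }
    for name, (a, b) in priors.items():
        mean = posterior_mean(a, b, successes, trials)
        weight = prior_weight(a, b, trials)
        print(f"{name:26s} posterior mean = {mean:.3f}, prior weight = {weight:.2f}")
```

Reporting the prior weight alongside the posterior is one pragmatic form of the sensitivity check described above. It quantifies the prior's influence but does not settle where the prior should come from.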

        I think you draw too linear and causal a connection between the philosophy of statistics and real-world statistical practice. Whether this is good or bad, they are largely decoupled: as a statistician, I predict that the debates on Bayesianism vs. frequentism will never vanish completely; but as an applied statistician what I see is a great bunch of scientists with a large amount of data to analyze and without the time to wait until the debate settles down so that they can behave the way philosophers say. As I view it, the (mainstream) philosophy of statistics is a poor descriptor of the way I (we) do statistical analysis. This is indeed a philosophical problem in itself.

        Yes, you got my trick! The mantra “gather data until you find the desired effect”, or something along those lines, is usually defended on Bayesian grounds (although even this is debatable). And, in my opinion, this is not a matter of legitimacy, but of honesty (again, philosophy of statistics decoupled from real-world practice). For example, which of these two quotes is more likely to be found in a published paper?: 1) “The regression of A on B was significant (beta = 0.17, P=0.04, n=100, one-tailed test)”; or 2) “Although the regression of A on B was significant (beta = 0.17, P=0.04, n=100, one-tailed test), the variance in A explained by B (3%) is not biologically relevant, so we assume that B has only a marginal effect on A”. That is, you can use a frequentist power analysis to find a sample size for your experiment large enough that an effect of a desired magnitude comes out significant in a one-tailed test. In this context I can say “Frequentism means never having to say you were wrong”.
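The numbers in that regression example can be roughly reproduced. The sketch below makes several assumptions that are not in the comment: that beta = 0.17 is a standardized slope (so it equals the correlation r, and r squared is about 0.03), and that a normal approximation to the t distribution is acceptable at these sample sizes. The function names are illustrative.

```python
import math
from statistics import NormalDist


def one_tailed_p(r, n):
    """Approximate one-tailed P value for a correlation r at sample size n.

    Uses t = r * sqrt(n - 2) / sqrt(1 - r^2) and a normal approximation
    to the t distribution, so it is slightly optimistic for small n.
    """
    t = r * math.sqrt(n - 2) / math.sqrt(1 - r * r)
    return 1.0 - NormalDist().cdf(t)


def smallest_significant_n(r, alpha=0.05, n_start=4, n_limit=100_000):
    """Smallest n at which a correlation of size r is 'significant'."""
    for n in range(n_start, n_limit):
        if one_tailed_p(r, n) < alpha:
            return n
    return None


if __name__ == "__main__":
    r = math.sqrt(0.03)  # 3% of variance explained
    print(f"r = {r:.3f}, n = 100: one-tailed P ~ {one_tailed_p(r, 100):.3f}")
    print(f"r = {r:.3f}, n = 30:  one-tailed P ~ {one_tailed_p(r, 30):.3f}")
    print("smallest n giving P < 0.05:", smallest_significant_n(r))
```

The same tiny effect is “significant” or not purely as a function of n, which is the point of the second, more candid way of reporting the result.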

        Re: And… pardon? Did you mean to equate a philosopher with an “explicit and precise” human being? Always?? Really?? Mmmm…

      • Re: Dennis’ rhetoric, fair enough, I suppose. But it is rhetoric, after all–criticizing it for not being literally true seems kind of like criticizing it for being, well, rhetoric. Dennis’ remark is an exaggerated claim he was using to dramatize and drive home a possibly-true (or valid, or reasonable, or etc.) point, and the context does make clear that that’s what he was doing. But fair enough if you believe all concerned in these debates would be better served by not making exaggerated claims to dramatize or drive home substantive points.

        Re: showing how much the prior influences the posterior, yes, you can do this. I entirely agree that it’s a sensible pragmatic thing to do. 😉 But it doesn’t address the underlying principles at stake here, and is not a substitute for familiarity with those principles. In particular, as you know, there is a subjective Bayesian school which views Bayesianism solely as a way of maintaining coherence of your personal beliefs, and which takes the view that it doesn’t matter where people get their priors. As Brad Efron notes, a large majority of practicing scientists are deeply uncomfortable with any use of prior “probabilities” on such a view, even if we can quantify how much the prior affects the posterior. That’s why so many practical scientific applications of Bayesian statistics use uninformative priors.

        Yes, absolutely, real world practice often is difficult to map onto philosophical ideals. I think this is for a variety of reasons, some of which reflect well on real world practice, and some of which do not. For instance, I’m coming round to the view that Gelman’s pragmatic Bayesianism, which involves a lot of frequentist-style error testing, and which he grounds on a philosophy which distinguishes between “normal science” (=”Bayesian”) and “revolutions” (=throwing out your existing model = “frequentist”), reflects well on him. Conversely (and just as one possible example among many), many real world frequentists do make a fetish of P values without worrying about whether their effect sizes are biologically substantial, which is a practice that principled frequentist philosophy criticizes.

        I just put up a post on a new journal special issue, edited by Mayo and colleagues, about linking philosophical theory to real world practice. As I said in other comments, I used to have strong objections to pragmatism, but I don’t any longer–as long as the pragmatism is grounded in philosophy (as with Gelman), rather than grounded in ignorance of philosophy. I still think that “grounding” pragmatism solely in “practical experience” is another way of saying “ad hoc”. I still think pragmatism needs to be justified on *some* grounds or other besides “philosophy is remote from the real world, so I can just ignore it” or “philosophers don’t agree on what to do, so I’ll just do whatever I happen to feel like doing”. I’m reassured that some very smart pragmatists (Gelman; Bolker in ecology) agree with me on this. Ultimately, pragmatism about statistics ought to be grounded in a principled philosophy of science, by which I mean a principled justification or explanation for how scientific methods and approaches help us learn about the world. And don’t say that “my philosophy of science is to be pragmatic and do whatever works”, because that’s surely question-begging. As a Wittgenstein fan, I happily admit that justifications come to an end and that at some point we just have to act–but surely justifications shouldn’t come to an end before any aspect of our methodology has been justified at all! Even the most practical of scientists, with no philosophical training, ought to be able to articulate a rationale for why they do what they do.

        Out of curiosity (and I am genuinely curious), on what Bayesian grounds would one argue against the approach of “gather data until you find the desired effect” (note that I’m asking a narrow, technical question here)? The frequentist grounds for not doing this are well-known. You say it’s “debatable” that a Bayesian would defend this approach–why? I mean, yes, I can imagine that a pragmatic Bayesian like Gelman could argue against this approach–but I suspect his reasons for doing so would basically be frequentist, or at least non-Bayesian. If you’re a “pure” Bayesian, of any school, then surely you believe that data that might have been sampled, but weren’t, are irrelevant–after all, there’s no place for such non-existent data in Bayes’ Theorem. And if you believe that data that might have been sampled, but weren’t, are irrelevant, I’m unclear what grounds you have for arguing against gathering data until you find the desired effect.
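A standard textbook illustration of why unobserved data have no place in Bayes’ Theorem compares two sampling plans that yield the same data. The sketch below uses hypothetical numbers: 9 successes and 3 failures, obtained either by fixing n = 12 in advance (binomial) or by sampling until the 3rd failure (negative binomial), with a one-sided test of theta = 0.5. The function names are illustrative.

```python
from math import comb

SUCCESSES, FAILURES = 9, 3  # hypothetical data


def binomial_likelihood(theta, k=SUCCESSES, n=SUCCESSES + FAILURES):
    """Likelihood when the number of trials n was fixed in advance."""
    return comb(n, k) * theta**k * (1 - theta) ** (n - k)


def negbinomial_likelihood(theta, k=SUCCESSES, r=FAILURES):
    """Likelihood when sampling continued until the r-th failure."""
    return comb(k + r - 1, k) * theta**k * (1 - theta) ** r


def binomial_p_value(k=SUCCESSES, n=SUCCESSES + FAILURES, theta0=0.5):
    """P(at least k successes in n trials) under theta0."""
    return sum(binomial_likelihood(theta0, j, n) for j in range(k, n + 1))


def negbinomial_p_value(k=SUCCESSES, r=FAILURES, theta0=0.5):
    """P(at least k successes before the r-th failure) under theta0."""
    return 1.0 - sum(negbinomial_likelihood(theta0, j, r) for j in range(k))


if __name__ == "__main__":
    for theta in (0.3, 0.5, 0.75, 0.9):
        ratio = binomial_likelihood(theta) / negbinomial_likelihood(theta)
        print(f"theta = {theta}: likelihood ratio between plans = {ratio:.3f}")
    print(f"binomial P value:          {binomial_p_value():.4f}")
    print(f"negative binomial P value: {negbinomial_p_value():.4f}")
```

The two likelihoods differ only by a constant that does not involve theta, so any prior gives the same posterior under either plan. The P values differ because they sum over outcomes that were possible under each plan but did not occur.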

        I didn’t mean to equate philosophers in general with “explicit and precise”. But living philosophers of science and philosophers of statistics from English-speaking countries (and so heavily influenced, directly or indirectly, by analytical philosophy) are, in my modest but more-than-minimal experience, pretty much all explicit and precise, sometimes to the point of pedantry. I had hoped the context would make clear that I was not thinking of, e.g., Hegel, or mystics, or continental postmoderns, or etc.

      • Re: Yes, I do believe that rhetoric should be set aside in science. I mean, it’s ok for literature, for the arts, for politics, etc. In these areas it seems fair to gain a position by mocking someone. But in science, it is clearly a step back. Come on, it is very easy to uncover the inconsistencies of frequentism and to show the many ways in which it is wrong from a philosophical point of view; and the fact that most scientists are (unconscious) frequentists by tradition has nothing to do with its strength as an approach. By the way, note that you seemed to be offended by a comment above stating that “p values make no sense”; a clearly rhetorical statement which I myself find unfortunate. Even though the context was clear, you went on with a long response trying to show how unfruitful this attitude can be. Perhaps the reason is that you are a frequentist, and felt targeted? 🙂

        I might give the impression that I am a Bayesian. Well, I’m not. Guess that if I have to define myself, I must say I’m a likelihoodist (ugly word). Bayesianism is full of subtleties, differing views and even strong confrontation within it (by the way, a thing I’m still waiting for within frequentism: a bit of self-criticism; but see Hurlbert & Lombardi 2009 Ann Zool Fen, 46:311-349). Yes, subjective Bayesianism is a difficult position to defend; I agree on this. But, even here, there are some misunderstandings: for example, you speak of “uninformative priors”; what’s that? All priors are informative, whether they are vague (e.g., extremely platykurtic normal conjugate priors centered at 0), flat (uniform with proper limits), etc. But uninformative priors do not exist (read the interesting Bernardo 1997, J Stat Plan Inf, 65:159-189). That said, what about Empirical Bayes (EB) and objective Bayes? EB, indeed, is not regarded as a purely Bayesian inferential approach; you can have, for instance, a hierarchical model in which you estimate the upper levels of the hierarchy through ML estimation, and then conduct a Bayesian estimation of the lower levels conditioned on the ML estimates: there is your applied hybrid.
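The empirical Bayes hybrid described above can be sketched for the simplest case. The assumptions here are mine, not the commenter's: a normal-normal hierarchy with one observation per group, y_i ~ N(theta_i, sigma2) with known sigma2 > 0, and theta_i ~ N(mu, tau2). The upper-level parameters mu and tau2 are estimated by maximum likelihood from the marginal distribution of the y_i, and the lower-level theta_i then get their posterior means conditional on those estimates. Data and names are hypothetical.

```python
from statistics import fmean


def empirical_bayes_normal(y, sigma2):
    """Empirical Bayes estimates for a normal-normal hierarchy.

    Marginally y_i ~ N(mu, sigma2 + tau2), so the ML estimates are
    mu_hat = mean(y) and tau2_hat = max(0, var_ml(y) - sigma2).
    Returns (posterior_means, mu_hat, tau2_hat).
    """
    mu_hat = fmean(y)
    marginal_var = sum((v - mu_hat) ** 2 for v in y) / len(y)  # ML: divide by n
    tau2_hat = max(0.0, marginal_var - sigma2)
    # Weight given to the grand mean; 1.0 means complete pooling.
    shrink = sigma2 / (sigma2 + tau2_hat)
    post_means = [shrink * mu_hat + (1.0 - shrink) * v for v in y]
    return post_means, mu_hat, tau2_hat


if __name__ == "__main__":
    y = [2.1, 3.4, 1.2, 5.0, 2.8, 4.1]  # hypothetical group-level estimates
    post, mu_hat, tau2_hat = empirical_bayes_normal(y, sigma2=1.0)
    print(f"mu_hat = {mu_hat:.2f}, tau2_hat = {tau2_hat:.3f}")
    for raw, shrunk in zip(y, post):
        print(f"raw {raw:.1f} -> shrunk {shrunk:.2f}")
```

Each estimate is pulled toward the grand mean by an amount the data themselves determine. A fully Bayesian analysis would instead put a prior on mu and tau2 and propagate their uncertainty.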

        Objective Bayes is even more interesting: you use reference (objective) priors that allow for the posterior to be objective in the sense loosely advocated by anti-Bayesians: by depending solely on data; it is rather technical to explain how they work, but if anybody is interested and is unaware of how objective Bayesians derive inference, there are several interesting references by José Miguel Bernardo (http://www.uv.es/~bernardo/publications.html), who is the “inventor” of contemporary reference priors and one of the main developers of Bayesian mathematical-statistical theory during the last decades (indeed, the first formal general definition of reference priors appeared not long ago, even though they are several decades old: Berger et al. 2009 Ann Stat, 37:905-938); he and his colleagues are showing how to conduct Bayesian hypothesis testing, model checking, and so on from an objective point of view. This is a promising future for Bayesian statistics in ecology, I hope!

        You question on what grounds a Bayesian would argue against the approach of “gather data until you find the desired effect”? I did not say that a Bayesian would argue against that, I simply suggested implicitly that it is not at all clear why one should be bothered by data not collected, even if you are a frequentist (I’m the devil’s advocate now, rather imprudent perhaps). This goes straight to the point of objectivity. In Bernardo’s words: “Objectivity is an emotionally charged word, and it should be explicitly qualified. No statistical analysis is really objective (both the experimental design and the model have strong subjective inputs). However, frequentist procedures are branded as “objective” just because their conclusions are only conditional on the model assumed and the data obtained. Bayesian methods where the prior function is derived from the assumed model are objective in this limited, but precise sense”. I mean, a Bayesian would say that a scientist should be concerned with understanding that the “true” value of a parameter lies within a given interval with a given probability given the particular data gathered, rather than with the frequency with which the estimation will cover the true value after a large number of trials not conducted, and perhaps never conducted (frequentism). It is all a matter of the “state of the world”. Do you believe that we can know objectively the state of the world, even that part of the world that cannot be observed (frequentism)? Or do you think that all we can honestly say is that with this prior belief, with this likelihood and with this data, the posterior state of the world is this one (Bayesianism)? I mean, I don’t have answers, but I don’t think these issues are philosophically resolved in either case; you might finally say “well, if the priors are objective (they are somewhat “tuned” by the likelihood), why use priors at all and not simply be a likelihoodist?” Well, here’s the difference: a Bayesian, by conviction, will always place a probability distribution on unknown quantities. And, for me, this seems rather more honest than speaking about frequencies of hypothetical states of the world.

        Ok, I was just joking about the “explicit and precise” nature of philosophers :). I do agree with you that philosophical issues should be at the base of any argument, and I fully agree, for example, that Jim Clark is not doing the community a good service either by dismissing the philosophical problems of Bayesianism in his tremendous textbook. And yes, postmoderns and mystics are far from explicit and precise, but Debbie Mayo is. But if she’s making that elegant debunking of some Bayesian reasoning just to offer a frequentist alternative, I must say “No, I don’t buy that!”.

  7. […] (39 comments), although those comments included numerous trackbacks. Other popular posts were my compilation of resources on Bayesian vs. frequentist statistics (1167 views), my takedown of Gould and […]

  8. […] [1] https://oikosjournal.wordpress.com/2011/10/11/frequentist-vs-bayesian-statistics-resources-to-help-yo… […]

  9. […] the excitement of validation is also an underlying dichotomy in statistics – that between frequentists and Bayesians. While the details of this might be a bit too technical for a blog post (the earlier reference and […]

  10. http://www.xkcd.com/1132/

  11. […] overwhelming and difficult to wrap her head around. (Latest round of mental crisis sparked by an older Oikos post on the matter. Good links and comment section there, […]

  12. Hello Jeremy.

    I am a frequentist believing that one can define a physical (frequentist) probability for every event, even if it did not occur (as in the case of a putative technological singularity (TS) where AI would take over the planet).

    According to my own knowledge-dependent frequentism, I would define this value p as the frequency of TS in worlds identical to everything we know about ours in a hypothetical infinite multiverse.

    If you found the time, I would be extremely glad to learn your take on the thoughts I have expressed in my blog post.

    I am extremely skeptical that propositions being either true or false (such as the truth of string theory, the theory of gravitation, God’s existence) have probabilities.

    In such situations I resort to likelihoodism and would rather just consider the probabilities of the evidence given the truth of the theory.
    I think it has the great advantage over Bayesian epistemology to be an objective quantity instead of just the intensity of a subjective conviction.

    And if p(E|H) or p(E|non H) cannot be computed, Bayesian calculations would fail as well so that I do not see any advantage.

    As a frequentist, I often feel like a lone fighter in a gigantic ocean populated by ferocious Bayesians determined to tear me apart before devouring me. So I was greatly encouraged to have found your blog 🙂

    And who knows, maybe we could join forces for a common paper one day in the future.

    Friendly greetings from Europe, Marc.

    P.S: your links are extremely useful!

  13. As a practising scientist, what matters to me is the false discovery rate. Each time I claim wrongly to have made a discovery, I make a fool of myself.

    In order to estimate a false discovery rate, all you need is the rules of conditional probability. There is no need to describe this as Bayesian. It can be inferred simply by counting. It seems to me to be absolutely clear that if you claim to have discovered an effect when you observe, say, P = 0.047, you will be wrong in at least 30% of cases (and a lot worse for small experiments). This fact alone means that reliance on P values has made a major contribution to the crisis of reproducibility in some areas of science. See http://www.dcscience.net/?p=6518
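The conditional-probability counting can be sketched in a few lines. Note the assumptions, which are illustrative and not the commenter's exact calculation: this is the simpler “P at or below alpha” version rather than the “P close to 0.047” version, and the prior proportion of real effects (10%) and the power values are hypothetical inputs.

```python
def false_discovery_rate(prior_real, power, alpha=0.05):
    """Fraction of 'significant' results that are false positives.

    prior_real: proportion of tested hypotheses where an effect really exists
    power:      probability of detecting a real effect
    alpha:      significance threshold (false positive rate under the null)
    """
    false_positives = alpha * (1.0 - prior_real)
    true_positives = power * prior_real
    return false_positives / (false_positives + true_positives)


if __name__ == "__main__":
    scenarios = [
        ("10% real effects, power 0.8", 0.1, 0.8),
        ("10% real effects, power 0.2", 0.1, 0.2),
        ("50% real effects, power 0.8", 0.5, 0.8),
    ]
    for label, prior_real, power in scenarios:
        fdr = false_discovery_rate(prior_real, power)
        print(f"{label}: false discovery rate = {fdr:.2f}")
```

The result depends heavily on the assumed proportion of real effects and on power, which is why small, underpowered experiments fare so much worse than the nominal 5% suggests.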

  14. Thanks for finally talking about “Frequentist vs. Bayesian statistics: resources to help you choose (UPDATED) | Oikos Blog”. Loved it!

  15. Aw, this was a really good post. Taking a few minutes and actual effort to create a really good article… but what can I say… I put things off a lot and never seem to get anything done.

  16. @Jeremy: Thank you for this very interesting post!

    @all commenters (although I haven’t read through all comments yet): Thank you for enriching this post with your views on the topic.

    As the concept of different philosophical schools in statistics is a rather new discovery for me and I am thus trying to make sense out of it: How exactly do information theoretic approaches (as popularised by Burnham and Anderson) and likelihood approaches fit into the big picture? Are these just sub categories of frequentist or Bayesian statistics or would one count them separately? And is there a good textbook or article about the classification of statistical philosophy?

