Over the past couple of years Oikos has published a bunch of really thoughtful and interesting papers on the use of maximum entropy (‘MaxEnt’) in ecology; see the special feature in the April 2010 issue and references therein.
The debate over MaxEnt, both in Oikos and in other journals, has mostly been about the advantages and pitfalls of using MaxEnt as a conservative statistical inference technique. Given certain information about a system (‘constraints’), what is the most we can infer about other features of the system without making additional implicit or explicit assumptions? MaxEnt is a way to answer this question.
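For readers who haven't seen it written down, here is a minimal sketch of the standard MaxEnt problem in its textbook form. The symbols are my own notation, not taken from any of the papers above: $p_i$ is the probability of state $i$, $f_k$ are the constrained quantities, $F_k$ their known average values, and $\lambda_k$ the Lagrange multipliers.

```latex
% Maximize Shannon entropy subject to normalization and K known averages
\max_{p}\; H(p) = -\sum_i p_i \ln p_i
\quad \text{subject to} \quad
\sum_i p_i = 1, \qquad \sum_i p_i\, f_k(i) = F_k \quad (k = 1, \dots, K).

% Setting the derivative of the Lagrangian to zero gives an exponential-family solution
p_i = \frac{1}{Z} \exp\!\Big(-\sum_{k=1}^{K} \lambda_k f_k(i)\Big),
\qquad
Z = \sum_i \exp\!\Big(-\sum_{k=1}^{K} \lambda_k f_k(i)\Big).
```

The multipliers $\lambda_k$ are then chosen so that the constraints hold. With no constraints beyond normalization, the solution is the uniform distribution, which is the sense in which MaxEnt adds nothing beyond the information you feed it.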
I don’t have much to add to this debate myself. Instead, I wanted to call ecologists’ attention to some very deep and novel ideas about MaxEnt from evolutionary biologist Steven Frank. In focusing on MaxEnt as a technique for conservative statistical inference, ecologists may be neglecting some related but equally important conceptual issues.
One idea of Frank’s is that species abundance distributions (and presumably other macroscopic statistical patterns in ecology and evolution) look the way they do not because of any particular ecological mechanism, or even because they have to be consistent with certain known constraints, but because of their natural measurement scale (Frank 2011). From this perspective, MaxEnt is a way of figuring out what the species abundance distribution should look like just based on our choice of measurement scale (and on the mere fact that it’s a statistical aggregation). It’s only deviations from this expectation that require any special ecological explanation.
A second idea of Frank’s is that there’s a deep analogy between scientific measurement and evolution by natural selection (Frank 2008). Natural phenomena can be viewed as containing an intrinsic amount of ‘information’. Measurement is a way of extracting this information and transferring it to our data. Analogously, differential survival and reproduction of individuals with different phenotypes can be viewed as a way for a population to ‘measure’ information about the environment and transfer it to the population, resulting in adaptive evolution (adaptive change in allele frequencies). Remarkably, it turns out this rather vague-sounding analogy can actually be made extremely precise. Frank shows that, if you maximize the amount of Fisher information about the environment captured by changes in allele frequencies, you obtain…(wait for it)…Fisher’s Fundamental Theorem of Natural Selection (!). In other words, selection maximizes what an evolving population ‘learns’ about the environment in which it is evolving. Frank’s paper includes a forthright discussion of whether this is merely some kind of formal coincidence, or whether it indicates a deep and not-yet-fully-understood link between natural selection and information theory (a link ironically not recognized by Fisher himself, even though he was a central figure in the development of both evolutionary theory and information theory).
I’ve been wondering if this second idea might not have some direct implications for recent applications of MaxEnt in ecology. Bill Shipley has argued that MaxEnt is a way to infer species abundances along environmental gradients (e.g., Shipley 2010 Oikos). The underlying idea is essentially the same as Frank’s: different species have different demography due to differences in their phenotypic traits, so that different species are best adapted to different environments. Information about species’ traits (e.g., mean trait values of the species found in different environments) can thus be used to infer their abundances. Shipley says that the way to do this is to find the species abundances that maximize a measure of entropy called Shannon entropy (=Shannon information), conditional on the observed trait information. But in light of Frank’s work, I wonder if we shouldn’t be maximizing a different measure (Fisher information) instead. Perhaps the measure that you should maximize depends on whether you’re merely using MaxEnt as a tool to make statistically conservative inferences, or whether you’re using it because you want to detect a signal of selection. Or maybe I’m confused here because I haven’t yet fully wrapped my head around some very deep ideas. Hopefully some commenters will chime in.
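To make the Shannon-entropy version concrete, here is a toy sketch of trait-constrained MaxEnt. It is not Shipley’s actual procedure: I’m assuming a single trait, a single constraint (the observed community-weighted mean of that trait), and no prior on abundances. The function name `maxent_abundances` and all the numbers are made up for illustration. Under those assumptions the entropy-maximizing relative abundances take the form p_i ∝ exp(λ·t_i), and the only work is finding the λ that reproduces the observed mean.

```python
import math


def maxent_abundances(traits, target_mean, tol=1e-10):
    """Relative abundances p maximizing Shannon entropy -sum(p*ln p),
    subject to sum(p) == 1 and sum(p * trait) == target_mean.

    Hypothetical single-trait illustration, not a published method.
    """
    if not min(traits) < target_mean < max(traits):
        raise ValueError("target_mean must lie strictly between the "
                         "smallest and largest trait values")

    def gibbs(lam):
        # p_i proportional to exp(lam * t_i); subtract the largest
        # exponent before exponentiating for numerical stability.
        exps = [lam * t for t in traits]
        m = max(exps)
        weights = [math.exp(e - m) for e in exps]
        z = sum(weights)
        return [w / z for w in weights]

    def mean_trait(p):
        return sum(pi * t for pi, t in zip(p, traits))

    # The constrained mean increases monotonically with lam, so first
    # bracket the solution and then bisect.
    lo, hi = -1.0, 1.0
    while mean_trait(gibbs(lo)) > target_mean:
        lo *= 2.0
    while mean_trait(gibbs(hi)) < target_mean:
        hi *= 2.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if mean_trait(gibbs(mid)) < target_mean:
            lo = mid
        else:
            hi = mid
    return gibbs((lo + hi) / 2.0)


if __name__ == "__main__":
    # Three hypothetical species with trait values 1, 2 and 3, in a site
    # where the observed community-weighted mean trait is 2.5.
    species_traits = [1.0, 2.0, 3.0]
    for p in maxent_abundances(species_traits, 2.5):
        print(round(p, 4))
```

If the observed mean sits exactly at the unweighted average of the trait values, the constraint carries no information and the predicted abundances are equal; the further the observed mean is pushed toward one end of the trait range, the more the predicted abundances are skewed toward species with trait values at that end.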
UPDATE: here is a nice blog post on natural selection and information from a physicist. UPDATE #2: Nice comment thread going on this post, which helps to clarify some of Frank’s ideas, and indicates that the tentative contrast I drew between Fisher and Shannon information is a little misplaced.