Kangaroos, Foster’s, and E.T. Jaynes

Also maybe some thoughts on priors

priors
stupid Bayesian stuff
Author
Published

January 26, 2025

Ahead of starting my current gig last February, I was reviewing a bit about multilevel models, and came across this fascinating footnote in Richard McElreath’s Statistical Rethinking:

See Jaynes (1986) for an entertaining example concerning the beer preferences of left-handed kangaroos. There is an updated 1996 version of this paper available online.

I’m not sure what it is with Bayesians and wild, fantastic adventures in the footnotes1, but obviously I had to chase this one down. Also, the paper didn’t immediately pop up when I googled “beer preferences of left-handed kangaroos”, so hopefully this post gets a few people where they’re going.

I mostly started this post to have a reason to DALL-E a (left-handed) kangaroo drinking a Foster’s, but the more serious content of the post discusses different conceptions of maximum entropy, and how the idea can be useful in building prior distributions, even if we shouldn’t stop at a maximum entropy prior.

Why on earth are we talking about kangaroos’ beer preferences?

To be clear, we’re actually talking about the joint distribution of beer preferences of kangaroos and their handedness. Why?

In the time-honored tradition of taking a joke and continuing to run with it until it’s funny, E.T. Jaynes is extending an example due to Steve Gull2 to discuss the properties of maximum entropy priors:

Extending analysis of this scenario then provides a tutorial in prior specification, improving the prior slowly to ensure it reflects all we know about the example.

For example, the first piece of intuitive prior knowledge that Jaynes shares is that kangaroos are (for most practical purposes) indivisible3. Constraining acceptable priors to ones with integer solutions tweaks the problem and corresponding reasonable priors a bit.

From there, the article provides a nice example of a difficult Bayesian task: taking our priors seriously, and seeing how we feel about what they imply. For example, when taken to large N4, the simplest maximum entropy prior starts to become remarkably confident about the unknown proportions p above. The question that arises is whether our intuition about p were poor, or whether we have hidden prior information to incorporate (i.e: do we know that kangaroos are likely to be related given they are drawn from the same genetic pool and environment)?

As I often ask myself…

With the ensuing sermon5, Jaynes makes the case that maximum entropy is a useful, logical tool for building priors. Interestingly, this argument has a different orientation to maxent than more recent pieces that place these priors alongside reference, Jeffreys, or invariance-based priors6. Jaynes instead lays out a presentation that feels more like a workflow, with maximum entropy itself providing a starting point.

But first, more kangaroo facts

Just like our understanding of Bayesian inference has advanced since the 1980s, so has our understanding of kangaroo handedness.

Giljov et al. (2015) study several members of the broader family of macropod7 marsupials, and find that bipedal macropod marsupials show “population-level left-forelimb preference”, confirming Gull’s observations above.

This was studied by observing the natural behavior of the different species in a variety of common tasks, like bipedal/tripedal8/quadrupedal stances for feeding, self-grooming, and accepting a nice cold can of Foster’s9.

Ok, so Gull was correct they’re broadly left-handed. How accurate was his 65% observation? Here we need to be careful: there are a variety of macropod marsupials, which display different degrees of left-handedness10. As Jaynes notes, Gull was not particularly specific in his formulation of this problem:

Although there are several species of kangaroos with size varying from man to mouse, we assume that Gull intended his problem to refer to the man sized species (who else could stand up at a bar and drink Foster’s?)

This seems reasonable enough, but to split unnecessary hairs even further, there are two different types of “man sized” kangaroos in the study- red kangaroos and eastern grey kangaroos1112. Digging into the appendix to find the raw data since the main paper only reports statistical test results, 80% of red kangaroos are left-handed, and 68% of eastern grey kangaroos are13. Perhaps Gull was observing his kangaroos in the eastern part of Australia, which (unsurprisingly) is where eastern grey kangaroos can be found.

As much as we now stand on the shoulders of giants when it comes to Bayesian inference and kangaroo handedness, I wasn’t able to find any work on the beer preferences of kangaroos. That said, I see no reason to doubt Gull’s 65% estimate, so let’s move on to talking about priors.

Ok, some actual thoughts about priors

In grad school, I was lucky to take a Bayesian inference course from Ben Goodrich- he came down from Columbia to teach at NYU. I spent some time in his office hours asking questions about maximum entropy.

As I remember it, his position was that maximum entropy could be a useful concept for building some intuition about probability, but that it wasn’t a good way to land on priors. Richard McElreath’s take in Statistical Rethinking is similar, if a bit more enthusiastic on the intuition building point- chapter 10 develops the concept at length for this purpose, but not encouraging it as a tool for making priors.

From both sources of Bayesian education, along with other formative ones like Andrew Gelman’s writing, I’m pretty broadly convinced that our workflow for building priors should usually aim at specifying weakly informative priors14. Without getting too deep into that point, I think the Stan Prior Choice Recommendations page summarizes the intuition well:

Weakly informative priors should contain enough information to regularize: the idea is that the prior rules out unreasonable parameter values but is not so strong as to rule out values that might make sense

The slight distinction I want to make here, however, is that thinking about maximum entropy is a powerful way to grapple with the space of possible priors for a given problem. For most complex problems, even clearly grasping what our priors imply can be a challenge.

In this type of context, trying to clearly state a maximum entropy prior can be deeply orienting. Once we know the minimum amount of information structural constraints encode, it becomes a lot easier to think about what “weakly informative” might mean, adding any further information on top of maxent.

As we saw above with our beer-swilling Kangaroo friends, even the initial question of what maximum entropy implies can turn out to be a surprisingly deep question. Examples abound where “uninformative” seeming priors accidentally imply humorously absurd results. Asking what maximum entropy might mean can be a way to start unraveling such situations, stripping things back to a simple baseline that we can then build further on.

Conclusions

Reading this Jaynes piece, I’m struck by how much of his process for developing priors reminds me of the modern Bayesian Workflow for developing priors. His method of working with the idea of maximum entropy demonstrated in the piece is far richer than the basic notion of maximum entropy distributions would suggest. Even if we shouldn’t stop at a maximum entropy prior, doing “weakly informative” right can benefit from formulating what we believe is minimally coherent for our problem.

Hopefully you’ve enjoyed this post, and have been convinced to think about maximum entropy a bit more on your next prior specification journey. And if you enjoyed this, you truly should read the original paper - if this table contents doesn’t sound like a good time, I don’t know what does.

Footnotes

  1. Dan Simpson’s are particularly incredible.↩︎

  2. Who I have to imagine Australians were thrilled with for this novel joke. I’m sure if the dropbear or ʇxǝʇ uʍop ǝpᴉsdn jokes existed back then, this problem would have been about the preferred reading angles of drop bears or something similar.↩︎

  3. If one did divide a kangaroo, would it still have a preference for Foster’s? I suppose it would depend on the type of division, but I have to imagine most types of division would leave the kangaroo desiring something a bit stronger.↩︎

  4. umber of kangaroos.↩︎

  5. his wonderful term, not mine.↩︎

  6. For example, how the Stan Prior Choice Recommendations page, or Gelman, Simpson, and Betancourt (2017) which I’ll discuss later in the post.↩︎

  7. See Macropodidae- basically all the cute, friend-shaped ones, including kangaroos, wallabies, sugar gliders, quokkas, and wallaroos (which I just learned of, and am happy to report exist).↩︎

  8. Three limbs, not the tail, sadly.↩︎

  9. Ok, ok fine. The authors restricted their attention to “natural, not artificially evoked, behaviors”, and that condition is what presumably ruled out handing any of them a Foster’s.↩︎

  10. The motivating causal theory here is that handedness seems to be more common in primarily bipedal species, according the lead author. This connects this work to a larger, genuinely fascinating debate on whether handedness is a uniquely human or primate trait.↩︎

  11. There are also two other species referred to as kangaroos, the western grey kangaroo, and the antilopine kangaroo. Note that the term “kangaroo” is a paraphyletic grouping, and seems to be based on size, so, for example, I found some references to the antilopine kangaroo being referred to as a wallaby or wallaroo. I was thus unable to verify Jaynes’ claim of mouse-sized kangaroos, and presume he was thinking of other Macropodidae. Fortunately, neither of the other two species should undermine our confidence in the study’s ability to tell us about the man sized kangaroos capable of standing up at a bar and drinking: while both the eastern and red kangaroo can easily stand over a bar, with male heights often observed roughly around 6’7” and 5’11” respectively, the western grey kangaroo only reaches typical heights of 4’3” (unclear if this is males only, or all of them), and the antilopine kangaroo males only reaching 3’9”. If we instead restricted our attention to just the typical height of a bar however, and take a typical bar height to be 42”, we might have some further decisions to make, as both species would (for the tallest members) only narrowly be taller than the bar, and might not be able to comfortably drink at one. In addition, we would need to take into consideration the bartender’s willingness to serve such small patrons…↩︎

  12. Second footnote to say all average male heights in the first footnote are just from the linked Wikipedia pages. I’m in a footnote for a footnote here, which in turn is referring to a paper I found to validate a comment in a 28 year old paper referencing a 41 year old anecdote. If you nitpick me on the roo heights with more precise information about the distribution of their heights, I’ll laugh at you, but I will update the text. If you argue whether we care about length of the roo (including tail), not height, I’ll just laugh at you.↩︎

  13. I’m taking them at their word on whether each kangaroo is left-handed or not based on counting the number of left vs. right pawed observations of each action. You can find the relevant data and make your own conclusions in tables S3/S4 of the appendix, or use the summary of handedness by task I’ve included here.↩︎

  14. Of course, there are plenty of times when we might want to build a genuinely informative prior, and state clearly why we’re doing so.↩︎

Reuse

Citation

BibTeX citation:
@online{timm2025,
  author = {Timm, Andy},
  title = {Kangaroos, {Foster’s,} and {E.T.} {Jaynes}},
  date = {2025-01-26},
  url = {https://andytimm.github.io/posts/kangaroos_fosters_jaynes/kangaroos_fosters_jaynes.html},
  langid = {en}
}
For attribution, please cite this work as:
Timm, Andy. 2025. “Kangaroos, Foster’s, and E.T. Jaynes.” January 26, 2025. https://andytimm.github.io/posts/kangaroos_fosters_jaynes/kangaroos_fosters_jaynes.html.