Bamboo, cicadas, and integers

I enjoyed this article, “Bamboo Mathematicians,” by Carl Zimmer over at the National Geographic website, but I am skeptical of some of what it says. In particular, the math seems fishy.

The gist of the article is that there are species of bamboo that flower and spread their seeds very rarely—once every 120 years for one species. Weirder still is that these bamboo are in sync with one another. No matter where they are in the world, no matter what conditions they’ve been subjected to, every member of that species flowers and seeds the same year. Other bamboo species are also synced, but to shorter periods: 60 years in one case, and 32 years in another.

The accepted reason for the synchronicity, developed in the 70s by biologist Daniel Janzen, is that the masses of seeds generated in those special years overwhelm the animals that feed on them. A larger percentage of seeds survive and germinate than otherwise would because the animals simply can’t eat them all.

Zimmer’s article was prompted by a recent paper that tries to extend the theory by explaining how the particular periods of 32, 60, and 120 years arose. The authors believe these periods, whose only prime factors are small numbers, not only convey a survival benefit but are also the easiest to reach by mutation. As Zimmer says,

Veller and his colleagues realized that they could test this model. Over millions of years, they reasoned, species should have multiplied their flowering cycles. It’s likely that they could only multiply the cycles by a small number rather than a big one. Shifting from a two-year cycle to a two-thousand-year cycle would require some drastic changes to a bamboo plant’s biology. Therefore, the years in a bamboo’s cycle should be the product of small numbers multiplied together.

The mathematics of bamboo offers some promising support. Phyllostachys bambusoides has a cycle of 120 years, for example, which equals 5×3×2×2×2. Phyllostachys nigra f. henonis takes 60 years, which is 5×3×2×2. And the 32 year cycle of Bambusa bambos equals 2×2×2×2×2.

But could this just be a kind of meaningless bamboo numerology? Is it just a coincidence that these species display such elegant multiplications? Veller and his colleagues carried out a statistical test on bamboo species with well-documented flowering cycles. They found that the cycles are tightly clustered around numbers that can be factored into small prime numbers. It’s a pattern that you would not expect from chance. In fact, they argue, this test offers very strong evidence for multiplication (for stat junkies: p=0.0041).

I’m certainly no biologist, and the paper apparently includes compelling genetic evidence that traces the branching of bamboo’s evolutionary tree, but the purely mathematical explanation strikes me as odd for a couple of reasons.

First, there are the cicadas. Periodic cicadas use the same strategy as the bamboo to overwhelm their predators and increase the odds of survival. But their periods are prime numbers, 13 and 17 years, and the reason for that, we’re told, is that prime numbers are the best bet for survival. Stephen Jay Gould gave exactly that explanation in his essay “Of Bamboo, Cicadas, and the Economy of Adam Smith.” You can find the essay in his collection Ever Since Darwin, and as you can probably guess from the title, he also talks about the synchronized flowering of bamboo.[1]

So on the one hand, we have an older mathematical explanation that says prime numbers are the key to long-period survival adaptations; and on the other hand, we have a new mathematical explanation that says multiples of small numbers are the key to long-period survival adaptations. While this does not necessarily mean that one of the explanations is wrong—evolution can lead to different paths being taken to achieve the same end—it’s surprising to me that Zimmer didn’t mention the apparent contradiction. He’s generally considered one of the best science journalists, and I’m sure he knows the cicada story.

The second reason I thought the multiples-of-small-numbers hypothesis was odd was my sense that it was pretty common for integers in the range of interest (dozens to hundreds) to factor into just a few small primes. To test this, I fired up IPython and started playing around.

I started with this function for getting a list of prime factors:

def prime_factors(n):
  f = []
  d = 2
  while n > 1:
    # Divide out d as many times as it goes, recording it each time
    while n % d == 0:
      f.append(d)
      n //= d
    d += 1
  return f

I stole it from this discussion at Stack Overflow. It works like this:

In [2]: prime_factors(150)
Out[2]: [2, 3, 5, 5]
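
As a quick sanity check, here is a self-contained Python 3 sketch that factors the three bamboo periods quoted above and confirms that the factors multiply back to each period. The helper name `factorize` is my own label for illustration; it uses the same trial-division idea as the function above.

```python
from math import prod

def factorize(n):
    """Return the prime factors of n (with repeats) by trial division."""
    factors = []
    d = 2
    while n > 1:
        while n % d == 0:
            factors.append(d)
            n //= d  # integer division keeps n an int in Python 3
        d += 1
    return factors

# The three flowering periods mentioned in Zimmer's article.
for period in (32, 60, 120):
    factors = factorize(period)
    assert prod(factors) == period  # the factors must multiply back to the period
    print(period, factors)
```

The factors come out in ascending order, so 120 prints as [2, 2, 2, 3, 5] rather than in the 5×3×2×2×2 order Zimmer uses.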

To get the prime factors of all the integers from 2 through 200, I made this list comprehension:

all = [ (n, prime_factors(n)) for n in range(2, 201) ]

Following the ideas in Zimmer’s article, I made a list of all the integers with no prime factor larger than 5:

max_fives = [ x[0] for x in all if max(x[1]) <= 5 and x[0] > 5 ]

As you can see, I also eliminated 2, 3, 4, and 5 from this list, as I didn’t want to include such short periods. Here are the 41 integers in max_fives:

[6, 8, 9, 10, 12, 15, 16, 18, 20, 24, 25, 27, 30, 32, 36,
40, 45, 48, 50, 54, 60, 64, 72, 75, 80, 81, 90, 96, 100,
108, 120, 125, 128, 135, 144, 150, 160, 162, 180, 192, 200]

So if we’re talking about periods on the scale of Phyllostachys bambusoides, values whose only prime factors are 2, 3, and 5 aren’t especially rare. For comparison, there are 43 prime numbers in that range.
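
The comparison can be reproduced without the factor lists. This is only a sketch: the helper names `is_five_smooth` and `is_prime` are mine, and I am assuming "that range" means 6 through 200, the same span covered by max_fives.

```python
def is_five_smooth(n):
    """True if n has no prime factor larger than 5."""
    for p in (2, 3, 5):
        while n % p == 0:
            n //= p
    return n == 1

def is_prime(n):
    """Simple trial-division primality test, fine for small n."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

candidates = range(6, 201)  # periods longer than 5 years, up through 200
smooth = [n for n in candidates if is_five_smooth(n)]
primes = [n for n in candidates if is_prime(n)]
print(len(smooth), len(primes), len(candidates))  # 41 43 195
```

So within these 195 integers, the small-prime products are almost exactly as common as the primes.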

And apparently there’s some fuzziness to the periods of these bamboo species. Note that Zimmer says “the cycles are tightly clustered around numbers that can be factored into small prime numbers” (emphasis mine). Let’s expand our results to integers that are within one of the numbers in max_fives.

clustered = set()
for n in max_fives:
  clustered |= {n - 1, n, n + 1}

There are 105 integers in clustered, which is over half of the integers in question. This makes me suspicious of that p value of 0.0041 quoted in Zimmer’s article.
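
One detail worth checking is whether the count leans on neighbors that fall outside the range. The sketch below rebuilds the numbers from scratch, generating them as products of powers of 2, 3, and 5 rather than by factoring, and then clips the set to 6 through 200. The names `LIMIT`, `smooth`, and `in_range` are my own and purely illustrative.

```python
LIMIT = 200

# Build the 5-smooth numbers directly as products 2**a * 3**b * 5**c.
smooth = set()
p2 = 1
while p2 <= LIMIT:
    p3 = p2
    while p3 <= LIMIT:
        p5 = p3
        while p5 <= LIMIT:
            smooth.add(p5)
            p5 *= 5
        p3 *= 3
    p2 *= 2
max_fives = sorted(n for n in smooth if n > 5)

# Every integer within one of a number in max_fives.
clustered = set()
for n in max_fives:
    clustered.update((n - 1, n, n + 1))

# Drop the neighbors that fall outside 6..LIMIT (here, 5 and 201).
in_range = {n for n in clustered if 6 <= n <= LIMIT}
print(len(clustered), len(in_range), LIMIT - 5)  # 105 103 195
```

Two of the 105 are 5 and 201, which sit just outside the range. Even after dropping them, 103 of the 195 integers from 6 through 200 remain, which is still over half.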

You may argue that by restricting the sample space to integers up through 200, I’ve cooked the books to make these small integer multiples more common than they would be if we looked at a larger range. I thought it was reasonable to stop at 200, since the longest cycle we know of is 120 years. But you’re right that multiples of small numbers become less common as the range gets larger. If we’d defined all to go up to 500, for example, then max_fives would have only 62 integers and clustered would have only 168. This is a distinctly lower proportion of the total, but it’s still substantial.
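
To make the dependence on the upper limit easy to poke at, the whole calculation can be wrapped in a function. This is a sketch under my own naming (`smooth_counts` is hypothetical), and it measures the proportion against the integers from 2 through the limit, matching how all was defined.

```python
def smooth_counts(limit):
    """Return (count of 5-smooth integers in 6..limit, size of the within-one set)."""
    def is_five_smooth(n):
        for p in (2, 3, 5):
            while n % p == 0:
                n //= p
        return n == 1

    max_fives = [n for n in range(6, limit + 1) if is_five_smooth(n)]
    clustered = {m for n in max_fives for m in (n - 1, n, n + 1)}
    return len(max_fives), len(clustered)

for limit in (200, 500):
    n_smooth, n_clustered = smooth_counts(limit)
    total = limit - 1  # integers from 2 through limit
    print(limit, n_smooth, n_clustered, round(n_clustered / total, 2))
```

This gives 41 and 105 for a limit of 200, and 62 and 168 for a limit of 500, so the clustered share drops from roughly 53% to roughly 34%.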

I could, of course, buy the original paper if I really wanted to know how the authors came up with that p value, but scholarship isn’t a hallmark of blog posts. Plus, I mainly just wanted to play around in IPython.

Update 5/23/15 10:04 AM
OK, fine. Aaron Meyer tweeted me a link to a PDF that describes the statistical work. It’s 37 pages long, and I certainly haven’t dug into it properly, but the gist seems to be that although numbers that factorize into small primes (NFSP) are relatively common, what’s uncommon (and leads to the low p value) is that so many bamboo species that have long-period synchronized flowering have cycles that match NFSPs. This is analogous to coin flipping: flipping a coin and getting heads is no big deal, but flipping ten coins and having them all come up heads is. Zimmer’s article didn’t say how many species of bamboo were studied, but I suppose I should have guessed that it was more than just the three he mentioned.
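
The coin-flip arithmetic is easy to sketch. To be clear, this is not the paper's statistical test. The function name `prob_all_hits`, the species count of 10, and the per-species probability of 105/199 (the clustered share I found for 2 through 200) are all illustrative assumptions of mine, and the sketch treats the species as independent.

```python
from fractions import Fraction

def prob_all_hits(p, k):
    """Probability that k independent trials all succeed, each with probability p."""
    return p ** k

# Ten fair coins all landing heads.
print(prob_all_hits(Fraction(1, 2), 10))  # 1/1024

# Hypothetical bamboo version: assume each species' period lands on or next to
# an NFSP with probability 105/199, and assume 10 species. Both numbers are
# illustrative, not taken from the paper.
print(float(prob_all_hits(Fraction(105, 199), 10)))
```

Even with a per-species chance a bit better than a coin flip, ten matches in a row comes out well under 1%, which is the flavor of the argument as I understand it.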

Update 5/23/15 5:42 PM
More biology enthusiasts here than I’d’ve guessed. Reader Bror Jonsson emailed me a link to the original paper, thus screwing Wiley out of tens of dollars.[2] If you look through it, you’ll see that the authors do discuss periodical cicadas, which was also pointed out to me on Twitter by Pete Carlton.

  1. Gould’s essay was written in the 70s, shortly after Janzen’s work, and includes no theories on why bamboo would have periods of 32, 60, and 120 years. 

  2. Extra points to Bror for having a Bob Dylan quote in his email signature.