Wednesday, February 8, 2012

Context is everything

An interesting paper in Science by Stumpf and Porter takes a hard look at "general" power laws in science:
"A striking feature that has attracted considerable attention is the apparent ubiquity of power-law relationships in empirical data. However, although power laws have been reported in areas ranging from finance and molecular biology to geophysics and the Internet, the data are typically insufficient and the mechanistic insights are almost always too limited for the identification of power-law behavior to be scientifically useful (see the figure). Indeed, even most statistically “successful” calculations of power laws offer little more than anecdotal value."
One argument that they use is that power laws are a statistical feature (or artifact, depending on your point of view) of probabilistic theory:
"Suppose that one generates a large number of independent random variables xi drawn from heavy-tailed distributions, which need not be power laws. Then, by a version of the central limit theorem (CLT), the sum of these random variables is generically power-law distributed." 
So finding statistical evidence for power laws (which is not that straightforward, according to the authors) is not even the main problem; finding "a generative mechanism" should be the main focus of scientific pursuit. This is a roundabout way for these mathematicians to stress the importance of the scientific method, where the context (question, hypothesized causal mechanism, prediction) drives the choice of statistical test, which is then applied to new data.
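
For readers who want to see the statistical point in action, here is a minimal sketch in Python. It is my own illustration, not code from the paper: I assume Pareto-distributed terms with tail exponent 1.5, sum five of them at a time, and use the Hill estimator (a standard tail-exponent estimator that the authors do not necessarily use) on the largest sums. The function names, sample sizes, and cutoff `k` are all arbitrary choices. The sums keep a power-law-like tail even though nothing mechanistic is going on.

```python
import math
import random


def hill_estimator(samples, k):
    """Hill estimate of the tail exponent, using the k largest samples."""
    ordered = sorted(samples, reverse=True)
    threshold = ordered[k]  # the (k+1)-th largest value
    return k / sum(math.log(x / threshold) for x in ordered[:k])


def summed_heavy_tails(n_sums, terms_per_sum, alpha, rng):
    """Return n_sums values, each the sum of independent Pareto(alpha) draws."""
    return [
        sum(rng.paretovariate(alpha) for _ in range(terms_per_sum))
        for _ in range(n_sums)
    ]


# Assumed settings for illustration only; none of these come from the paper.
rng = random.Random(0)
alpha = 1.5
sums = summed_heavy_tails(n_sums=50000, terms_per_sum=5, alpha=alpha, rng=rng)

alpha_hat = hill_estimator(sums, k=200)
print(f"tail exponent of the terms: {alpha}")
print(f"Hill estimate for the sums: {alpha_hat:.2f}")
```

The estimate should land in the neighborhood of the exponent of the individual terms, but it shifts noticeably with the choice of `k` and the sample size, which fits the authors' warning that the data are typically insufficient to pin a power law down.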

This is one of the biggest hurdles when teaching the scientific method to undergrad (and sometimes graduate) students (and faculty?). We get so many questions that start out with "Is this a good test?", and we always have to get back to them with our standard reply, "Context is everything": this test means nothing to us without knowing the context. And now I can refer them to this article as a nice example of the fallacy of focusing on the stats in isolation from the rest.

The authors also provide a subtle argument against the seductive power of induction. Their concluding sentence reads:
"The most productive use of power laws in the real world will therefore, we believe, come from recognizing their ubiquity (and perhaps exploiting them to simplify or even motivate subsequent analysis) rather than from imbuing them with a vague and mistakenly mystical sense of universality."
