Thursday, June 05, 2014

How Often is a Temperature Trend Significant?

I've finally learned enough Python in my spare time to calculate something I've wanted to do for a while: begin to answer how often past temperature trends have been significant.

That is, if you calculate all possible N-year linear trends for a dataset, how often is their statistical significance 95% or higher, for each N?

Not as often as you might think.

I used HadCRUT4 surface data, which begins in January 1850. So for N=2 (2 years), you can calculate a linear trend from 1/1850 to 12/1851, from 2/1850 to 1/1852, from 3/1850 to 2/1852, etc., up to 24 months before the present.
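As a sanity check on that counting, here's a quick sketch. The end month (April 2014, consistent with the post date and "24 months before the present") is my assumption:

```python
# Sketch: counting the overlapping N-year windows a monthly record allows.
# End date of April 2014 is an assumption for illustration.
months_total = (2014 - 1850) * 12 + 4   # Jan 1850 through Apr 2014
window = 2 * 12                          # N = 2 years of monthly data
n_windows = months_total - window + 1    # slide the window one month at a time
print(n_windows)                         # 1949
```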

Of these 1,949 trends for N=2, how many have been statistically significant at the 95% level?

50%, if you don't include any autocorrelation. Obviously, with autocorrelation it will be smaller, because the error bars on the trend will be bigger.
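For one window, the significance test is just an ordinary least-squares fit and a two-sided test on the slope. A minimal sketch, using synthetic numbers in place of HadCRUT4 anomalies (the trend and noise amplitudes here are made up), and ignoring autocorrelation as the post does:

```python
# Sketch: is one 2-year monthly trend significantly different from zero
# at the 95% level? No autocorrelation correction, as in the post.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
t = np.arange(24) / 12.0                        # 2 years of monthly steps, in years
temps = 0.02 * t + rng.normal(0, 0.1, t.size)   # made-up trend + noise

fit = stats.linregress(t, temps)
significant = fit.pvalue < 0.05                 # two-sided test on the slope
print(fit.slope, fit.pvalue, significant)
```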

I did this for N from 2 years to 99 years. Here's my result:

So 10-year linear trends of HadCRUT4 have only been statistically significant 65% of the time. 20-year trends 76% of the time. Etc. (This assumes, as usual, that the residuals of the linear fit are normally distributed.)
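The full calculation is the same test repeated over every window and every N. A sketch of that loop, again with a synthetic series standing in for HadCRUT4 (the trend and noise values are assumptions, and only a few N are shown to keep it fast):

```python
# Sketch of the whole analysis: for each window length N, slide an N-year
# window across a monthly series and record the fraction of windows whose
# OLS slope has p < 0.05. No autocorrelation correction, as in the post.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n_months = 1972                                     # Jan 1850 - Apr 2014
t = np.arange(n_months) / 12.0
series = 0.005 * t + rng.normal(0, 0.2, n_months)   # made-up warming + noise

def fraction_significant(y, n_years):
    w = n_years * 12
    x = np.arange(w) / 12.0
    n_windows = len(y) - w + 1
    hits = sum(stats.linregress(x, y[i:i + w]).pvalue < 0.05
               for i in range(n_windows))
    return hits / n_windows

for n in (2, 10, 20):                               # the post runs N = 2..99
    print(n, fraction_significant(series, n))
```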

And, mind you, this is without any autocorrelation factored in, not even lag-1 autocorrelation between a temperature point and its nearest neighbor, let alone the exponential-decay method of calculating autocorrelation described in Foster and Rahmstorf (2011) and used by the Skeptical Science trend calculator.
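For the lag-1 case, one common approach (the Santer-style effective-sample-size adjustment, not Foster and Rahmstorf's method) is to estimate the lag-1 autocorrelation of the residuals and shrink n before computing the slope's uncertainty. A sketch under that assumption:

```python
# Sketch: a lag-1 (AR(1)) correction to the trend test. The residuals'
# lag-1 autocorrelation r1 shrinks the sample size to an "effective" n,
# which widens the slope's error bars. This is one standard adjustment,
# not necessarily the one the post later uses.
import numpy as np
from scipy import stats

def trend_significant_ar1(y, alpha=0.05):
    n = len(y)
    x = np.arange(n, dtype=float)
    fit = stats.linregress(x, y)
    resid = y - (fit.intercept + fit.slope * x)
    r1 = np.corrcoef(resid[:-1], resid[1:])[0, 1]   # lag-1 autocorrelation
    n_eff = n * (1 - r1) / (1 + r1)                 # effective sample size
    df = max(n_eff - 2, 1)
    se = fit.stderr * np.sqrt((n - 2) / df)         # inflated standard error
    t_stat = fit.slope / se
    p = 2 * stats.t.sf(abs(t_stat), df=df)          # two-sided t-test
    return p < alpha, p
```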

I'll get to those soon. But in the meantime, even this basic calculation shows that talking about 10- or 15 or 18-year trends, like Werner Brozek likes to do at WUWT, is meaningless, just mindless numerology.

Nor, of course, is there anything magical about 95%. It means the result is significant to about 2 standard deviations, and it's the common benchmark for medical studies, but high-energy physicists (say) often require more, as, for example, the 5-sigma requirement for the discovery of the Higgs boson. In climate science, which isn't an experimental science (viz. you can't do as many repeated experiments as you want), you can't get that kind of data -- you take the data you can get. So we're supposed to take action if a warming trend is significant at 96%, but not at 94%, or 80%, or 60%? That's absurd, of course.

Numerology is not a substitute for thinking. And short intervals are not meaningful.


ATTP said...

Doug McNeall has a short post about statistical significance that seems relevant. Makes a similar argument to what you suggest at the end of your post.

William Connolley said...

> Temperture

Spelling. Otherwise, nice post, thanks.

You could play with generating a random series with such-and-such auto correlation, and see how that looked in your analysis.
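A sketch of that suggestion: generate an AR(1) series with a chosen lag-1 coefficient phi and no underlying trend, then see how often the naive OLS test calls its windows "significant". The phi, noise level, and window length here are all arbitrary choices for illustration:

```python
# Sketch of William Connolley's suggestion: a trendless random series
# with built-in lag-1 autocorrelation phi, run through the same naive
# (no-correction) significance test.
import numpy as np
from scipy import stats

def ar1_series(n, phi, sigma=0.1, seed=0):
    rng = np.random.default_rng(seed)
    y = np.zeros(n)
    for i in range(1, n):
        y[i] = phi * y[i - 1] + rng.normal(0, sigma)
    return y

y = ar1_series(1972, phi=0.6)          # same length as the HadCRUT4 record
w = 240                                # 20-year windows, monthly steps
x = np.arange(w) / 12.0
n_windows = len(y) - w + 1
hits = sum(stats.linregress(x, y[i:i + w]).pvalue < 0.05
           for i in range(n_windows))
print(hits / n_windows)                # "significant" windows despite no trend
```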

Frank1123581321 said...

Let's rephrase your question: "How often is a temperature trend significantly different from zero?" Why should a temperature trend be different from zero? How about during recovery from the Little Ice Age? Fortunately, even BEST hasn't tried to go that far back in time. However, even the IPCC admits that long-term forcing from GHGs couldn't be detected in the noise of natural unforced variability until the second half of the 20th century. And they detected this forced trend against the background of natural variability using climate models that may overestimate climate sensitivity and underestimate unforced variability.

A proper analysis of pre-1950 trends shouldn't find any that are significantly different from zero because all of those changes are believed to represent unforced variability. Unfortunately, detection of trends in chaotic systems can be quite tricky. See Lorenz, "Chaos, spontaneous climatic variations and detection of the greenhouse effect": "Consider now a second scenario where a succession of ten or more decades without extreme global-average temperatures is followed by two decades with decidedly higher averages .... Certainly no observations have told us the decadal mean temperatures are nearly constant under constant external influences."

Of course, you could detect statistically significant trends merely by chance. Sounds crazy - significance by chance. However, p values around 0.05 do occur by chance about 5% of the time. Data mining of this type turns up results that are meaningless despite their apparent statistical significance.
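That 5%-by-chance rate is easy to check directly. A sketch: fit trends to many samples of pure white noise (no trend at all) and count how often p falls below 0.05. The sample length and trial count are arbitrary:

```python
# Sketch: with pure white noise, a p < 0.05 trend test still fires
# roughly 5% of the time -- "significance" by chance alone.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
x = np.arange(120)                      # 10 years of monthly steps
trials = 2000
false_positives = sum(
    stats.linregress(x, rng.normal(size=120)).pvalue < 0.05
    for _ in range(trials)
)
print(false_positives / trials)         # expect roughly 0.05
```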