Saturday, April 07, 2012

Roy Spencer's Entertaining Polynomial Suggests End-of-Days

Roy Spencer's fit to the UAH data predicts the world will end in December 2170.

Every month when he announces the UAH lower troposphere global temperature anomaly for the previous month, he includes on his graph a 3rd-order polynomial fit

and writes, "The 3rd order polynomial fit to the data (courtesy of Excel) is for entertainment purposes only, and should not be construed as having any predictive value whatsoever."

Recently some people (including me) have complained, in part because the caveat doesn't always get replicated when the graph is reproduced, and in fact Spencer himself doesn't always replicate the caveat. That's misleading.

Some people defend it. Someone named "Bill A." even thinks it's "on the short list of those approaching a realistic possibility."

Hmm. Let's see.

If you use Excel to do a 3rd-order polynomial fit to the UAH LT data you get

temperature anomaly = aD^3 + bD^2 + cD + d

where D is the date (as an integer; in Excel, D=1 is January 1, 1900), and where up through March 2012 the best fit is

a = -1.13719E-12
b = 1.21004E-07
c = -0.004228872
d = 48.51822318

This function peaks in November 2008. Hmm. For January 2020 it "predicts" the temperature anomaly will be -0.13°C, and it only gets worse from there.

By January 2035 it "predicts" an anomaly of -2.1°C.
By January 2050 it "predicts" -7.0°C,

and by January 2171 the anomaly will be -287.45°C, which, if you take the baseline to be 14°C, is below absolute zero and violates the 3rd law of thermodynamics.
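For anyone who wants to check these numbers without Excel, here is a minimal Python sketch. It assumes the four coefficients exactly as quoted above (they are rounded, so the last digits can drift) and assumes the usual conversion between Excel's 1900 date serials and calendar dates, which for dates after February 1900 amounts to counting days from December 30, 1899. The function names are mine, chosen for illustration.

```python
import math
from datetime import date, timedelta

# Coefficients as quoted in the post (rounded, so results are approximate).
A = -1.13719e-12
B = 1.21004e-07
C = -0.004228872
D0 = 48.51822318

# Assumption: Excel 1900-system serials equal days since 1899-12-30
# for any date after February 1900.
EXCEL_EPOCH = date(1899, 12, 30)


def excel_serial(d):
    """Calendar date -> Excel serial day number."""
    return (d - EXCEL_EPOCH).days


def from_serial(n):
    """Excel serial day number -> calendar date (rounded to the nearest day)."""
    return EXCEL_EPOCH + timedelta(days=round(n))


def cubic_anomaly(d):
    """Evaluate aD^3 + bD^2 + cD + d at a calendar date (Horner form)."""
    x = excel_serial(d)
    return ((A * x + B) * x + C) * x + D0


def peak_date():
    """Local maximum of the cubic: the larger root of 3a x^2 + 2b x + c = 0."""
    disc = (2 * B) ** 2 - 4 * (3 * A) * C
    root = (-2 * B - math.sqrt(disc)) / (2 * 3 * A)  # a < 0, so this is the larger root
    return from_serial(root)


print("peak:", peak_date().isoformat())
for d in (date(2020, 1, 1), date(2035, 1, 1), date(2050, 1, 1), date(2171, 1, 1)):
    print(d.isoformat(), round(cubic_anomaly(d), 2))
```

Because the cubic is a small difference of large terms, values computed from the rounded coefficients may differ from Excel's in the second decimal place.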

Of course, it's absurd to project a fit that far, no matter what degree polynomial it is. But that's part of the point -- the 3rd-order fit seems to have been selected because it is the one that shows the data peaking and declining in the near future. (Actually, Excel says a 6th-order polynomial is a better fit, with a slightly higher R^2 value.)

On the other hand, a linear fit "predicts" the anomaly will be about 0.7°C in 2050.

So, if you had to make a fit, which seems more realistic: 0.7°C, or -7°C?

But besides any of that, I think the data keepers should, above all, not be the ones using it for entertainment, especially in a way that seems to support their inclinations. Not everyone is going to get the joke.


charlesH said...

"is for entertainment purposes only"

I guess you were entertained.

Piltdown said...

It might be worth noting that a straight line is a first-order polynomial. I look forward to every graph showing linear least-squares fits to obviously non-'trend+iid Gaussian' data bearing similar caveats.

Or perhaps not everybody got Spencer's little joke?

Anonymous said...

David - you realise that Excel does not offer sine regression fit?

And that sine curves fit rather well the long term 65 or so year cyclic signals in HadCRUT v3, PDO, AMO and ENSO? Which over a half wavelength this particular polynomial simulates?

Dr Spencer is being 'playful' because he knows many climate scientists do not like this empirical statistical treatment of the data. Especially when the oceanic cycle can then be shown to account for 1/3 of temperature rise between 1900-2000 due to the respective trough and peak endpoint selection that these dates represent.

You might like to try a sinusoidal regression fit yourself, which can be easily done with a statistical package.

David Appell said...

Anon, you can do sine regression if you take the independent variable to be the sine of the year -- but you'd have to pick a frequency first. Then you could determine the phase angle and amplitude.
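To make that concrete, here is a rough Python sketch of a fixed-frequency sine regression. Once the period is chosen, the model y = c0 + c1*t + s*sin(wt) + c*cos(wt) is linear in its coefficients, so ordinary least squares gives them directly, and the amplitude and phase follow from s and c. The linear term, the function names, and the small hand-rolled solver are my own choices for illustration; nothing here is fitted to real temperature data.

```python
import math


def solve(M, v):
    """Solve M x = v by Gaussian elimination with partial pivoting."""
    n = len(v)
    aug = [row[:] + [v[i]] for i, row in enumerate(M)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= f * aug[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x


def fit_sine_plus_trend(t, y, period):
    """Least-squares fit of y = c0 + c1*t + R*sin(w*t + phase), with w fixed by `period`."""
    w = 2 * math.pi / period
    rows = [[1.0, ti, math.sin(w * ti), math.cos(w * ti)] for ti in t]
    # Normal equations: (X^T X) beta = X^T y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(4)] for i in range(4)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(4)]
    c0, c1, s, c = solve(xtx, xty)
    # s*sin(wt) + c*cos(wt) = R*sin(wt + phase), with R*cos(phase) = s and R*sin(phase) = c
    return {
        "intercept": c0,
        "slope": c1,
        "amplitude": math.hypot(s, c),
        "phase": math.atan2(c, s),
    }


# Made-up series: a trend plus a 65-"year" sine, purely to exercise the fit.
w65 = 2 * math.pi / 65
t = list(range(130))
y = [0.1 + 0.005 * ti + 0.2 * math.sin(w65 * ti + 0.7) for ti in t]
print(fit_sine_plus_trend(t, y, period=65))
```

The fit will return an amplitude and phase for whatever period you hand it, which is exactly why the choice of frequency has to be justified separately.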

A sine curve wouldn't fit HadCRUT3 unless you added a term that had it increasing (approximately linearly) in time.

And you can always find apparent cycles in a time series of finite length, simply by doing Fourier analysis. That doesn't mean those cycles are real -- just an artifact of the boundary conditions. That may well be the case with the apparent 65-yr cycle, since we only have 1.5 cycles of records (and, as far as I know, only 110 years for the PDO).

Where is it shown how much the heat change is in, say, the Pacific Ocean between the maximum and minimum of the PDO?

And even if Spencer is "having fun," other people take him seriously, and Spencer himself fails to reproduce his caveat. My point was that there's no physical reason to expect the temperature to be a third-order polynomial.

Piltdown said...

All attempts to fit lines to data are always based on a statistical model of signal and noise. You assume the signal and noise are of particular forms, and you ask what is most likely given that assumption. If you assume a linear trend, that is what you will find, even if none actually exists. If you assume a cubic polynomial, or a sinusoid, or a sum of sinusoids of different frequencies, or a narrow bandwidth signal centred on zero, that's what you'll find. It doesn't mean that's what's happening. The choice of statistical model has to be justified on other grounds, such as the physics, and your conclusions are only as reliable as that justification.

Climate graphs in all sorts of places slap on linear fits and smoothed lines with no such justification. Some authors try to draw conclusions from them. Others put the lines on without comment, and allow the unwary reader to draw their own conclusions.

It takes advantage of the relevance fallacy - people will assume that if you give them a piece of information it must be because it is relevant, and try to fit the information into their interpretation. Guiding lines on graphs lead the eye. The result is usually to make a conclusion 'obvious' that is not in fact justified by the data. It's often honestly done: the author has drawn their own conclusion and believes it to be true, and is simply trying to make the explanation clearer.

But with a different statistical model the 'obvious' conclusion changes, and seeing it, you start to realise that the conclusion isn't in the data after all.

Depending on your model, you can make it look like the data is still climbing upwards or peaking and turning downwards. All the orthodox climate scientists always make sure it's still climbing upwards. Spencer's joke is to show what they're doing.

Of *course* Spencer picked it deliberately to show a result favourable to scepticism. And I believe he chose not to explain it deliberately, as many people using linear trends don't explain, to provoke just such a reaction. The idea is to get *you* to bring the subject up, discuss it, publicise it, and in the process to explain how your *own* data-fitting tricks work.

If Spencer explained it, none of you would listen. The joke is that he gets you to explain his point for him, while thinking that you're correcting his mistake.

When I argue this point, my usual approach is to give someone an AR(1) process to play with, and get them to fit a straight line to the first couple of hundred points. "Do you see a trend?" I ask. Then I show them a few thousand points. Clearly there is no trend: what they had interpreted as signal was just noise.
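For anyone who wants to try that exercise, a minimal Python sketch follows. The AR(1) coefficient of 0.99, the unit noise variance, the seed, and the series lengths are arbitrary choices of mine for illustration, and the function names are hypothetical.

```python
import random


def ar1(n, phi=0.99, sigma=1.0, seed=0):
    """Generate n points of x[t] = phi * x[t-1] + Gaussian noise, with no trend built in."""
    rng = random.Random(seed)
    x, out = 0.0, []
    for _ in range(n):
        x = phi * x + rng.gauss(0.0, sigma)
        out.append(x)
    return out


def ols_slope(y):
    """Ordinary least-squares slope of y against its index 0, 1, 2, ..."""
    n = len(y)
    tbar = (n - 1) / 2
    ybar = sum(y) / n
    num = sum((t - tbar) * (v - ybar) for t, v in enumerate(y))
    den = sum((t - tbar) ** 2 for t in range(n))
    return num / den


series = ar1(5000)
print("slope over first 200 points:", ols_slope(series[:200]))
print("slope over all 5000 points: ", ols_slope(series))
```

Changing the seed gives a different short-window "trend" each time, sometimes up and sometimes down, while the long-run slope stays close to zero.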

There are lots of even better ways to do it, I'm sure.

When you're equally hard on both sides for doing it, the joke will lose its force.

NnN said...

"Climate graphs in all sorts of places slap on linear fits and smoothed lines with no such justification."

The justification for linear fits is in the physical models of climate that predict an approximately linear path of warming with an approximately linear increase in energy gain.

The justification for smoothing the data is to reduce decadal noise from, e.g., ENSO, which again is a physical justification.

On the other hand, there is no physical justification for fitting arbitrary curves to the data. There is no physical basis for global temperature to follow a third-order polynomial over any particular period. It's just an exercise in meaningless nonsense, or "entertainment" as Spencer puts it.

Piltdown said...

"The justification for linear fits is in the physical models of climate that predict an approximately linear path of warming with an approximately linear increase in energy gain."

1. Exactly. You start with a model and you find just what you're looking for.

2. None of the climate models predict straight lines.

"The justification for smoothing the data is to reduce decadal noise from eg ENSO"

ENSO isn't noise. It's real weather/climate. Rounding errors and station moves and UHI are noise.

Jonas N said...

Entertaining indeed. David Appell writes blog posts and comments 'discussing' what a polynomial fit will do as it extends to +/- infinity. Even tries to make a 'clever' point out of it.

And as has been pointed out already: fitting a straight line (linear 'trend') to observations and/or models which give quite large variations and fluctuations on similar timescales makes just as little sense.

Bottom line: Predictions by extrapolating curve fits into the future carry no explanatory value. A physical model capturing all relevant mechanisms might. But just might .. mind you!

David Appell said...

Jonas, it makes much more sense to fit the data to a linear trend than to a 3rd-order polynomial.

All smooth functions can, over "short" periods of time, be approximated by a linear fit -- it's just the first-order term in their Taylor expansion.

David Appell said...

I didn't do a sine regression fit, I did a 3rd order polynomial fit, which Excel does offer.

David Appell said...

Jonas N said...
"Bottom line: Predictions by extrapolating curve fits into the future carry no explanatory value."

That was exactly my point! (Yet Roy Spencer did it anyway.)