Wednesday, January 04, 2017

UAH's 2016 Temperature: "Tied" with 1998, or a 66% Probability 2016 is Higher?

1/6: See update, below.

Yesterday Roy Spencer posted on his blog, with the headline "Global Satellites: 2016 not Statistically Warmer than 1998."

According to UAH's model, the average temperature of the lower troposphere for 2016 was 0.50°C, and for 1998 it was 0.48°C. UAH puts the uncertainty of the annual value at σ=0.05°C, so the 2σ error bars (which give, pretty approximately, the 95% confidence limits) are 0.10°C.

So Roy concluded the two years are tied:
...they are basically tied, statistically. So to say 2016 is the warmest would be dishonest, since it ignores uncertainty in the measurements: a 0.02 deg. C change over 18 years cannot be reliably measured with any of our temperature monitoring systems.
John Christy apparently said the same.

Of course, 0.50°C is larger than 0.48°C, so I think there's some sleight of hand here, spun, I suspect, for the sake of headlines in Breitbart, Climate Depot, the Daily Caller and those kinds of deniers.

I asked Roy what the probability is that 2016 was the warmer of the two years, but got no response from him. So I tried to calculate it myself; see what you think.

I assumed the two annual temperatures were each normally distributed, with the mean (best estimate) at the published numbers, 0.48°C and 0.50°C, and a standard deviation for each of σ=0.05°C.

So the picture is two Gaussian curves, side-by-side, with 2016's curve 0.02°C to the right of 1998's curve, both having a standard deviation of σ.

To calculate the probability that 1998 is the warmest year, I took it to be the area under its Gaussian curve from 2016's best estimate out to infinity.

0.02°C is a small difference between the two years' best estimates, but it's also 0.4 standard deviations (0.02°C/σ), and that isn't so small.

Normalizing the coordinates to unitless numbers, we want the area under the 1998 curve from x=0.4 to infinity.

The area of the normal distribution to the right of 0.4 is 0.3446, from this handy table. (It's straightforward to calculate, too, using the error function (erf(x)), but I got too tied up in getting the factors of 1/2 and √2π and the like correct, especially between the Wikipedia function and the Excel functions, so I just looked it up.) So

probability 1998 is the warmest year = 34%.

The probability 2016 was the warmest year is the complement of this, since we're only considering two possible years (the third highest annual average, 2010, is 0.33°C, so not even close):

probability 2016 is the warmest year = 66%.
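For anyone who would rather skip the table lookup, here is a minimal Python sketch of the same tail-area calculation. It follows the assumptions stated above (each annual value normally distributed with σ=0.05°C) and uses the standard library's complementary error function; the function and variable names are only illustrative.

```python
import math

def upper_tail(z):
    """Area under the standard normal curve to the right of z."""
    # Q(z) = 0.5 * erfc(z / sqrt(2)); math.erfc is in the standard library.
    return 0.5 * math.erfc(z / math.sqrt(2))

sigma = 0.05          # 1-sigma uncertainty of an annual value, in deg C
gap = 0.50 - 0.48     # 2016 best estimate minus 1998 best estimate

z = gap / sigma                 # about 0.4 standard deviations
p_1998 = upper_tail(z)          # area of the 1998 curve beyond 2016's best estimate
p_2016 = 1 - p_1998             # complement, with only two candidate years

print(f"z = {z:.2f}, P(1998) = {p_1998:.4f}, P(2016) = {p_2016:.4f}")
```

This reproduces the table value of 0.3446 for the area to the right of 0.4.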

The chance 2016 was the warmest year in UAH's records is twice that of 1998.

This may not be the most mathematically rigorous way to do this, and I don't know if it's how Gavin Schmidt does it. But 66% isn't a surprising result; it "seems" believable (remember, it's 0.4 standard deviations higher).

So, statistical tie, or 2-1 odds?


Update, 1/6: Based on Nate's calculation on Roy Spencer's blog, I now think the right answer is his 61%. He used a 1-sigma margin of error of σ*sqrt(2) for the difference. For the difference of two numbers, D=Y-X, where X and Y each have a 1-sigma margin of error of σ, the 1-sigma MOE of D is sqrt(σ²+σ²)=σ*sqrt(2) = 0.07°C in this case. The best estimate of D then sits 0.02/(0.05*sqrt(2))=0.28 standard deviations above zero. The area to the right of 0.28 is, by the table listed above, 38.97%, so the complement of that, the probability that 2016 was the warmest year, is 61%.
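Written out step by step, the calculation in the update looks like the following. This is a sketch that assumes the errors in the two annual values are independent (no covariance term); Φ denotes the standard normal cumulative distribution, a symbol introduced here for convenience.

```latex
\begin{align*}
D &= Y - X, \qquad \mu_D = 0.50 - 0.48 = 0.02\ ^\circ\mathrm{C} \\
\sigma_D &= \sqrt{\sigma^2 + \sigma^2} = \sigma\sqrt{2}
          = 0.05\sqrt{2} \approx 0.071\ ^\circ\mathrm{C} \\
z &= \frac{\mu_D}{\sigma_D} = \frac{0.02}{0.05\sqrt{2}} \approx 0.28 \\
P(D > 0) &= \Phi(z) = 1 - \bigl(1 - \Phi(0.28)\bigr) \approx 1 - 0.39 = 0.61
\end{align*}
```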


This is worth reading; I'll write about it in my next post:

“The Myth Of The Statistical Tie,” David Drumm, 10/6/2012


Unknown said...

Dr Spencer and Dr Christy are SO transparent. I wonder about these Christians sometimes, isn't one of their religious Commandments "thou shalt not bear false witness"?

2016 is clearly the warmest on record in the context of the data set as 0.50C is higher than 0.48C. Whether this result is statistically significant is another discussion - however that does not change the result.

Yes, false news in action again. They are spinning the headline to downplay the result.

JoeT said...

David, I think you're close. You have 2 normal distributions X with mean mu1, sigma1 and Y with mu2, sigma2. Then Z= X - Y and we want the probability that Z > 0. Then we know that Z has mean mu=mu1-mu2 and sigma = sqrt(sigma1^2+sigma2^2). There is also a covariance term for sigma, but I'm assuming that the distributions are uncorrelated.

If phi is the cumulative distribution (the integral from zero to some z), then the probability that Z > 0 = 1 - phi(-mu/sigma)=phi(mu/sigma)

Now we know (see here, for example) that phi(x) = 0.5*erfc(-x/sqrt(2))

so that the probability that Z>0 = phi(mu/sigma) = 0.5*erfc((-mu/sigma)/sqrt(2))
since mu = mu1 - mu2 and sigma = sqrt(2)*sigma1 if sigma1=sigma2, then

P(Z>0) = 0.5*erfc(-(mu1-mu2)/(2*sigma1))

In your problem where sigma1 = 0.05 and mu1-mu2 = 0.02, the P(Z>0) = 61%
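The formula above can be checked with a few lines of Python. This is only a sketch under the same assumptions as the comment (two uncorrelated normal distributions); the function name `prob_greater` is made up for illustration.

```python
import math

def prob_greater(mu1, mu2, sigma1, sigma2):
    """P(X > Y) for independent X ~ N(mu1, sigma1^2) and Y ~ N(mu2, sigma2^2)."""
    mu = mu1 - mu2                               # mean of Z = X - Y
    sigma = math.sqrt(sigma1**2 + sigma2**2)     # std dev of Z, no covariance term
    # phi(x) = 0.5 * erfc(-x / sqrt(2)), evaluated at x = mu / sigma
    return 0.5 * math.erfc(-(mu / sigma) / math.sqrt(2))

# UAH values from the post: 2016 = 0.50, 1998 = 0.48, sigma = 0.05 for each year
p = prob_greater(0.50, 0.48, 0.05, 0.05)
print(f"P(2016 warmer than 1998) = {p:.3f}")
```

With equal sigmas this reduces to 0.5*erfc(-(mu1-mu2)/(2*sigma1)), and the result rounds to 61%.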

JoeT said...

Three additional points I would make.

1- First, I meant to say phi is the cumulative distribution (the integral from minus infinity to some z) --- not the integral from zero to some z.

2- I see that Nate over at Spencer's blog got the same 61% as I did. I am pretty confident this is the way to do the probability correctly. A couple of years ago I even did a Monte Carlo test comparing two normal distributions with different means and the same standard deviation. It gave the same result as the analytic expression:
0.5*erfc(-(mu1-mu2)/(2*sigma1)) = 0.5*(1 + erf((mu1-mu2)/(2*sigma1)))

3- There was a University of Alabama Huntsville press release one can find here titled "2016 Edges 1998 as Warmest Year on Record". There is a quote from John Christy that says:

"Because the margin of error is about 0.10 C, this would technically be a statistical tie, with a higher probability that 2016 was warmer than 1998."

So even Christy admits that there is a higher probability that 2016 was warmer. However, I have no idea what a 'statistical tie' means.
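The kind of Monte Carlo test mentioned in point 2 can be sketched roughly as follows. This is not the original test, just an illustration under the same assumptions (two independent normal distributions with different means and the same standard deviation); the function names, trial count, and seed are arbitrary choices.

```python
import math
import random

def monte_carlo_prob_warmer(mu1, mu2, sigma, n_trials=200_000, seed=0):
    """Estimate P(X > Y) by drawing independent normal samples."""
    rng = random.Random(seed)   # fixed seed so the run is repeatable
    hits = 0
    for _ in range(n_trials):
        x = rng.gauss(mu1, sigma)
        y = rng.gauss(mu2, sigma)
        if x > y:
            hits += 1
    return hits / n_trials

def analytic_prob_warmer(mu1, mu2, sigma):
    """Analytic expression: 0.5 * (1 + erf((mu1 - mu2) / (2 * sigma)))."""
    return 0.5 * (1 + math.erf((mu1 - mu2) / (2 * sigma)))

estimate = monte_carlo_prob_warmer(0.50, 0.48, 0.05)
exact = analytic_prob_warmer(0.50, 0.48, 0.05)
print(f"Monte Carlo: {estimate:.3f}, analytic: {exact:.3f}")
```

With a couple hundred thousand trials the sampling noise is on the order of 0.001, so the two numbers should agree to about two decimal places.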

David Appell said...

Joe, good comments. Nice catch on the press release.

David Appell said...

Joe: Nate got 61% because he used 0.05*sqrt(2) as the 1-sigma error for the difference between T(2016) and T(1998).

I think he's right. For the difference of two numbers, D=Y-X, where X and Y each have a 1-sigma margin of error of, call it s, then the 1-sigma MOE of D is sqrt(s^2+s^2)=s*sqrt(2).

JoeT said...

I agree with Nate. If you look at my post above I wrote that sigma = sqrt(2)*sigma1 for the difference in two normal distributions. There is a covariance term as well, but it's reasonable to expect that to be zero. It's nice that we all get the same answer now.

However, the bigger picture here is what credibility does the UAH TLT data set have? If you check out the recent RSS news release you'll notice the following:

"RSS TLT version 3.3 contains a known cooling bias. We are working to eliminate the bias in the new version of TLT. Even with these known cooling biases, 2016 was a record warm year in TLT v3.3."

Since RSS v3.3 is virtually identical to UAH v6 it's not surprising that they both found 2016 to be 0.02C warmer than 1998. The cooling bias in the TLT data is the reason why the 61% probability isn't really accurate to begin with (the math is correct but the true errors involved are unknown to me at the moment)

Regardless of the politics, I'd be interested in understanding better why it seems to be so much harder to compensate for the diurnal drift in TLT than for TTT.

JoeT said...

I thought I linked to the RSS news release. It's here.

David Appell said...

Joe, yes, you linked above. But thanks for checking.

I think we've all narrowed in on an answer: Nate is right. The probability that 2016 was warmer than 1998 is 61%.