Monday, September 22, 2014

Trend since 1998—significant??

I had a question sent to me about the trend since 1998.  As many of you know, my last post included an analysis which showed that the linear regression trend since 1998 was statistically significant.

Trends versus start year.  Error bars are the 95% confidence intervals.
My questioner asked if I had accounted for autocorrelation in my analysis.  The short answer is "No, I did not."  The reason?  According to my analysis, it wasn't necessary.

Here are my methods and R code.

#Get coverage-corrected HadCRUT4 data and rename the first two columns
CW<-read.table("http://www-users.york.ac.uk/~kdc3/papers/coverage2013/had4_krig_annual_v2_0_0.txt", header=F)
names(CW)[1]<-"Year"
names(CW)[2]<-"Temp"

#Analysis for autocorrelation—I check manually as well but so far the auto.arima function has performed admirably.
library(forecast)
auto.arima(resid(lm(Temp~Year, data=CW, subset=Year>=1998)), ic=c("bic"))

The surprising result?
Series: resid(lm(Temp ~ Year, data = CW, subset = Year >= 1998))
ARIMA(0,0,0) with zero mean    

sigma^2 estimated as 0.005996:  log likelihood=18.23
AIC=-34.46   AICc=-34.18   BIC=-33.69
 I was expecting something on the order of ARIMA(1,0,1), which is the autocorrelation model for the monthly averages.  Taking the yearly average rather than the monthly average effectively removed autocorrelation from the temperature data, allowing the use of a white-noise regression model.

trend.98<-lm(Temp~Year, data=CW, subset=Year>=1998)
summary(trend.98)
Call:
lm(formula = Temp ~ Year, data = CW, subset = Year >= 1998)

Residuals:
     Min       1Q   Median       3Q      Max 
-0.14007 -0.05058  0.01590  0.05696  0.11085 

Coefficients:
              Estimate Std. Error t value Pr(>|t|)  
(Intercept) -19.405126   9.003395  -2.155   0.0490 *
Year          0.009922   0.004489   2.210   0.0443 *
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.08278 on 14 degrees of freedom
Multiple R-squared: 0.2587,    Adjusted R-squared: 0.2057
F-statistic: 4.885 on 1 and 14 DF,  p-value: 0.04425
The other surprise?  That the trend since 1998 was significant even with a white-noise model.  Sixteen data points is not normally enough to reach statistical significance unless a trend is very strong.

Temperature trend since 1998

Sunday, September 21, 2014

The "no warming" claim rises from the dead yet again.

Like a movie vampire, this one keeps coming back no matter how many stakes are driven through its heart.  I've covered this one (here, here, and here).  Bluntly: There is absolutely no evidence that global warming has stopped.  For global warming to stop, the Earth's energy balance must be either zero or negative.  The most recent estimates for the energy imbalance are generally between +0.5 W/m2 and +1.0 W/m2.  The only way the Earth is not going to warm while it is gaining energy is if the laws of thermodynamics magically do not apply.  If the Earth is gaining energy, some part of it, somewhere, must be getting warmer.  The heat must go into either melting ice, warming the oceans, warming the land, or warming the atmosphere (or some combination thereof).

Ah, but what about the atmosphere?  Has warming in the atmosphere stopped?  Again, a blunt "No."  Almost every claim of "no warming" comes from someone trying to start a linear trend line at 1998—an outlier year which saw the strongest El Niño event on record.  How do we know 1998 was an outlier?  Simple statistics.

First, I took coverage-corrected HadCRUT4 monthly temperature data between January 1970 and December 2013 and calculated the yearly average.  I then fitted a loess regression to that data.  The result?





I then plotted the residuals for that plot.  For those who are not statisticians, residuals are calculated by subtracting the predicted value based on the trend from each data point.  Residuals show how far each data point deviates from the trend, with a point right on the trend coming in with a residual of zero.  Taking the residuals also removes the trend from the data, which is handy for analyses for cycles and the like that occur around the trend.
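For anyone who wants to reproduce this, here's a minimal sketch of the annual averaging, the loess fit, and the residuals (assuming the monthly coverage-corrected HadCRUT4 data are already loaded as a data frame called cw.monthly with numeric Year and Temp columns; those names are mine, not part of any official file):

#Sketch only: annual means, loess fit, and residuals
#Assumes a data frame cw.monthly with numeric Year and Temp columns (placeholder names)
annual <- aggregate(Temp ~ Year, data = subset(cw.monthly, Year >= 1970 & Year <= 2013), FUN = mean)

#Fit the loess regression and extract the residuals
fit.loess <- loess(Temp ~ Year, data = annual)
annual$Resid <- residuals(fit.loess)

#Plot the residuals and highlight 1998
plot(Resid ~ Year, data = annual, pch = 19, ylab = "Residual (°C)")
with(subset(annual, Year == 1998), points(Year, Resid, pch = 19, col = "red"))
abline(h = 0, lty = 2)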


Note that bright red point?  That's 1998.  Notice that it deviates farther from the trend than any of the points after it?  That's important as it essentially tells us that if we start a linear trend from 1998, that single data point will influence the trend more than any other year from 1999 to 2013.  Now I can hear someone saying, "But that's with a fancy-schmancy nonlinear regression.  What does plain old linear regression show?"






The residual plot reveals the following:


It doesn't really matter which regression technique you use.  1998 was an outlier by any definition.  How does starting a linear regression near 1998 affect the trend?  Quite a bit.  First, as the start point gets closer to 1998, the beginning of the trend line gets "pulled up" toward the 1998 outlier, thereby decreasing the apparent slope of the line.  Second, you have fewer and fewer data points left in the analysis to compensate for that upward pull, making the 1998 outlier more influential on the trend as the dataset gets smaller.  Third, your analysis loses power as the number of data points decreases and becomes less likely to show that a trend is statistically significant, even when trends actually exist.  You can easily see the effect of starting closer and closer to the 1998 outlier in the graph below:


If I plot the calculated trends versus the year each trend starts, I get the following:

Trends versus start year.  Error bars are the 95% confidence intervals.
First, the calculated trend gets lower as one gets closer to 1998 (closer to the outlier) and the confidence intervals around the trend get larger as less data is used in each calculated trend.  Second, all the confidence intervals overlap—there is no point as yet where one can safely say that the trend has actually changed.  Third—and somewhat surprising—even the warming trend since 1998 is statistically significant.  Guess deniers will have to find another start point for their "no warming since..." claims.
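For those who want to build a graph like this themselves, here's a minimal sketch (using the annual coverage-corrected HadCRUT4 data frame CW with Year and Temp columns, as in the post above; the range of start years is just illustrative):

#Sketch: linear trend and 95% confidence interval for a range of start years
#Uses the annual data frame CW (Year, Temp); end year fixed at 2013
starts <- 1975:2005
trends <- t(sapply(starts, function(y) {
        fit <- lm(Temp ~ Year, data = CW, subset = Year >= y & Year <= 2013)
        c(coef(fit)["Year"], confint(fit)["Year", ])
}))
trends <- data.frame(Start = starts, Slope = trends[, 1], Lower = trends[, 2], Upper = trends[, 3])

#Plot the slopes with error bars for the 95% confidence intervals
plot(Slope ~ Start, data = trends, pch = 19, ylim = range(trends$Lower, trends$Upper),
     xlab = "Start year", ylab = "Trend (°C/year)")
arrows(trends$Start, trends$Lower, trends$Start, trends$Upper, angle = 90, code = 3, length = 0.03)
abline(h = 0, lty = 2)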

In short, all a claim of "no warming" shows is that the person making it either a) doesn't understand how global warming works and/or b) doesn't understand statistics.  For many such individuals, I'd guess that the real reason is both a and b.

Tuesday, September 16, 2014

WUWT and how NOT to test the relationship between CO2 and temperature

WUWT published a piece by Danle Wolfe which purports to measure the correlation between CO2 and global temperature.  As you can probably predict, Wolfe's conclusion is that there is no relationship.
"Focusing on the most recent hiatus below, both visually and in a 1st order linear regression analysis there clearly is effectively zero correlation between CO2 levels and global mean temperature."
 Unfortunately for Wolfe, all he's produced is a fine example of mathturbation as well as an example of forming a conclusion first then warping the evidence to fit.

What Wolfe did was cross-correlate GISS land temperature data and Mauna Loa CO2 records, with two vertical lines dividing the plot into three sections.  The first section is marked "~18 years", the middle is marked "~21 years", and the last section is marked "~17 years".

Figure 1.  Danle Wolfe's plot from WUWT
Why land temperatures rather than land + ocean temperatures?  We don't know, as he failed to justify his choice.  There's one other curiosity about his plot.  We know his plot starts in 1959, as he gave that information.  That would make the first section run from 1959 to 1977, the middle section from 1977 to 1998, and the last from 1997 to 2014, which means there's an overlap of 1 year between his middle and last sections.  The problem?  The maximum CO2 level in 1997 (367 ppmv) does not match the vertical line on his graph.


Figure 2.  Temperatures vs CO2 with loess trend line.
His second line looks to be around 372, a level first crossed in 2001, not 1997.  That makes his last section at most 13 years long rather than the 17 years he claimed.  Furthermore, a loess regression reveals that his lines do not divide the graph into "no correlation" and "correlation" sections as he implied.  His "no correlation" sections are nowhere near as long as he claimed them to be.
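Checking when a given CO2 level was first crossed takes only a line of R.  Here's a sketch, assuming the monthly Mauna Loa record is already loaded as a data frame co2.ml with numeric Year and CO2 (ppmv) columns (both placeholder names):

#Sketch: first year in which the monthly Mauna Loa CO2 record reaches a given level
#co2.ml, Year, and CO2 are placeholder names for however you load the data
first.crossing <- function(level, data = co2.ml) min(data$Year[data$CO2 >= level])
first.crossing(372)  #should return 2001, not 1997, per the record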

The next deception in his graph?  He failed to remove the annual cycle from both the temperature record and the CO2 record before cross-correlating them.

Figure 3.  Seasonal cycles in both CO2 records and GISS temperatures.
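For anyone curious how a decomposition like the one in Figure 3 is produced, here's a sketch (assuming the monthly GISS and Mauna Loa CO2 values are available as plain numeric vectors; the object names and start date below are placeholders):

#Sketch: pull out the 12-month seasonal components with stl()
#Assumes numeric vectors giss.monthly and co2.monthly of monthly values
#starting January 1959 (placeholder names and start date)
giss.ts <- ts(giss.monthly, start = c(1959, 1), frequency = 12)
co2.ts  <- ts(co2.monthly,  start = c(1959, 1), frequency = 12)

giss.stl <- stl(giss.ts, s.window = "periodic")
co2.stl  <- stl(co2.ts,  s.window = "periodic")

#Plot two years of each seasonal component to show they are out of phase
par(mfrow = c(2, 1))
plot(giss.stl$time.series[1:24, "seasonal"], type = "l", xlab = "Month", ylab = "GISS seasonal")
plot(co2.stl$time.series[1:24, "seasonal"], type = "l", xlab = "Month", ylab = "CO2 seasonal")
par(mfrow = c(1, 1))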
Time series decomposition shows that both GISS temperatures and CO2 records have 12-month cycles—and also shows that the cycles are out-of-phase.  This makes sense as the CO2 annual cycle is tied in with the Northern Hemisphere growing season and therefore only indirectly tied to global average temperatures.  Accordingly, the cycles must be removed to get the true relationship.  Just compare the cross-correlation graph without removing the annual cycles with one with the annual cycles removed via a 12-month moving average:

Figure 4.  Scatterplots of CO2 versus temperatures, both with and without seasonal cycles removed.
Just removing the annual cycles via a 12-month moving average removed much of the noise Wolfe depended upon to make it look like there was no correlation.  Even when he tried a moving average to remove the cycle, he failed.  Simply put, a 10-month moving average does not eliminate a 12-month cycle.  You can see that in his graph, especially the CO2 line.  If you want to remove an annual cycle, you must use a 12-month moving average, not a 10-month moving average.

Figure 5.  Ten- versus twelve-month moving averages.  Note that the seasonal cycle is still apparent in the 10-month moving average whereas it is fully removed in the 12-month moving average.
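A sketch of that comparison, using the same assumed co2.ts series from the decomposition sketch above:

#Sketch: centered 10-month versus 12-month moving averages of the CO2 series
#A 12-month window spans exactly one seasonal cycle, so the cycle averages out;
#a 10-month window does not
ma10 <- stats::filter(co2.ts, rep(1/10, 10), sides = 2)
ma12 <- stats::filter(co2.ts, rep(1/12, 12), sides = 2)

plot(co2.ts, col = "grey", ylab = "CO2 (ppmv)")
lines(ma10, col = "red")   #seasonal wiggle still visible
lines(ma12, col = "blue")  #seasonal cycle removed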
Last, Wolfe failed to account for ENSO, aerosols, solar output, or any of the other non-CO2-related influences on global temperature.  His viewpoint that CO2 must be the only thing that influences global temperature is dead wrong.  There have been several studies over the past decade quantifying and then removing non-CO2 influences on global temperatures via multiple regression (e.g. Lean and Rind 2008, Foster and Rahmstorf 2011, Rahmstorf et al. 2012).  Yet it appears that Wolfe is either ignorant of that work or deliberately ignoring it.

What difference does factoring out the seasonal cycle and non-CO2 influences like El Niño/Southern Oscillation, sulfur aerosols, and solar output make on the correlation between CO2 and global temperatures?  Quite a bit.
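I won't reproduce the full Foster and Rahmstorf (2011) procedure here, but the skeleton of such an adjustment is just a multiple regression.  Here's a sketch, where the predictor columns (an ENSO index, volcanic aerosol optical depth, and total solar irradiance, suitably lagged) are placeholders for whichever versions you choose:

#Sketch only, not a reproduction of Foster and Rahmstorf (2011)
#Assumes the 'monthly' data frame also holds placeholder columns
#MEI (ENSO index), AOD (volcanic aerosols), and TSI (solar output)
adj.fit <- lm(GISS ~ MEI + AOD + TSI, data = monthly)
monthly$GISS.adj <- residuals(adj.fit) + coef(adj.fit)["(Intercept)"]

#Adjusted temperatures against CO2, with a loess line to highlight the trend
plot(GISS.adj ~ CO2, data = monthly, pch = 20,
     xlab = "CO2 (ppmv)", ylab = "Adjusted temperature anomaly (°C)")
lines(loess.smooth(monthly$CO2, monthly$GISS.adj), col = "red", lwd = 2)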

Figure 6.  Adjusted GISS temperatures versus CO2 with annual cycles removed.
I added a loess regression line to highlight the trend.  Compare that to Wolfe's graph in figure 1 and my graph in figure 2.  Note the differences?  Once seasonal cycles and non-CO2-climate factors are removed, the correlation between global temperatures and CO2 is clear.

And just for Wolfe: Beyond fudging your second vertical line and  "forgetting" to account for seasonal cycles and climate influences like ENSO, solar output, and sulfur aerosols, you also forgot to account for autocorrelation when you did your regression since 1999.  Hint: There's a world of difference between a white noise model and an ARMA(2,1) model, especially after you take out the seasonal cycle, ENSO, aerosols, and changes in the solar cycle.  In "statistician speak," you only got the results you did because of your sloppy, invalid "analysis."
Figure 7.  Adjusted GISS vs CO2 since 1999 after seasonal cycles are removed
Generalized least squares fit by REML
  Model: GISS ~ CO2
  Data: monthly
  Subset: Time >= 1999
        AIC       BIC   logLik
  -1311.377 -1292.253 661.6886

Correlation Structure: ARMA(2,1)
 Formula: ~1
 Parameter estimate(s):
      Phi1       Phi2     Theta1 
 1.3709034 -0.4152224  0.9999937 

Coefficients:
                 Value Std.Error   t-value p-value
(Intercept) -1.9335047 0.6987245 -2.767192  0.0062
CO2          0.0058194 0.0018289  3.181950  0.0017

 Correlation: 
    (Intr)
CO2 -1    

Standardized residuals:
        Min          Q1         Med          Q3         Max 
-1.54739247 -0.69113533 -0.06509602  0.78913695  1.69592575 

Residual standard error: 0.05121744
Degrees of freedom: 181 total; 179 residual
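For completeness, a fit like the one above can be produced with the nlme package.  A minimal sketch, again assuming the 'monthly' data frame with GISS, CO2, and decimal-year Time columns:

#Sketch: generalized least squares with ARMA(2,1) errors, as in the output above
library(nlme)
gls.99 <- gls(GISS ~ CO2, data = monthly, subset = Time >= 1999,
              correlation = corARMA(p = 2, q = 1))
summary(gls.99)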

Thursday, September 11, 2014

James Taylor versus relative humidity and specific humidity

It appears that relative humidity and specific humidity continue to trip some people up.  Yes, I'm thinking of the screed James Taylor wrote on Forbes.com on Aug. 20.  In his article, Taylor trumpets two "facts": first, that relative humidity has declined, and second, that specific humidity isn't rising as fast as global climate models predict.  Since climate models assume that relative humidity has stayed constant, Taylor then claims that models are overestimating global warming.  Unfortunately for Taylor, his "facts" don't check out.

For Taylor's relative humidity finding, he linked to a Friends of Science (FoS) graph.  Friends of Science is an oil-funded astroturf group run by Ken Gregory and is filled with various science denier myths (e.g. no warming since 1998), backed with an artfully drawn graph that shows a completely flat trend line since 1996.  (Edit: I found out that they used RSS data, which was panned by none other than Roy Spencer back in 2011 for showing false cooling due to increased orbital decay.  The fact that FoS tries to pass RSS off as reliable three years later says much about their brand of "science.")

Back to Gregory's relative humidity graph as Taylor cited it.  As Gregory states in his 2013 report, the data comes from the NCEP Reanalysis I, which is the only such dataset that starts in 1948.  That is important, as one of the main limitations of the NCEP Reanalysis I is that the radiosonde data it is based upon was not homogenized before 1973 (Elliot and Gaffen 1991).  Homogenization removes non-climatic factors, such as changes in location and instrumentation, from the data set.  Without homogenization, you get false trends in the data.  In the case of relative and specific humidity data, changes in radiosonde instruments would change the measured humidity.  This effectively means that any comparison between data before 1973 and data after 1973 is invalid.  Yet that is precisely what Gregory did in his graph.  Even better?  Gregory omitted the 1000 mb (near the Earth's surface) pressure level, showing only the mid-troposphere (700 mb) on up to the top of the troposphere (300 mb).  Most of the atmosphere's moisture is found in the lower troposphere, not the upper.  Let's correct Gregory's errors.


Those trends Gregory found are far less impressive when using just the homogenized data.  Is it any wonder he wanted to use data containing false trends?  Relative humidity has noticeably decreased at the top of the troposphere (300 mb), which is where the atmosphere is cooling.  Other than that, relative humidity has stayed pretty much constant since 1973.  That decline visible at the 850 mb level?  It's on the order of -0.88% per decade, which is hardly worth noting.  There's also evidence that NCEP Reanalysis I underestimates the humidity of the troposphere (Dessler and Davis 2010), resulting in false negative trends.  In short, the assumption that relative humidity will remain constant is a good one, no matter how badly Taylor and Gregory wish it wasn't.

Taylor's second "factual" statement claiming that specific (absolute) humidity hasn't risen rapidly enough is based on his demonstrably false statement about relative humidity and is backed by the same questionable 2013 report by Ken Gregory.  What is questionable about Gregory's report?  Beyond the highly questionable relative humidity graph, the NVAP data Gregory used is somewhat doubtful.  NASA, which gathered the data, has this to say about the NVAP data:

Key Strengths:

  • Direct observations of water vapor

Key Limitations:

  • Changes in algorithms result in time series (almost) unuseable for climate studies

That statement about limitations is found right on the main page of the NVAP website.  You won't find any mention of that key limitation anywhere in Gregory's report.  I highly doubt Taylor bothered to check that such a limitation exists.  Yet its existence eviscerates the entire premise behind Gregory's report.

Gregory's report is also questionable in light of the peer-reviewed literature.  Dessler and Davis (2010) found that reanalysis programs that included satellite data as well as radiosonde data generally found that specific humidity had increased in the upper troposphere.

Figure 3 from Dessler and Davis (2010), demonstrating how four modern reanalyses and the AIRS satellite measurements show that specific humidity in the upper troposphere has increased, in contrast to NCEP Reanalysis I (solid black line) and in contrast to Gregory's claim otherwise. 
Chung et al. (2014) combined radiosonde and satellite measurements to show that specific humidity in the upper troposphere had increased since 1979 (the start of the satellite record) and that the increase could not be explained by natural variation.

Bottom line?  Gregory's opening sentence, which states his conclusion ("An analysis of NASA satellite data shows that water vapor, the most important greenhouse gas, has declined in the upper atmosphere causing a cooling effect that is 16 times greater than the warming effect from man-made greenhouse gas emissions during the period 1990 to 2001."), is utterly wrong.  Even if there had been a drop, it wouldn't have had anywhere near the effect Gregory claimed.  Solomon et al. (2010) showed that a drop in stratospheric water vapor, which is even higher in the atmosphere than the top of the troposphere, slowed the rate of temperature rise by only 25% compared to that expected from greenhouse gases alone.  The size of the effect is nowhere near the "16 times greater" that Gregory claimed, and certainly not enough to stop global warming.

In short, Taylor based his main premise on a single non-peer-reviewed report from an oil-supported astroturf group, a report that is factually false.  The remainder of his article is just a repetition of science denier canards, many of which I've covered in this blog before, and most of which merely demonstrate that Taylor has an abysmal grasp of climate science.

Monday, September 8, 2014

R code for my Seasonal Trends graph

I had a request for the code I used to generate the graphs in my Seasonal Trends post.


While it looks complex, the R code for it is very simple IF you have the data ready.  I'm assuming that you already have the temperature dataset you want as an R object (I have several datasets in an object I simply call "monthly": GISS, Berkeley Earth, Cowtan-Way, HadCRUT4, UAH, and RSS, along with a decimal-date Time column plus Year and a numeric Month).  The code I used to generate the graph is as follows:
#Create separate datasets for each season

monthly.79<-subset(monthly, Time>=1978.92 & Time<2013.91)
DJF<-subset(monthly.79, Month=="12" | Month =="1" | Month=="2")
#Shift December into the following year so each DJF season groups the correct months
DJF$Year_2 <- ifelse(DJF$Month == 12, DJF$Year + 1, DJF$Year)
MAM<-subset(monthly.79, Month=="3" | Month =="4" | Month=="5")
JJA<-subset(monthly.79, Month=="6" | Month =="7" | Month=="8")
SON<-subset(monthly.79, Month=="9" | Month=="10" | Month=="11")

#Calculate the seasonal average for each year

DJF<-aggregate(DJF$BEST, by=list(DJF$Year_2), FUN=mean)
MAM<-aggregate(MAM$BEST, by=list(MAM$Year), FUN=mean)
JJA<-aggregate(JJA$BEST, by=list(JJA$Year), FUN=mean)
SON<-aggregate(SON$BEST, by=list(SON$Year), FUN=mean)

#Check for autoregression

library(forecast) #for the auto.arima function

auto.arima(resid(lm(x~Group.1, data=DJF)), ic=c("bic"))

auto.arima(resid(lm(x~Group.1, data=MAM)), ic=c("bic"))

auto.arima(resid(lm(x~Group.1, data=JJA)), ic=c("bic"))

auto.arima(resid(lm(x~Group.1, data=SON)), ic=c("bic"))

 #Construct the plot

plot(x~Group.1, data=DJF, type="l", col="blue", xlab="Year", ylab="Temperature anomaly (°C)", main="Seasonal Climate Trends", ylim=c(-0.1, 0.8))
points(x~Group.1, data=MAM, type="l", col="green")
points(x~Group.1, data=JJA, type="l", col="red")
points(x~Group.1, data=SON, type="l", col="orange")

#Add the trend lines

abline(lm(x~Group.1, data=DJF), col="blue", lwd=2)
abline(lm(x~Group.1, data=MAM), col="green", lwd=2)
abline(lm(x~Group.1, data=JJA), col="red", lwd=2)
abline(lm(x~Group.1, data=SON), col="orange", lwd=2)

legend(1979, 0.8, c("DJF", "MAM", "JJA", "SON"), col=c("blue", "green", "red", "orange"), lwd=2)
#Get the slopes

summary(lm(x~Group.1, data=DJF))
summary(lm(x~Group.1, data=MAM))
summary(lm(x~Group.1, data=JJA))
summary(lm(x~Group.1, data=SON))
That's all there was to it.  I just repeated this code, modifying only the period of the initial subset (Monthly.2002<-subset(monthly, Time>=2001.92 & Time<2013.91)) and the related seasonal subsets to create the second graph.  To the person who requested my code: Hope this helps.

Monday, September 1, 2014

One hundred years ago today...

...the last passenger pigeon, a female named Martha, died in the Cincinnati Zoo.  A species that once had an estimated population of 3 billion birds was destroyed in roughly 50 years by a combination of habitat loss and overhunting.  The story of that extinction is being told in numerous articles on this centenary (e.g. in Nature, Audubon Magazine, and National Geographic) and at museums like the Smithsonian Institution, which tell the story far better than I could here.  The Audubon Magazine article, in particular, is well worth reading as it details the history of the extinction.

So, could such an extinction happen today?  The sad answer is yes; not only could it happen, it is happening today.  The same market incentive to take every last remaining individual that doomed the passenger pigeon is now at work on the bluefin tuna, whose Pacific population has collapsed to just 4% of what it was in the 1950s.  Protection of the little that remains has been held up by fishing interests, mainly in Japan, Mexico, the US, and South Korea.

The illegal ivory trade has claimed an estimated 100,000 African elephants since 2010, a figure some experts consider a gross underestimate.  The last large survey of elephant populations, in 2007, placed the total remaining in Africa at between 472,000 and 690,000 elephants.  Jones and Nowak wrote that they suspected that the current population was around half that number.  At current rates, many areas of Africa will lose their elephants.

The Western Black Rhino, a subspecies of black rhinoceros, was recently driven to extinction by habitat loss and overhunting, both for sport and to meet demand for powdered rhino horn in traditional Chinese medicine.

In the US, measures to protect the lesser prairie chicken were held up for years due to opposition from various economic interests, despite the species having lost 86% of its habitat.  What finally broke the logjam was a population crash that cut the population roughly in half in a single year (2012-2013).  The population now hovers around 17,000 birds.

These are just a few of the many examples I could cite from around the globe.  Habitat loss and overhunting still play a major role in extinctions today, just as they did 100 years ago.  And just like 100 years ago, market forces still overwhelmingly favor those who try to kill every single last individual.  Now we get to add climate change to the mix, which is predicted to have major impacts on extinction rates in coming decades.  You would hope that by now, we would have learned our lesson from the passenger pigeon.  Unfortunately, we as a society appear to be trying to prove that George Santayana was correct when he said "Those who cannot remember the past are condemned to repeat it."

Sunday, August 31, 2014

The Daily Fail: David Rose's newest cherry-pick.

David Rose, who is no stranger to cherry-picking climate data and then weaving artful tales based on those cherry-picks, is back with yet another example of his perversity.  This time, he's trumpeting a 2-year increase in Arctic sea ice as measured on a single day: August 25, 2012 vs. August 25, 2014, claiming a 43% increase based on those two very specific days.  This is misleading for multiple reasons, one of which he himself admits in small type under that large flashy graphic at the top of his article:
"These reveal that – while the long-term trend still shows a decline – last Monday, August 25, the area of the Arctic Ocean with at least 15 per cent ice cover was 5.62 million square kilometres." (emphasis added).
So, just what does that long-term trend show? This:


To generate this graph, I simply downloaded the NSIDC Arctic sea ice data and plotted August 25 sea ice extent for each year.  When data for August 25 was missing, I averaged the data for August 24 and August 26 to interpolate the extent on August 25.  I've added a loess smooth and 95% confidence intervals to highlight the trend.
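For anyone who wants to repeat this, the gist of the procedure looks like the sketch below (assuming the NSIDC daily extent file has already been read into a data frame nsidc with Year, Month, Day, and Extent columns; those names are placeholders for however you import it, and the confidence band is omitted for brevity):

#Sketch: August 25 extent by year, with missing days filled by the mean of
#August 24 and 26, plus a loess smooth. Column names are placeholders.
aug <- subset(nsidc, Month == 8 & Day %in% c(24, 25, 26))
years <- sort(unique(aug$Year))
aug25 <- sapply(years, function(y) {
        d <- subset(aug, Year == y)
        if (any(d$Day == 25)) d$Extent[d$Day == 25] else mean(d$Extent[d$Day != 25])
})
aug25 <- data.frame(Year = years, Extent = aug25)

plot(Extent ~ Year, data = aug25, pch = 19,
     ylab = "Sea ice extent (million sq. km)", main = "Arctic sea ice extent on August 25")
lines(loess.smooth(aug25$Year, aug25$Extent), col = "red", lwd = 2)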

Is it any wonder why Rose wants to focus on the increase since 2012 while giving little more than lip service to the long-term trend?  The long-term data shows that August 25, 2014 had the 8th lowest extent on that date of the satellite record.  The long-term trend?  Still firmly negative.  So why the focus on the increase since 2012?  Simple.  It's the part of the satellite record that he could spin to fit his narrative.  It wouldn't fit his "all is well" narrative if he admitted that the Aug. 25, 2014 extent represented a loss of 1.9 million square kilometers from the same day in 1979.  It's the equivalent of a gambler claiming an increase of 43% when he wins back $17 while ignoring the fact that he lost a total of $36 before his win.

He also claims that the ice is thicker and implies that ice volume has recovered but didn't show the data on that, merely relying on the colors in the satellite image while utterly ignoring the trend in ice volume.  Examining the data shows why.


The take-home message from the volume data?  That the ice has "recovered" enough to get back up to the trend.  The last few years, volume had been declining faster than the overall trend.  Now?  It's right at the trend.  Not quite the picture of a "recovery" that Rose attempts to paint.

I'll leave the final word to Dr. Ed Hawkins, who Rose quoted near the end of his article.
"Dr Hawkins warned against reading too much into ice increase over the past two years on the grounds that 2012 was an ‘extreme low’, triggered by freak weather.

‘I’m uncomfortable with the idea of people saying the ice has bounced back,’ he said."
That is a hilarious quote for Rose to include, as he spent his entire article trying to do just what Hawkins warned against.