Homogenizing the World

A few weeks ago I put up a post on how the homogeneity adjustments applied by GISS to raw surface temperature records increase warming at the hemispheric and global scale. In this post I extend the review to include the homogeneity adjustments applied by NCDC, CRU and BEST and re-evaluate GISS using a series which is more relevant than the “meteorological station only” series I used last time. Here is a summary of results:

  • BEST applies homogeneity adjustments to the raw Northern Hemisphere land surface air temperature records that add 0.3-0.4C of warming since 1890. The adjustments applied by NCDC, CRU and GISS seem to be limited to a few countries such as Iceland and the USA and add no significant warming.
  • All series apply homogeneity adjustments to the Southern Hemisphere raw records before about 1970 and each series gets different results, adding anywhere from 0.3 to 0.6C of warming. There is little doubt that this warming is introduced by the adjustments. The scatter between the series further demonstrates that in practice homogeneity adjustments do not homogenize the raw data; they simply introduce additional distortions.
  • However, the impacts of Southern Hemisphere homogeneity adjustments on global warming estimates, which are commonly quantified from “surface temperature” series such as HadCRUT4, are diluted by a factor of about ten because HadCRUT4 is an area-weighted average of air temperatures over land and SSTs over the oceans, and land in the Southern Hemisphere covers only about 10% of the Earth’s surface. Because of this dilution I estimate that the homogeneity adjustments applied to CRUTEM4, HadCRUT4’s component “land” series, add only ~0.03C of global warming since 1890. (BEST adds the most, at ~0.1C)
  • Of the series reviewed BEST appears to be the least representative of actual land surface air temperature trends.

Five published surface air temperature series are considered here. All of them have been subjected to some degree of homogeneity adjustment:

  1. The NOAA/NCDC land series, which uses several thousand records and which covers land areas only.
  2. The UEA/CRU CRUTEM4 land series, which also uses several thousand records and covers land areas only.
  3. The Berkeley Earth BEST land series, which covers land areas only but reportedly uses some tens of thousands of records.
  4. The GISS 250 series, which projects data up to 250km out over the oceans and which also uses several thousand records. GISS does not publish an official “land” series but I made GISS250 into one by applying a land mask.
  5. The GISS met (meteorological station only) series, which uses the same set of records as GISS250 but is not a land series because it projects temperatures well out over the oceans. I include this series because it was the one used in the previous post.

These series are compared with two hemispheric and global series I constructed from scratch some years ago using unadjusted GHCN v2 records, which reviews show are very similar to the current GHCN v3.2 unadjusted records. The series are:

The Verified series runs from 1890 to 2010. It uses 801 unadjusted GHCN v2 records (460 in the Northern Hemisphere) that I verified as correct to within acceptable limits by matching them against adjacent records. This series is not strictly a “land” series because it projects temperatures over the oceans, although except for a few short records from moored weather ships all the data are from land stations.

The RA series runs from 1880 to 2005. It adds 71 records of questionable quality to fill in gaps in the Verified series and estimates annual means by a first-difference approach that is effectively the equivalent of taking a mathematical average of all the records. This should make it a reasonably close approximation to a “land” series.
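For readers unfamiliar with first differences, here is a minimal sketch of the general idea, not the actual code behind the RA series. It assumes each record is a mapping of year to annual mean temperature; the function name and the sample stations are hypothetical.

```python
from statistics import mean

def first_difference_series(records):
    """Combine station records (dicts of year -> annual mean, deg C)
    by averaging year-to-year changes and accumulating them."""
    years = sorted({year for rec in records for year in rec})
    level = 0.0                      # arbitrary starting level; only changes matter
    combined = {years[0]: level}
    for prev, curr in zip(years, years[1:]):
        # Change at every station that reports in both years
        diffs = [rec[curr] - rec[prev]
                 for rec in records if prev in rec and curr in rec]
        if diffs:
            level += mean(diffs)
        # If no station overlaps, the level is carried forward (a simplification)
        combined[curr] = level
    return combined

# Hypothetical stations with very different absolute temperatures
station_a = {2000: 10.0, 2001: 10.5, 2002: 11.0}
station_b = {2001: 20.0, 2002: 20.1}
print(first_difference_series([station_a, station_b]))
```

Because only changes are averaged, a station that starts late (station_b here) does not shift the combined series just because it sits in a warmer or colder place.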

Figure 1 compares these two series in the Northern Hemisphere. The series are plotted as anomalies relative to 1990-2005 means, as are all the series presented here unless otherwise specified. Using 1990-2005 as the baseline period causes differences between the series to appear in the early rather than the later years:

Figure 1: Unadjusted RA and Verified Series, Northern Hemisphere

The two series are effectively the same. Changing the estimation approach and adding 71 dubious records makes no significant difference.
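The baseline convention used in these plots can be sketched as follows. This is a minimal illustration assuming a series stored as a year-to-temperature mapping; the function name and the sample values are made up for the example.

```python
def to_anomalies(series, base_start=1990, base_end=2005):
    """Express a series (dict of year -> deg C) as anomalies relative
    to its own mean over the baseline period."""
    base = [v for year, v in series.items() if base_start <= year <= base_end]
    if not base:
        raise ValueError("no data in the baseline period")
    baseline = sum(base) / len(base)
    return {year: v - baseline for year, v in series.items()}

# Two made-up series that agree recently but differ in the early years
series_1 = {1900: 9.0, 1990: 10.0, 2005: 10.4}
series_2 = {1900: 8.6, 1990: 10.0, 2005: 10.4}
print(to_anomalies(series_1))
print(to_anomalies(series_2))
```

Both series are pinned to zero over 1990-2005, so any disagreement between them shows up at the early end of the plot.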

The question now arises, are these two series reliable? Or not to put too fine a point on it, do I know what I’m doing? Figure 2 superimposes four of the five published homogeneity-adjusted NH land surface air temperature series on the RA series. The Verified series is omitted to improve clarity and the BEST series is shown later:

Figure 2: Unadjusted RA vs. NCDC, CRU, GISS 250 and GISS Met homogeneity-adjusted land surface air temperature series, Northern Hemisphere

The four adjusted series show substantially the same trends and substantially the same amount of overall warming as the unadjusted RA series, indicating:

a) that the adjustments these series apply to the raw records do not add any appreciable amount of warming in the North Hemisphere.

b) that the number of records used, which ranges from a few hundred to several thousand, makes no significant difference.

and hopefully confirming:

c) that I know how to construct temperature time series.

Figure 3 now plots BEST against the average of the other published series. BEST is the odd man out, showing 0.3-0.4C more warming in the Northern Hemisphere since 1890:

Figure 3: BEST versus average of NCDC, CRU, GISS 250 and GISS Met series

Now on to the Southern Hemisphere. Figure 4 compares the Verified and RA series. Verified uses 341 unadjusted records and RA adds another 27, including some that show urban warming signatures, such as Buenos Aires, Rio, Sao Paulo, Sydney and Melbourne. Verified also area-weights the results while RA gives disproportionate weighting to Australia, where most of the long-term SH records are. Yet there is still a reasonably good match between the two series after 1900, with neither showing any appreciable warming between 1900 and 1975. (The mismatches before 1900 are a result of a shortage of good long-term SH records. Results depend on which records you accept and which you reject.)

Figure 4: Unadjusted RA and Verified Series, Southern Hemisphere

Figure 5 now superimposes the RA series on the five published and homogeneity-adjusted SH land series (Verified is again omitted but BEST is included):

Figure 5: Unadjusted RA vs. NCDC, CRU, GISS 250, GISS Met and BEST homogeneity-adjusted land surface air temperature series, Southern Hemisphere

We see that the best efforts of NOAA, CRU, GISS and BEST to homogenize the Southern Hemisphere raw records have created a hodgepodge of conflicting series that frustrates all attempts to make a robust estimate of Southern Hemisphere warming before about 1970. Clearly the homogenization algorithms have not homogenized. They’ve added noise instead.

However, one thing the homogenization algorithms do agree on is that the RA series is cooling-biased (they all add warming). But how do they arrive at this conclusion? It’s hard to find supporting evidence. RA matches the published Northern Hemisphere series since 1880 (Figure 2) and the published Southern Hemisphere series after about 1970 (Figure 5), so why should it be cooling-biased in the Southern Hemisphere before 1970, considering that it was constructed using the same procedures? Some of the 368 records RA uses over this period, such as Cape Town, Antananarivo and Adelaide Airport, may indeed be cooling-biased, but some are warming-biased, such as Buenos Aires, Sao Paulo and Low Head, and deleting suspect records of this type makes no significant difference. And in specific examples discussed in previous posts, such as Alice Springs and the Paraguayan records, the GISS and NCDC homogenization algorithms have unquestionably added non-existent warming. The conclusion has to be that the warming added in the Southern Hemisphere is an artifact of warming-biased homogenization algorithms, although exactly how they do this remains unclear.

Now on to the global series. Figure 6 compares the RA and Verified global series. Again the match is good. (To approximate a land series, RA is calculated as Northern Hemisphere times 2/3 + Southern Hemisphere times 1/3 to allow for the fact that there is approximately twice as much land area in the Northern Hemisphere:)

Figure 6: Unadjusted RA and Verified Series, Global
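The 2/3 and 1/3 hemispheric weighting used for the global RA series is simple enough to write out. The sketch below assumes both hemispheric series are year-to-anomaly mappings; the function name and sample values are illustrative only.

```python
def global_land_series(nh, sh, nh_weight=2 / 3):
    """Combine hemispheric anomaly series (dicts of year -> deg C)
    using land-area weights: NH * 2/3 + SH * 1/3 by default."""
    sh_weight = 1 - nh_weight
    common_years = sorted(set(nh) & set(sh))   # only years present in both
    return {year: nh_weight * nh[year] + sh_weight * sh[year]
            for year in common_years}

# Made-up anomalies for a single year
nh = {1900: -0.9}
sh = {1900: -0.3}
print(global_land_series(nh, sh))
```

With these weights a change in the Southern Hemisphere series moves the global series by only a third as much, which is why SH adjustments are muted in the global plots.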

Figure 7 now compares the global RA series against the published global series. The larger NH landmass mutes the impacts of the SH homogeneity adjustments when the hemispheres are combined:

Figure 7: Unadjusted RA vs. NCDC, CRU, GISS 250 and GISS Met homogeneity-adjusted land surface air temperature series, Global

We will now assume that the RA global series is correct and proceed to make some quantitative estimates of how much “global warming” homogeneity adjustments have added. The table below lists how much surface air temperature warming the different series show over different periods, calculated as the difference between ten-year means (trend lines can be misleading when the trend is not linear, as is the case here). All series show similar amounts of global warming between 1970 and 2000 (calculated as the 1996-2005 mean minus the 1966-1975 mean) but not between 1895 and 1970 (calculated as the 1966-1975 mean minus the 1890-1899 mean):
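The ten-year-mean differences described above can be computed as in the following sketch. It assumes a series stored as a year-to-anomaly mapping; the function names are hypothetical and the input is a synthetic straight line, not any of the series discussed here.

```python
def period_mean(series, start, end):
    """Mean of a series (dict of year -> deg C) over start..end inclusive."""
    values = [series[y] for y in range(start, end + 1) if y in series]
    if not values:
        raise ValueError(f"no data between {start} and {end}")
    return sum(values) / len(values)

def warming_between(series, early, late):
    """Warming as the late-period mean minus the early-period mean."""
    return period_mean(series, *late) - period_mean(series, *early)

# Synthetic series warming at a steady 0.01 C per year from 1890
synthetic = {year: 0.01 * (year - 1890) for year in range(1890, 2006)}

print(warming_between(synthetic, (1890, 1899), (1966, 1975)))  # "1895 to 1970"
print(warming_between(synthetic, (1966, 1975), (1996, 2005)))  # "1970 to 2000"
```

Differencing period means avoids fitting a single trend line through a series whose rate of warming changes over time.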

The “excess warming” column at right shows how much global warming the adjusted series have added since 1895 over and above the 0.88C shown by RA. Since this excess warming applies only to land areas, however, only 30% of it contributes to the total warming shown by the land and ocean series that are commonly used to measure global warming, and subtracting this 30% from these series makes little difference. Quantitative estimates are:

  • NCDC Merged Land-Ocean shows 0.08C less overall global warming since 1880
  • HadCRUT4 shows 0.03C less overall global warming since 1880
  • The GISS Land-Ocean Temperature Index (LOTI) shows 0.04C less overall global warming since 1880
  • BEST Land + Ocean shows 0.10C less overall global warming since 1880.
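The dilution arithmetic is just a multiplication by the relevant land fraction. The sketch below uses the fractions quoted in this post (30% of the Earth's surface is land, about 10% is Southern Hemisphere land); the function name and the land-only excess values fed into it are illustrative inputs, not figures taken from the table.

```python
LAND_FRACTION = 0.30      # share of the Earth's surface that is land (from the post)
SH_LAND_FRACTION = 0.10   # share that is Southern Hemisphere land (from the post)

def diluted_excess(land_excess_c, area_fraction=LAND_FRACTION):
    """Excess warming in a land-only series, scaled by the fraction of the
    Earth's surface it applies to in a combined land + ocean series."""
    return land_excess_c * area_fraction

# An illustrative 0.10 C of excess global land warming
print(round(diluted_excess(0.10), 3))

# An illustrative 0.3 C of excess warming confined to SH land
print(round(diluted_excess(0.3, SH_LAND_FRACTION), 3))
```

Either way, a few tenths of a degree of excess warming over land shrinks to a few hundredths of a degree in a land-and-ocean series.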

So there you have it. Homogeneity adjustments applied to surface air temperature records add only 0.03C of warming to HadCRUT4, the series the IPCC uses and the one most commonly used to quantify global warming. So we have all been waving our arms about nothing, right?

Well, not exactly.

First comes the question of why NCDC, GISS, CRU and BEST continue to claim in the face of a mounting body of evidence that their homogeneity adjustments are valid, but I don’t intend to go into that here.

Second is the question of how to measure global warming. At the risk of repeating myself yet again, HadCRUT4 and the other series that combine air temperatures over land with SSTs in the oceans do not give meaningful estimates (I use HadCRUT4 as an example in this post only to conform with common usage). Surface air temperatures and SSTs exhibit different trends and must be considered separately, which makes the lack of surface air warming in the SH before 1970 a matter of importance because climate models can’t replicate it.

Third is the fact that the emphasis on surface air temperature adjustments has diverted attention from the main problem – the “bias” adjustments applied to the SST data, which are at least as suspect as the air temperature homogeneity adjustments and have a much larger impact on our perceptions of “global warming”, but that is a separate issue.

Finally a word on BEST. The results shown above indicate that BEST is overall the farthest from reality of the four adjusted surface air temperature series considered. I don’t know exactly why this should be but suspect it has something to do with the fact that in addition to homogenizing the data BEST also uses all records within a radius of 2,000km (representing an area of 12.6 million sq km, not that much smaller than Russia) to estimate mean temperatures in a grid block. This approach does not make much difference when there are stations in or close to the grid block because the temperatures are distance-weighted (I believe using kriging weights), but when there are none temperatures may be projected into the block from many miles away. A common result is to find grid blocks in the middle of nowhere with temperature time series that go back well into the 19th century. Some BEST grid blocks in the middle of the Sahara Desert, for example, have scattered data going back almost to 1790. I have yet to find which station these data came from, but it certainly isn’t in the Sahara Desert.
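As a rough check on the size of that 2,000km search radius, the sketch below computes the area both as a flat circle and as a cap on a sphere. The Earth radius of 6,371km is a standard mean value and the function names are hypothetical; nothing here reproduces how BEST itself handles distances.

```python
import math

EARTH_RADIUS_KM = 6371.0   # mean Earth radius (assumed for the cap calculation)

def flat_circle_area(radius_km):
    """Area of a circle on a flat surface."""
    return math.pi * radius_km ** 2

def spherical_cap_area(radius_km, earth_radius_km=EARTH_RADIUS_KM):
    """Area of the cap within a given surface distance on a sphere."""
    angle = radius_km / earth_radius_km
    return 2 * math.pi * earth_radius_km ** 2 * (1 - math.cos(angle))

print(f"flat circle:   {flat_circle_area(2000) / 1e6:.2f} million sq km")
print(f"spherical cap: {spherical_cap_area(2000) / 1e6:.2f} million sq km")
```

The flat-circle figure is the 12.6 million sq km quoted above; allowing for the Earth's curvature makes the area only slightly smaller.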


56 Responses to Homogenizing the World

  1. Dave Rutledge says:

    Hi Roger,

    Thank you for the detailed analysis. Calculation based on first differences would not eliminate urban heat islands, correct?


  2. Retired Dave says:

    Thank you Roger. A very thorough and honest analysis.

  3. sam Taylor says:


    When you end up with a significantly different result than 5 independently produced data series, it’s a bit of a leap to assume that you’re correct and everyone else is wrong. Some statistical analysis would be nice too. I suspect that Euan’s result in his post about Iceland isn’t statistically significant. Error bars are important.

    • Euan Mearns says:

      Roger shows that his series match 3 others pretty exactly in the N hemisphere and that no series match each other in the southern hemisphere.

      • Sam Taylor says:


        Roger’s is persistently the warmest in the southern hemisphere, by a fair bit in some regions.

        I’m currently trying to replicate your work on Iceland so that I can do some statistical analysis on it. Could you tell me which v3 dataset you used? You appear to have at least 2 stations active from 1880 in your v3 data, while I can only see one. I think that you’ve got Teigarhorn shifted about 4 years back.

        • Euan Mearns says:

          Sam, the stations I used are as in Figure 1. But you are correct, I’ve made a mistake with Teigarhorn, which explains why it was all over the shop in Figure 10. So I’ll have to add an erratum. But I don’t think this will make much difference.

        • Euan Mearns says:

          Sam, thanks for picking that up. I’ve had a quick look (in panic 🙁). What it does is complete the band of data deletion in the 1960s and actually makes things look worse. I’m happy for you to cross check; there’s so much data it’s easy to make mistakes. Good luck with cross editing V3.1.

          I’m away to walk my dogs and will replace the necessary charts later today.

          • Sam Taylor says:

            Easy mistake to make. I made a similar one myself when I put a v3 time series into v2 by mistake. I should really try to write a script to automate this. I don’t think it makes that huge of a difference, as you say. I’m going to have to dig around in some old statistics notebooks to remember how to do some of what I’m after.

          • Euan Mearns says:

            Sam, if you find any other mistakes please let me know. I’m working with another commenter on comparing GHCN to the IMO records and suspect we will have a series of posts on this. If you turn up anything with stats that we can agree on I will of course post that too. One statistical question to ask is whether larger than normal adjustments have been made in the narrow band around 1939. Data adjustments around the 1960s cooling are actually more complex than I thought. Removing my mistake actually straightens out a lot of the data, but we now have a new temperature peak in 1964!

            It’s very good that more people look at this and I hope one thing we can agree upon is that the data curation is a mess. And now that I’m looking at the IMO records it has become a mess cubed.

    • Sam: it’s a bit of a leap to assume that you’re correct and everyone else is wrong.

      Please take another look at Figure 5 and tell me who is right

  4. Yvan Dutil says:

    I think you see a coverage issue. As the coverage increases, the average temperature rise tends to increase because the Arctic gets more coverage.

    • Yvan: I think it would help if you read up on the difference between absolute temperatures and temperature anomalies.

      • Yvan Dutil says:

        The point is that temperature increases faster in the Arctic and in mountain ranges. Those regions tend to have shorter periods of observation. If you select long-duration sites, you miss those.

        This effect was observed with the International Surface Temperature Initiative data.

  5. William says:

    Roger, have you printed a list of stations and their coordinates somewhere? Or a map? Publishing the stations, their data and the spreadsheet would help us to replicate your work – and replicated it would be much stronger.

  6. A C Osborn says:

    Roger, there is something wrong with the Northern Hemisphere analysis.
    The reason that I say this is because your findings do not match the NCDC/GHCN/GISS Declared effect of their Homogenisation.
    TOBS alone should show at least a 0.35C warming as stated by Zeke here

    It would be interesting to see the difference between the Real USA data shown here (not up to date) with the other data sets.

    • A C Osborn says:

      That WUWT link should go to the USCRN graph here

      Which is the new USA weather station network that does not need any adjustments.

    • AC. As far as I have been able to tell – and I haven’t checked all X thousand records – homogeneity adjustments in the NH have been applied only in the US and Iceland. The US and Iceland make up only a small percentage of NH land area, so you don’t see them in the hemispheric series.

      • Euan Mearns says:

        Roger, what do the raw unadjusted records for the USA show?

      • A C Osborn says:

        Roger, GISS applies adjustments across Europe as well.
        see this as an example.

        You can see the actual Valentia Observatory Real values here

        compared to Phoenix Park

        • AC:

          The ~0.5C of warming added to the Valentia record will add ~0.0002 C of warming to the NH land surface air temperature time series assuming that the Valentia record covers a 2×2 degree grid block.
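The order of magnitude can be checked with a rough area calculation. The sketch below assumes a latitude of about 51.9N for Valentia, roughly 111.2km per degree, and a Northern Hemisphere land area of about 100 million sq km; all three are round-number assumptions introduced for this check, and the function names are hypothetical.

```python
import math

KM_PER_DEGREE = 111.2        # approximate length of one degree of latitude
NH_LAND_AREA_KM2 = 100e6     # assumed NH land area, roughly 2/3 of global land

def grid_block_area_km2(lat_deg, size_deg=2.0):
    """Approximate area of a size_deg x size_deg block centred on lat_deg."""
    height = size_deg * KM_PER_DEGREE
    width = size_deg * KM_PER_DEGREE * math.cos(math.radians(lat_deg))
    return height * width

def hemispheric_contribution(local_warming_c, lat_deg):
    """Warming added to an area-weighted NH land average by one grid block."""
    return local_warming_c * grid_block_area_km2(lat_deg) / NH_LAND_AREA_KM2

contribution = hemispheric_contribution(0.5, 51.9)
print(f"{contribution:.5f} C")
```

Under these assumptions a single adjusted record contributes a couple of ten-thousandths of a degree to the hemispheric series.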

          My idea in writing this post was to provide an objective assessment of how much warming surface air temperature adjustments add to the combined land+ocean series that are commonly (and again I will add incorrectly) used to quantify global surface warming. And the answer, as you Brits would say, is bugger all. The Earth’s surface really has warmed over the last 100 years or so. There’s no way of getting round it.

          But even if the adjustments don’t cause the warming a good case can still be made that the lack of warming in the SH before ~1970 is what causes the SH adjustments.

          • A C Osborn says:

            You miss my point Roger, I think you will find that practically all the NH Station records have been adjusted from their original data, just like Iceland and Valentia. It may not show up in GHCN Raw, because the original data was already adjusted by the host country’s Met Office.
            We have previously seen complaints from Sweden, Norway and Russia that GHCN have changed their data.

          • Euan Mearns says:

            Roger, your data and mine that is coming shows little warming on S hemisphere land. What chance that the southern SSTs warm but the land not?

          • I think you will find that practically all the NH Station records have been adjusted from their original data, just like Iceland and Valentia.

            No, I haven’t found that at all. I just checked another record at random – Verhojansk – no net adjustment there either. Give you a freebie though – Lihue in Hawaii, although the land area covered is negligible.

          • Euan Mearns says:

            Roger, the fact that virtually all S hemisphere records, USA and Iceland have been adjusted but nowhere else sounds fishy to me. Do you have a theory?

            I’d add that from what I observe, the mass adjustments to central Australia and southern Africa make very little difference. WTF is going on? I have a theory 😉

          • Euan: Rather than answer that question directly and have people accuse me of being a conspiracy theorist, I’m beginning to think the problem lies with what BEST calls a “temperature expectation” series and what the others as I recollect call a “reference series”. These are series that show what the homogenization algorithms (or the people who developed them) think the temperature series in a particular area should look like before the adjustments are applied, and the raw records are then adjusted to fit these series +/-. The problem with them is that if the raw records don’t show much in the way of warming the expectation/reference series will usually (although not always) manufacture some. The result is that raw records in the NH, many of which already show warming, tend to get left alone while raw records in the SH, many of which don’t, tend to get warming-adjusted. Maybe a post on this later if everyone isn’t homogenized out by then.

  7. Javier says:

    Great job, Roger. Thanks a lot.

    I think that finally we know what is going on with the temperatures. It does certainly make sense in the light of the abundant evidence on data adjustment, including recent articles from Euan.

    Iceland appears to be an exception within the Northern hemisphere.

    I wonder if the cooling of the Southern Hemisphere past is politically motivated. Most countries in that hemisphere should be on the receiving end of carbon compensation and would not be pleased that they are excluded from the global warming party.

  8. Pingback: The Hunt For Global Warming: Southern Africa | Energy Matters

  9. Pingback: AWED Energy & Environmental Newsletter: March 9, 2015 - Master Resource

  10. Pingback: Recent Energy And Environmental News – March 9th 2015 | PA Pundits - International

  11. William says:

    Roger, you posted your spreadsheet a while back. I was expecting to find some sort of grid system or some method of accounting for the area represented by each selected station, but there is none of that. Just simple averages of all stations.

    I reproduced your graphs: I assumed that the data on the 1st sheet were correct and reproduced the 2nd sheet from that. I get the same graphs that you published. Okay, big deal.

    Taking it a bit further, I went through the southern hemisphere stations listed in your spreadsheet and plotted a distribution of the points (using http://www.darrinward.com/lat-long/). The east and west southern hemispheres are shown in http://s23.postimg.org/ty9l6ckez/Allpoints_east.png and http://s16.postimg.org/da6xeaa8l/Allpoints_west.png (Note that the lat-long plotting script at darrinward.com hangs if given too wide a range of longitudes so I plotted east and west separately).

    As is shown in the plots, there are a lot of stations, some heavily concentrated, some covering wide areas. A surprising number of stations are on specks of land in the oceans. Does a land-only index normally cover these islands? Clearly they are land, but without area weighting their contribution to the hemisphere average is exaggerated.

    While extracting the lat/long of each station, I also categorized them as good, bad and in-between. My categorization is arbitrary: good stations were generally those that don’t have any obvious step changes or gaps. Bad ones have clear step changes and/or gaps. In-betweeners don’t qualify as good but were not clearly bad. All very vague – the quality of stations is hugely variable and classifying them consistently is difficult.

    I plotted the resulting SH points that I classed as good http://s11.postimg.org/ts2iehz6b/Goodpoints_east.png and http://s4.postimg.org/rwyjimvx9/Goodpoints_west.png

    As you can see there are only a third as many. South western Africa is empty but otherwise there is coverage in most places. The resulting temperature index is shown here: http://s11.postimg.org/bw2p7pqf7/Good_only_plus_coverage.png

    This shows much more SH warming. The NH line is unchanged from Roger’s plot, the global line uses only the good SH points and all of the original NH ones. Also shown are toy trend lines (3rd order polynomial) and a coverage line below zero (scaled so that 0 represents 118 stations and -0.5 represents 0 stations).

    Does this prove anything? It shows that station selection matters. Who knew? There are fewer station records in my index but they might arguably be more evenly distributed – the original distribution had high concentrations of stations in western South America, eastern Australia and south-eastern Africa plus many tiny-island stations. I think mine is probably better and can’t see why adding lots more probably unreliable stations (the ones I omitted) should improve matters. But on the other hand my selection criteria might well be biased, as might Roger’s.

    My conclusions, for what they are worth:

    1. To create an index from “raw” data you really do need to consider whether that data is any good. Raw data that is obviously wrong is useless unless it can be corrected.

    2. Comparing an index created from corrected data with one created from “raw” data is a misleading comparison if the raw data can be shown to be bad.

    3. Computing an index without taking into account the area represented by each station will give inaccurate and misleading results.

    • A C Osborn says:

      Having seen how there are real definite Step Changes in Climate I am not sure that it is valid to rate a station as “Bad” if it has one or more step changes, especially if any other stations around it show the same step, and especially if the stations are anywhere near the Sea or Mountains.
      Obviously if you can identify something like an instrument change at the step point then you could split the station.

      Why do you need to know what Area is represented by a station, it is measuring a single point on the surface and does not and cannot represent anything other than that. Just to illustrate what I mean take a look at this study of 3 stations within 1.4km of each other.

      I also do not understand your point 2, basically you are saying that NASA/NCDC/BEST are doing the wrong thing, because the whole point of their “Corrections” is to make Bad Stations Good.

      In your temperature chart the Globe follows the NH values, is that a factor of Quantities of Stations?
      As it appears that the much lower SH temperatures have no effect on the Globe.

      • William says:

        A big step like that in Portland (38.4 S,141.6 E) at 1900, where everything before is greater than nearly everything after, looks to me to be invalid. That is subjective, I agree. I don’t doubt that you can look through my list and say maybe some stations that I say are bad are actually okay and others that I say are okay are not. The same could doubtless be said of Roger’s original selection of stations from the hundreds (or thousands?) that he rejected. My judgement of what was good and bad was also not constant throughout – it took a long time and my judgement and patience certainly varied. Sorry!

        Area certainly matters. Are there any indices that don’t take it into account? Say you have a region where temps go up 1 degree in the west over a period and down 1 in the east. If we have 1 climate station in the east and 3 in the west and simply average the readings, the apparent change is biased upwards by the excess of stations in the west. If instead we averaged the 3 westerly stations first to get a single western value to match the single in the east, then averaging these two gives no area bias.
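The east/west example can be worked through in a few lines. This is a minimal sketch that assumes the two halves of the region have equal area; the function names are made up for the illustration.

```python
from statistics import mean

def simple_average(changes_by_region):
    """Average every station's change with no regard to where it is."""
    return mean(v for values in changes_by_region.values() for v in values)

def region_weighted_average(changes_by_region):
    """Average stations within each region first, then average the regions
    (equivalent to area weighting when the regions have equal area)."""
    return mean(mean(values) for values in changes_by_region.values())

# Three stations in the west warm by 1 degree, one in the east cools by 1
changes = {"west": [1.0, 1.0, 1.0], "east": [-1.0]}

print(simple_average(changes))           # 0.5
print(region_weighted_average(changes))  # 0.0
```

The simple average reports half a degree of warming for a region whose two halves cancel exactly, purely because the west has more stations.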

        By point 2, I mean to judge the quality of something, you need a reference of known quality against which to measure. If you are planing a piece of wood to be flat and square you don’t measure it against one your apprentice did earlier, but against a known square and a known flat edge. If you judge your homework according to whether it differs from mine, you will draw the wrong conclusions unless you know for sure that mine is correct. It makes no sense to judge a temp index according to its divergence from one computed from data known to contain errors.

        The fact that the global plot is closer to the NH curve in my graphs than in Roger’s is because the number of SH stations has decreased. And this has an effect because area matters – see above.

    • William: Thanks for the effort you have put in on this. It’s appreciated.

      I’ll be replying at length when I’ve had a chance to go through what you’ve done. But in the meantime could you please clarify what the lines on your “temperature index” plot signify? It’s a little difficult to tell which is which from the color scheme and I’m not sure which of them is the one showing “much more SH warming”.

      If you could supply a list of your “good” SH stations that would be helpful too.

    • Euan Mearns says:

      William, I’m away on a kind of vacation. Back home in a couple of days. I too really appreciate this comment. But there are a number of points that are put up for serious debate. Our different views on this issue give us different biases, and that makes it difficult for either of us to win the argument. We each need to go off and find the clinching logic to win that argument.

      1) How do we tell a good record from a bad one?
      2) Can bad records be corrected? I’d argue they should be discarded in favour of only using good records.
      3) Should records be area weighted? I’d argue no. Groups of good records should show same trend and the group should be area weighted after discarding bad records. This I believe was Roger’s original approach but Roger and I have yet to agree criteria for what is good or bad.
      4) Should machines be used to do this? Absolutely not IMO. There seem to be thousands of lazy B*rds out there not bothering to look at all that “raw” data.

      My approach at present is to throw out urban records that show warming. Wide open to criticism I know.


      • How do we tell a good record from a bad one?

        A good record matches the records around it. A bad record doesn’t. (You’re a geologist; think check assays).

        Can bad records be corrected? I’d argue they should be discarded in favour of only using good records.

        So would I.

        Should records be area weighted?

        If you have enough good records it doesn’t make any difference whether you area-weight or not. If you don’t have enough good records – well, you don’t have enough good records.

        Should machines be used to do this? Absolutely not IMO.

        There’s nothing wrong with machines. They can be great time-savers if you use them properly. The problem is with the people who use them.

        • Euan Mearns says:

          My question was for William 😉

          Appendix 2: Congruous Temperature Trends

          The UK provides a good example of congruous temperature trends. ALL records display the same structure over a large area. Lerwick on the Shetland Islands is 1200 kms north of Southampton on the south coast of England. If a UK climate record did not conform to this regional trend there would most likely be something wrong with it.

          Figure 12 Tmax, 5y running averages for 23 UK stations.

    • Roger Andrews says:


      Thanks for the list. Some graphs to look at and comments to consider.

      To make sure I understood what you did I recalculated the SH series using only the records that you had labeled “OK” and superimposed it (red line) on your blue line. The match is pretty good. My series shows slightly less overall warming presumably because you area-weighted the results and I didn’t. (Area weighting usually makes little difference. We see this in the “Verified” versus “RA” plots in the post, which compare area-weighted series with mathematically-averaged series.)

      The next graph compares my replication of your series with the series I calculated from the non-OK records you rejected, which amount to 64% of all the data (are the SH records really that bad?). It shows that you have preferentially rejected records that show cooling. This would be acceptable if there were solid reasons to believe that records that show SH cooling are intrinsically less reliable than those which show SH warming, but I don’t know of any.

      Your SH series still shows slightly less warming than GISS250 and CRU and a lot less warming than GISSmet, NCDC and BEST. (Note again that “William” is my replication of your series.)

      Finally on the question of island records being overweighted because of their small land area. A simple way of checking how much distortion this might cause is to throw the ~60 SH island records out (i.e. give them a land area of zero) and see what happens. And the answer is – effectively nothing, which isn’t surprising because air temperature trends over land and ocean areas are substantially the same:

      Over to you.

      • William says:

        Euan’s Q’s

        1. Telling good from bad is difficult, I agree. I think it should be possible to agree that some records are clearly bad and some are clearly good. There’s a lot in between that I spent a long time vacillating over and often just plumped for “medium”.

        2. Making bad records good is what homogenization is all about. There are many good people working on that who are certainly far more knowledgeable and probably also more able than me. They seem to think it is possible.

        3. As I explained to AC, omitting area weighting introduces a bias. There’s no getting around that. So if the aim is to create an index representative of the globe that exists where stations are irregularly spaced (as opposed to an imaginary world in which stations are equally spaced) it is unavoidable. That doesn’t mean that toy indices like those we create here are not fun or interesting, but they are not comparable with area-weighted ones.

        4. As for algorithmic index creation, I see no reason why not. UAH and RSS use models and automated processing to turn their myriad measurements into a single global temperature number. And as I mentioned some time back, the “after combining sources at the same location” option of GHCN v2 uses an automated method.

        BTW your UK congruous records plot is striking. I’ll try to do one myself.

        Roger, I didn’t area-weight. I don’t know how to do it and it seems a non-trivial exercise. Do you want to share your “verified” spreadsheet (if that is the area weighted one)? It is what I was after in the first place as I was curious how to do area weighting/masking that you had referred to.

        I don’t recognize your curve showing “rejected” records. If I average all non-good (i.e. “ok”) records in my list, assuming I didn’t screw up, the curve that results is: http://s28.postimg.org/ubr7ukpyl/SH_non_good.png

        As for my SH curve not following the others, I’m happy that my toy index is even in the same ballpark. My index has at least two sets of biases, my selection criteria overlaid on top of yours (I considered nothing that you hadn’t already preselected). It was interesting.

        Do you have a list of island stations – as used in your last graph (which seems to use the RA series – is that what I have?)

        • William:

          On area-weighting. My Verified series uses what I identify as “good” raw records to divide the Earth into 64 zones with different temperature trends, ranging in size from 100,000 to over 3 million square miles and extending over both land and ocean areas. Then I calculate the series for each zone by averaging the records in the zone and weight it by the area of the zone to obtain global and hemispheric means. It took a lot of time to put Verified together and I can’t easily summarize it on a spreadsheet. A full description of what I did would require several pages of explanatory text and scores of graphs.

          Yet despite all the effort I put into Verified I get pretty much the same results from RA, which is just a crude mathematical average of all the records (see Figs 1, 4 and 6 in the post). Had I known this before I started I could have saved myself a lot of work. But it does show that area-weighting is not a major source of bias.
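
          The two averaging schemes being compared can be sketched roughly as follows. This is not the Verified or RA calculation itself, just a minimal illustration for a single year: the zone names, areas and anomaly values are made up, and the function names are hypothetical.

```python
def zone_weighted_mean(records, zone_area):
    """Average the records within each zone, then weight zones by area.

    records:   list of (zone_id, anomaly) pairs for one year
    zone_area: dict mapping zone_id to its area (any consistent unit)
    """
    by_zone = {}
    for zone, value in records:
        by_zone.setdefault(zone, []).append(value)
    # Only zones that actually have records contribute to the total area.
    total_area = sum(zone_area[z] for z in by_zone)
    weighted = sum(zone_area[z] * sum(v) / len(v) for z, v in by_zone.items())
    return weighted / total_area


def simple_mean(records):
    """Plain average of every record, ignoring where the stations are."""
    return sum(value for _, value in records) / len(records)


# Hypothetical example: zone A is three times the size of zone B.
records = [("A", 0.25), ("A", 0.75), ("B", 1.5)]
zone_area = {"A": 3.0, "B": 1.0}

area_weighted = zone_weighted_mean(records, zone_area)  # (3*0.5 + 1*1.5) / 4
unweighted = simple_mean(records)                       # (0.25 + 0.75 + 1.5) / 3
```

          The two results differ only to the extent that station density and temperature trend both vary from zone to zone; where the zones show similar trends, the two means come out close together.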

          So what is? Record selection and adjustment, obviously. And in the context of adjustments I note once more that even after throwing all your “non-ok” records out the SH series still shows a lot less warming than NCDC and BEST.

          I don’t recognize your curve showing “rejected” records. That’s because it’s wrong. Your curve is right. I blew it. 🙁

          I’m happy that my toy index is even in the same ballpark. Why shouldn’t it be? You’re just averaging the data, and given the same set of records you are going to get more or less the same results as the big boys regardless of whether you do it by area-weighting, simple averaging or outlier-restricted kriging.

          Do you have a list of island stations? Here they are in order of appearance:

          Malden Is.
          Diego Garcia
          Christmas Is
          Isles Glorieuses
          Cocos Is
          St. Helena
          Easter Is.
          Juan Fernandez
          N. Amsterdam
          Gough Is.

          • William says:

            That’s because it’s wrong. Your curve is right. I blew it. That is good to know. I only checked it because your curve looked too clean when everything else I have plotted has been much more noisy. If it had also been noisy I might have believed that I really did have a selection bias against dropping temps. Like I said, categorization is difficult (you know that obviously, but readers might not) and avoiding bias doubly so.

            As for being pleased to be in the same ballpark, it pleased me because after all the fuss over the years from people claiming that there is no warming and that the indices are faked, it really should not be so easy to prove such arguments wrong.

  12. William says:

    The blue line is SH, as indicated by the legend at the top. It goes to > 1 whereas the original (plotted from your values directly here: http://s3.postimg.org/f7atixqhf/All.png) goes just above 0.5.

    A list of good/bad stations is now at http://pastebin.com/Vj7G5xRn

  13. Do you have a mean yearly temperature spreadsheet for your total stations by year? In reading your and William’s discussion it seems to me both of you are missing the forest by looking at the trees. The missing data is not missing; it simply shows up in another record of another station as noise of unknown amplitude, plus or minus. Take that comment of yours about the Sahara in 1790. That is noise, but the temperature came from somewhere and that is real data. Lowering the number of stations simply lowers the reliability of the result in the short term; it does not make it invalid. HadCRUT and the rest have a lot of noise as they and you try to make the impossible possible with perfect stations and placement. In theory you could measure the temperature of the earth from one station if it took a measurement every day or year or century or million years for 4 billion years. With longer times between measurements the short term results would be unreliable but the long term would be the same. The long term trend to us is a pretty short trend to the earth, so you need more stations than one. If you have enough stations, no matter if they are urban, rural, on land or near the sea, the reliability of the resulting data goes up. Simply throw all the records in the batch and the odd records get buried by the others, so the resulting noise with enough stations is pretty small for a global temperature even if the individual area is all over the place. What you wind up with is a time spread global temperature of varying reliability depending on stations in any one year. Increasing the spread to stations in, say, a ten year stretch would decrease the variability, as you would be taking more measurements even with fewer stations. Statistical sampling uses that approach all the time; doing the global temperature should be similar.
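
    The sampling argument in this comment can be illustrated with a small simulation. This is a sketch under a strong assumption: each station's error is independent random noise with the same spread, which real station records (with shared regional biases) need not satisfy. The function name, noise level and trial count are all made up for illustration.

```python
import random
import statistics


def spread_of_mean(n_stations, n_trials=2000, noise_sd=1.0, seed=0):
    """Standard deviation of the station-average across many simulated years.

    Each simulated station reports the true value (taken as 0) plus
    independent Gaussian noise with standard deviation noise_sd.
    """
    rng = random.Random(seed)
    means = []
    for _ in range(n_trials):
        readings = [rng.gauss(0.0, noise_sd) for _ in range(n_stations)]
        means.append(sum(readings) / n_stations)
    return statistics.pstdev(means)


if __name__ == "__main__":
    for n in (1, 10, 100):
        print(n, "stations: spread of the mean =", round(spread_of_mean(n), 3))
```

    Under this independence assumption the spread of the average shrinks roughly as one over the square root of the number of stations, which is the sense in which odd records get buried. Errors that many stations share in common would not average out this way.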

Comments are closed.