Averaging Temperature Averages

I sent a link to my recent post The Hunt For Global Warming: Southern Hemisphere Summary to Professor Richard Muller at Berkeley, drawing attention to the gulf between Berkeley Earth Surface Temperature (BEST) for southern hemisphere land and the compilations produced by Roger Andrews and me (Figure 1), in the hope that he or his group might help us to understand where the discrepancies may lie. He passed this on to Steven Mosher to respond and we exchanged several emails. Most of this correspondence shall remain confidential, but suffice it to say that Mr Mosher pointed out that they have verified and tested BEST and, since neither I nor Roger had documented verification of our methods, the onus was on us to do so.

In summary, Roger Andrews’ (RA) compilation for Southern Hemisphere land has good geographic cover and uses 369 records. My compilation (EM) to date uses 174 records and is specifically designed to sample low population areas. It nevertheless gave a result very similar to RA. Since 1882 BEST runs at about 0.7˚C per century warmer than either RA or EM (Figure 1). RA and EM are using GHCN V2 records that are “unadjusted”. A handful of spot checks in Australia show that these are the same records as unadjusted BEST. BEST is however using its own homogenisation algorithm, supposedly to correct for non-climatic artefacts. Roger has compared BEST adjusted with unadjusted records in South America and found a large positive bias in the homogenised set (Figure 2).

Figure 1 Comparison of RA, EM and BEST. BEST series based on monthly data downloaded from their site and recalculated to the equivalent metANN employed by GHCN (DJFMAMJJASON). RA series begins in 1882 hence this is used as the start point. All three series adjusted so that 1882 = zero. There is a reasonably high degree of congruity between EM and BEST, i.e. the peaks and troughs rise and fall together. Note how similar the BEST 1976 feature is to EM. But the gradients are totally different. 1880 to 2011 EM = +0.18˚C per century; BEST = +0.91˚C per century.

In this post I report on three simple tests of the methodology I employed, to see how robust it is. These are: 1) averaging the gradients of 5 individual records and of 9 groups of records, and comparing these with the gradient of the average dT determined on the same record stacks; 2) filling blanks with average data for the region and comparing this with the unfilled data; 3) decimating the data by removing 20% and 50% of the records.

Figure 2 A comparison of BEST homogenised and unadjusted records from S America. At the regional / continental level homogenisation is not supposed to introduce bias, but in BEST it evidently does (chart by Roger Andrews, Worst of Best.)

Averaging Anomalies

One of the things I have been told is that no one is averaging anomalies. Looking at a chart of dT spaghetti, both Roger and I independently concluded that the easiest way to see or measure the average trend was simply to take the average of the dT stack on the spreadsheet. But does the average dT gradient equal the average of the dT gradients measured for each station?
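
As a quick check of that question, here is a minimal sketch in Python (not my spreadsheet workflow) using synthetic, gap-free anomaly series; the station count and trend values are placeholders, not the actual GHCN data.

```python
import numpy as np

rng = np.random.default_rng(0)
years = np.arange(1880, 2012)

# Synthetic dT series for five hypothetical stations: a linear trend
# (degC per century) plus noise. Values are illustrative only.
trends_per_century = [-0.46, 0.73, -0.48, -0.14, 2.06]
stack = np.array([t / 100.0 * (years - years[0]) +
                  rng.normal(0.0, 0.3, years.size)
                  for t in trends_per_century])

def gradient_per_century(y, x):
    """Least-squares slope of y on x, scaled to degC per century."""
    return np.polyfit(x, y, 1)[0] * 100.0

# Method 1: average the gradients of the individual records.
mean_of_gradients = np.mean([gradient_per_century(s, years) for s in stack])

# Method 2: take the gradient of the average of the dT stack.
gradient_of_mean = gradient_per_century(stack.mean(axis=0), years)

print(f"mean of gradients: {mean_of_gradients:+.2f} degC/century")
print(f"gradient of mean : {gradient_of_mean:+.2f} degC/century")
```

For complete series covering the same years the two numbers are mathematically identical, because the regression slope is linear in the data; the small differences reported below only arise once records have gaps and different start dates.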

In Figure 3 I show dT for 5 S Hemisphere stations with long records. One from Australia, one from South Africa, two from South America and one from Base Orcadas, just off the Antarctic peninsula. Three have cooling trends and two warming trends. Base Orcadas in particular has one of the strongest warming trends in the S hemisphere.

Figure 3 Five long S Hemisphere records are shown together with the average of the dT stack. The gradient of the average dT stack is +0.37˚C per century. This compares with averaging the gradients of each individual record, which works out at +0.34˚C per century.

Individually the gradients through each data series in ˚C per century are as follows:

Capetown: -0.46
Alice Springs: +0.73
Punta Arenas: -0.48
Asuncion Aero: -0.14
Base Orcadas: +2.06

Average = +0.34

This compares with a regression run through the mean of the dT stack =+0.37. The difference of 0.03˚C per century is immaterial.

Another way to look at this is to average the dT gradients for each of the 9 regions looked at, to compare this with the gradient of the average dT stack for the 9 regions combined (Figure 4), and to compare that with the gradient of the dT stack for all 174 records.

Sn Africa 1: +0.12
Sn Africa 2: -0.53
Central Australia: +0.14
Patagonia: +0.23
Antarctica +0.60
New Zealand: +0.15
Paraguay+: +0.01
Islands: +0.19
Peninsula: +1.6

Average: +0.28

This compares with the regression run through the average of the 9 regional dT stacks, which gives +0.20, and a regression through the average of all 174 records, which gives +0.18.

Figure 4 The traces of the averages of the 9 regions I have looked at so far are shown, and the 9 averages are themselves averaged. (Note that the two islands of Signy and Kerguelen are treated as a region.) The average of the 9 regional gradients is +0.28˚C per century. The regression through the average of averages is +0.20˚C per century. Averaging the dT stack of all 174 records and running a regression through that produces +0.18˚C per century.

Note that while records within regions tend to be congruous the regions themselves are not. There can be a tendency for one region to be warm one year while another is cool.

This time there is a larger but still small difference of 0.08˚C per century between the average of the 9 dT gradients and a regression through the average of the 9 regional stacks. I suspect this may be linked to the discontinuous data series that affect, in particular, Antarctica and the Islands. In the average of the 9 groups, Antarctica has equal weight to each other group. In the regression through the combined dT stack its weighting is significantly reduced since the data only begin in the 1950s. So let’s take a look at the impact of discontinuous data on the averaging process.

Averaging Discontinuous Data

Discontinuous data arises from station records starting and stopping at different times. Of my 174 records, only two run from 1880 to 2011: Capetown and Alice Springs. All other records are incomplete. Out of a possible 22,794 annual average values (131 years × 174 stations) there are only 9,636 recordings. The record is only 42% complete. This most certainly creates potential issues prior to 1900, where the record may become biased by the relatively small number of stations. But what of all those blanks and data discontinuities? Might they impart bias to a regression through the average of the dT stack?

To evaluate this I have filled the blanks in all records with the average for the stack from each region. To put this in fancy language, I have projected a regional mean into areas with no data. In reality I filled in blanks in my spreadsheet with regional average values. The number of filled cells is now 21,473, or 94% complete (Figure 8). It is not 100% complete because certain regions do not have data spanning all the way back to 1880, hence there are periods in some records where there is no regional average with which to fill the blanks, for example Antarctica, The Peninsula and Patagonia.
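
The fill itself is simple. Here is a minimal sketch, assuming the stack is held as a pandas DataFrame of years by stations with a hypothetical region label per station; the station names, region labels and values are invented, not the GHCN records.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
years = np.arange(1880, 2012)

# Hypothetical dT stack: rows = years, columns = stations, NaN = no record.
data = pd.DataFrame(rng.normal(0.0, 0.5, (years.size, 6)), index=years,
                    columns=["aus1", "aus2", "saf1", "saf2", "ant1", "ant2"])
data.iloc[:70, 4:] = np.nan                     # e.g. no Antarctic data pre-1950
region = {"aus1": "Aus", "aus2": "Aus", "saf1": "SAf",
          "saf2": "SAf", "ant1": "Ant", "ant2": "Ant"}

def fill_with_regional_mean(df, region_of):
    """Replace each blank with that year's mean of the stations in the same
    region; cells stay blank if the whole region has no data that year."""
    filled = df.copy()
    for reg in set(region_of.values()):
        cols = [c for c in df.columns if region_of[c] == reg]
        regional_mean = df[cols].mean(axis=1)   # NaNs are skipped
        for c in cols:
            filled[c] = filled[c].fillna(regional_mean)
    return filled

filled = fill_with_regional_mean(data, region)
print(f"complete before: {data.notna().to_numpy().mean():.0%}, "
      f"after: {filled.notna().to_numpy().mean():.0%}")
```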

Figure 5 This chart shows the regression through the sum of dT for the 174 records. The distribution of records is shown in Figure 7.

Figure 6 This chart shows the sum of the dT stack with blank records filled using the average values for each region to fill the blanks. The distribution of this synthesised record stack is shown in Figure 8. 

The difference between the empty-cell and the filled-cell regressions is 0.03˚C per century, effectively zero. The structure of the data has, not surprisingly, been changed. In particular, three high temperature spikes have been amplified. But the gradient is unchanged.

Figure 7 The annual count distribution of the 174 selected records. Only 42% of all possible annual records actually have data.

Figure 8 Filling blank records with regional means produces a synthetic data stack that is now 94% full. It is not 100% full because some regions do not have data in certain time periods with which to average and fill blanks.

Decimating Data

The final test is to see how sensitive the regression is to the number of stations included. The first test removes 20% of records by deleting every 5th record; the second removes 50% of records, i.e. every second record, which is quite a severe test with so few records at the outset.

Removing 1 in 5 records changes the gradient by +0.04˚C per century, which is immaterial (Figure 9).  Removing 1 in 2 records changes the gradient by -0.12˚C per century, a somewhat larger change (Figure 10). This latter test leaves only 3 records in the pre-1888 part of the stack which is far from representative, but it still makes little difference.
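
A minimal sketch of the decimation test, using a synthetic stack of flat-to-gently-rising records rather than the real 174, so the printed numbers are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(2)
years = np.arange(1880, 2012)

# 174 hypothetical records: ~+0.2 degC/century underlying trend plus noise.
n_stations = 174
stack = (0.2 / 100.0) * (years - years[0]) + \
        rng.normal(0.0, 0.5, (n_stations, years.size))

def gradient_per_century(y, x):
    return np.polyfit(x, y, 1)[0] * 100.0

def trend_after_dropping(stack, years, drop_every):
    """Delete every `drop_every`-th record and return the gradient of the
    average of what remains, in degC per century."""
    kept = np.delete(stack, np.arange(0, stack.shape[0], drop_every), axis=0)
    return gradient_per_century(kept.mean(axis=0), years)

print(f"all 174 records: {gradient_per_century(stack.mean(axis=0), years):+.2f}")
print(f"drop every 5th : {trend_after_dropping(stack, years, 5):+.2f}")
print(f"drop every 2nd : {trend_after_dropping(stack, years, 2):+.2f}")
```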

Figure 9 Deleting every 5th record produces little difference to the gradient of the dT stack.

Figure 10 Deleting every second record produces a small but recognisable change. This is a severe test, especially for the older part of the stack, which is reduced from 8 to 3 records.


In my original post I addressed two methodological issues:

  1. The normalisation procedure employed to produce the dT stack. Using the station average mean produced the same result as using a fixed base period of 1963 to 1992 (a sketch of the two normalisations follows this list).
  2. Area weighting of regions did not significantly modify the results.
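
A minimal sketch of the two normalisations, using one synthetic continuous record (not an actual station) to show why they give the same gradient:

```python
import numpy as np

rng = np.random.default_rng(3)
years = np.arange(1880, 2012)

# Hypothetical absolute annual means for one continuous station (degC).
temps = 18.0 + (0.2 / 100.0) * (years - years[0]) + \
        rng.normal(0.0, 0.4, years.size)

# Normalisation 1: dT relative to the station's own full-record mean.
dT_station_mean = temps - temps.mean()

# Normalisation 2: dT relative to a fixed 1963-1992 base period.
base = (years >= 1963) & (years <= 1992)
dT_fixed_base = temps - temps[base].mean()

# The two series differ only by a constant offset, so the fitted gradients
# are identical for a continuous record.
for name, dT in [("station mean", dT_station_mean), ("fixed base", dT_fixed_base)]:
    print(f"{name:12s}: {np.polyfit(years, dT, 1)[0] * 100.0:+.2f} degC/century")
```

Differences between the two approaches only appear when a record is broken into segments that are normalised separately, which is exactly the Tennant Creek issue discussed in the comments below.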

In this post I have applied some simple methodological tests. These tests may not apply to other data sets because the 174 records used here show little variance and are at the outset fairly homogeneous. Put simply, the majority of these records are flat to gently rising (+0.2˚C per century) and no matter how they are sliced and diced that is the trend they should show (Figure 11).

The argument that my 174 selected records are non-representative is true but carries little weight, since RA has a larger data set with more complete geographic cover of the S Hemisphere but produces a similar result. When I get around to expanding my data cover to more populated areas, many of which lie further north, I fully expect to find marginally steeper warming gradients. The discussion then will focus on whether this has to do with greater human working of the land surface or with the greater land area per degree latitude as you move northwards.

The question remains how Berkeley (and GISS) manage to produce 0.9˚C warming out of records that appear on average to carry a +0.2˚C signal. Roger Andrews discusses some of the possibilities in Worst of Best. He speculates that leading contenders may be 1) homogeneity adjustments that may remove real climatic signal, 2) weighting of stations based on their match to a regional expectation, stations that do not match an expectation of warming may be de-weighted out of existence and 3) projection of data into areas with no data from up to 1000 km away. This latter possibility, if correct, opens the door to projecting Northern Hemisphere warming into the Southern Hemisphere.

In the interest of openness and transparency the 1.9Gb of Berkeley code is freely available for everyone to run and read. Does anyone out there have a functioning Berkeley Earth platform? Clive?

Figure 11 Summary of the dT gradients through different data sets using different methods for data treatment.


46 Responses to Averaging Temperature Averages

  1. bobski2014 says:

    When I did (pretty basic) statistics back in the early 70’s when the world was cool, I recall being told it is unwise and irresponsible to average averages. Has logic changed in the interim ?

    • Euan Mearns says:

      Yes and No. Your point is amply demonstrated by my Figure 3, where no climatic significance should be attached to the gradient of 0.37˚C per century since this is heavily influenced by Base Orcadas, which represents a small microclimatic regime but is given 20% weight in this summation.

      Everyone uses averages. You start with Tmax and Tmin to give a daily average, which gives a false starting point. You then average days to a month and months to a year. And if you want to calculate the mean for a region you have little choice but to convert to anomalies and add the series together somehow.

      The point is to avoid artefacts that may skew results. I show here that averaging 9 groups gives a similar result to averaging the 174 component records.
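
      As a minimal sketch of that chain of averages (purely synthetic daily Tmax/Tmin for one invented station, not real data):

      ```python
      import numpy as np
      import pandas as pd

      rng = np.random.default_rng(4)
      dates = pd.date_range("2000-01-01", "2000-12-31", freq="D")

      # Hypothetical daily Tmax/Tmin (degC) with a seasonal cycle plus noise.
      tmax = 25 + 5 * np.sin(2 * np.pi * dates.dayofyear / 365) + rng.normal(0, 2, dates.size)
      tmin = tmax - rng.uniform(8, 14, dates.size)

      daily_mean = (tmax + tmin) / 2.0              # (Tmax + Tmin) / 2
      monthly = pd.Series(daily_mean, index=dates).groupby(dates.month).mean()
      annual = monthly.mean()                       # months averaged to a year
      print(monthly.round(1).to_dict())
      print(f"annual mean: {annual:.2f} degC")
      ```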

  2. mbe11 says:

    Your graphs show a clear cooling trend up to around 1975 with a slight warming after that, which makes the trend line horse pucky as you have something resembling a partial sine wave with a half cycle of around 100 years.

  3. mbe11 says:

    Forgot to reference the proposed cycles.

    87 years (70–100 years): Gleissberg cycle, named after Wolfgang Gleißberg, is thought to be an amplitude modulation of the 11-year Schwabe Cycle (Sonnett and Finney, 1990),[30] Braun, et al., (2005).[31]
    210 years: Suess cycle (a.k.a. “de Vries cycle”). Braun, et al., (2005).[31]

    • Euan Mearns says:

      Agreed that a more sophisticated approach to curve fitting and analysing the distribution may be merited. That mid-1970s feature is for example rather strange but seen throughout many Australian and Sn African records. It is very real. Not so pronounced in Roger’s stack so I guess it is not present everywhere and gets diluted by adding more records and geography.

      Someone previously posted a link showing this was linked to higher rainfall. The point is that there must be a meteorological process that lies beneath this. One interpretation might be a step change in temperature across the feature.

      Pulling all this data together is incredibly time consuming but one of the aims is to provide regional perspectives and pieces of a jig saw that can be linked to real climatic processes. For example, 1943 ± a few years was extreme cold in western Europe but anomalously warm east of Urals. There has to be an explanation in atmospheric circulation patterns.

      • mbe11 says:

        Your article on sea level rise might be a clue to adjustment of the adjustments to the measured temperatures. Take midway island, out in the middle of nowhere, not on an active volcano or near a continental edge. The tide gauge has a trend of 1.19 mm a year but if you look at the actual data you have what appears to be a long term cycle that could be solar gain related with a delay factor as it takes the ice a while to melt and refreeze. The use of a simple trend line to what is a very short period of time used by both yourself and the AGW camp is in my opinion just bad science as it amplifies short term fluctuations and degrades longer term cycles even if a simple trend is short and easy to fit. If more long term records are available some other fit that fits better should be attempted. Of course my definition of short term is most likely a lot different than others as a hundred years is nothing to climate but a lot to a human. In any case your information is a tremendous dose of reality to the claims of AGW and from your requests to the BEST people a reflection of their lack of confidence in their own work in how they responded.

        • Euan Mearns says:

          This amply demonstrates your point. The regression here is clearly meaningless. In the N Hemisphere warming becomes a self fulfilling prophecy since the records begin late 1800s when it was cold. This is one group of congruent records that tell a story. But if you average this with another group the form of both groups is changed and becomes meaningless. What do you suggest?

          Not sure I’ve written an article on sea level rise?

          • mbe11 says:

            Sorry, what I meant is you and another poster were talking about climate change and sea levels came up.

  4. clivebest says:


    The Berkeley source code is for Matlab which costs over £1000 for a single user license. Is this what Steven Mosher calls ‘Open Source’!

    I note that their ‘source data’ seems to include every known data set in existence, except of course most of these overlap with each other. For example they include all the station files from HADLEY-CRU which themselves are a sub-set of GHCN2/3.

    I spent quite a bit of time studying the CRU station files a year or two ago. One interesting effect I found was a dependence on the way you do the geographic averaging of anomalies.

    The normal procedure followed by HAD-CRU is to calculate monthly averages for each month and each station between 1961 and 1990. These are then used to calculate anomalies for each station, which are averaged in each grid point. I decided instead to use the actual temperatures at each grid point resulting from the average of these stations. Then in a second step I calculate the monthly normals for each month at each grid point by averaging all the available data. There is no particular reason to take a fixed time period for the normals, since anomalies are just deviations from the norm. First I generated temperature grids from 1850 to 2010. I then used the grid monthly time series to derive normals per month for each grid point. Then we subtract the normal from grid temperatures to derive anomaly grids. Finally the area weighted and annual averages are derived. How does this compare with the standard result?
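
    A minimal sketch of that two-step grid procedure as I read it, with invented array shapes and synthetic temperatures rather than the CRU station files:

    ```python
    import numpy as np

    rng = np.random.default_rng(5)
    n_years, n_months, n_cells = 160, 12, 72

    # Hypothetical gridded absolute temperatures, already averaged from stations
    # into cells: shape (year, month, cell), with a seasonal cycle plus noise.
    seasonal = 10 * np.cos(2 * np.pi * np.arange(n_months) / 12)
    temps = 14 + seasonal[None, :, None] + rng.normal(0, 1, (n_years, n_months, n_cells))
    cell_lat = np.linspace(-87.5, 87.5, n_cells)      # one latitude per cell

    # Step 1: per-cell monthly normals from ALL available years (no fixed base).
    normals = temps.mean(axis=0)                      # shape (12, n_cells)

    # Step 2: anomaly grids = gridded temperature minus that cell's monthly normal.
    anomalies = temps - normals[None, :, :]

    # Step 3: annual mean per cell, then an area-weighted (cos latitude) average.
    annual_per_cell = anomalies.mean(axis=1)          # shape (n_years, n_cells)
    weights = np.cos(np.radians(cell_lat))
    annual_global = np.average(annual_per_cell, axis=1, weights=weights)
    print(annual_global[:5].round(3))
    ```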


    The blue curve is the new result. The trend prior to 1900 is totally different. There are systematic biases due to poor sampling. My conclusion is that you can’t believe anything prior to 1900. More here: http://clivebest.com/blog/?p=3153

    BEST probably do not do any better IMHO.

    • Euan Mearns says:

      Thanks Clive, here’s a couple of your nice graphics. I agree entirely that pre-1900 we have so few stations that the record prior to then is highly suspect. Looking at your map, I hadn’t fully appreciated just how poor cover was. By 1880 we must surely have a good deal more stations? It will take me a while to digest your methodology.

      If we could muster £1000 do you think we could learn something useful?

      • matthew_ says:

        Octave is open source and should, in theory, run their Matlab code. In practice there will probably be some things you have to change in order to get it to run. I’ve been using Octave as my Matlab substitute for a few years and am happy with it.

      • clivebest says:


        Before investing a huge amount of time, we need to understand what we are aiming to do. Matthew is right that we could use Octave, or simply read the code to understand exactly what they have been up to. We could probably get it all to work.

        As I see it there are the following questions.

        – Does the data homogenisation procedure introduce a warming bias?
        – Does their global averaging algorithm introduce a bias?
        – How do they define anomalies?
        – Do anomalies themselves introduce a bias in time?
        – Is there an urban heating effect?

        I have looked at some of these points in the Hadcrut3/4 data. There are definite sampling biases in the global average temperature. This is supposed to disappear when you use anomalies instead. However there is an underlying assumption with temperature anomalies, which is that warming occurs everywhere uniformly.

        Suppose that the tropics expand a little but temperatures inside the tropics remain the same. The tropics are undersampled so we may not see that.

        • Euan Mearns says:

          My specific request to Berkeley was to ask them to compute an average S Hemisphere T using the “raw” records. I haven’t a clue whether that would take them 1 hour to set up and run or 1 week since I don’t know how this works. I was told to do it myself. So that would be my starting point. I want to find out if the Southern Hemisphere really is as immune to radiative forcing by CO2 as Roger’s and my analysis suggests. But using raw records alone may not be sufficient to test this if station weighting and data projection are also implicated.

          It would be incredibly useful to have a critical breakdown on BEST methodology from homogenisation, to station weighting to data projection. I have simply been told that they have ground truthed their method by running synthetic data through a GCM.

          • clivebest says:

            They must have results for the southern hemisphere alone which they could simply email you – although probably from their ‘homogenised’ data. You’re right, the southern hemisphere has seen hardly any warming. e.g.


            The bulk of the observed warming is in the Northern Hemisphere. This is probably due to the dominance of oceans in the SH and the much reduced seasonal swings.

          • euanmearns says:

            Yes, BEST have published data for N and S for homogenised data – I plot the S in my post, Fig 1. I want to see the S for “raw” data.

            I started to look at seasonal data in E Siberia and see that most of the warming does indeed lie in the winter months which has major consequence for concern over permafrost.

          • Hugh says:

            most of the warming does indeed lie in the winter months which has major consequence for concern over permafrost.

            The temperature of winter air, ~-40°C, has little to do with the permafrost in the case where there is a thick layer of dry snow on top of the ground. I believe the difference is made in the autumn and the spring. If the snow comes after cold spells, the permafrost may grow. If the snow falls on warm ground, the winter does not form a good frost.

            You can improve the frost by walking / driving on snow, so we could send an army of Green, politically green, men to Siberia to save the permafrost by walking there. 🙂

          • Euan Mearns says:

            Hugh, to be clear, I meant that a couple of degrees warming in winter will make no difference to the permafrost and that summer temperatures are less affected. Seasonal shifts in warming distribution and timing of snow fall etc will have only marginal impact.

    • mbe11 says:

      Why not simply leave out the grid points (a source of bias), area weighting (a source of bias), altitude adjustments (a source of bias) and heat island effects (a source of bias)? Anomalies are also a source of bias as the reference years are entirely biased. What you have then is a world wide average temperature which may be high or low but has no man induced bias outside of the instrument and measurement errors. You could use the various studies of plankton proxies to then adjust the temperatures with the fewest possible man induced bias points. This would not work after about 1990 or so as the cores have problems, but before that you have a bunch of proxies from all over the world that go down to the year to adjust all your earlier temperatures.

    • Roger Andrews says:

      My conclusion is that you can’t believe anything prior to 1900.

      Mine too.

  5. A C Osborn says:

    Euan, in the past I have referenced a couple of graphs by Zeke from a thread over at WUWT from last year when they were having a go at Steven Goddard for using simply averaged Absolute Temperatures rather than gridded anomalies.
    But there was one graph I did not highlight at the time.
    Please take a look and possibly put them up for others to see; they show what must be Climatic Features, which are totally lost after using anomalies and gridding.

    The first is RAW USHCN USA both absolute and gridded.
    The Climate Features are not changed too much.

    The second is GHCN Global RAW Absolutes.

    The third which I have not pointed to before is the same GHCN data but using Anomalies and gridding.

    Just compare the 2 GHCN graphs for how much the data presentation is changed.

    The actual thread for anyone interested is here

    • Euan Mearns says:

      AC, thanks for the links. I think it is absolutely essential to use anomalies, otherwise the structure of the discontinuous data series will take over. I think that gridding and area weighting are also sound ideas, but then it depends on how this is done.

      I like the proposal that Earth temperature can be soundly modelled using 50 long continuous and reliable records. Roger has a model for climatic / temperature zones / areas that I hope he finds time to publish one day. That could be a starting point. One record from each zone weighted to the area of that zone.

      Trouble starts with too much processing. Projecting data into a grid block from outside of a zone will produce distortions. Homogenisation produces distortions, and weighting stations according to how well they match an expectation may produce distortions.

      One has to accept that pre-1900 there is a paucity of decent data. And that large parts of the surface have no data. We do not know what the temperature evolution was on Antarctica pre-1950.

      • I like the proposal that Earth temperature can be soundly modelled using 50 long continuous and reliable records. Roger has a model for climatic / temperature zones / areas that I hope he finds time to publish one day. That could be a starting point. One record from each zone weighted to the area of that zone.

        That would indeed make an interesting post, but as a practical matter I’ve found that it makes no difference how you area-weight the raw records, or indeed whether you area-weight them at all.

        The graphic below shows three of my reconstructions of global surface air temperatures since 1890.

    The red line is the “verified” series, where I selected 696 GHCN records that were congruous (and rejected a lot that weren’t), segregated them into 64 separate climate zones where temperature trends were different to those in surrounding areas, constructed a separate series for each zone and weighted the results by the area of the zone.

        The green line is the “unweighted” (“RA”) series. To construct it I added 204 of the rejected records to the 696 verified records and calculated an unweighted mean for these 900 records using a first difference approach.

        The blue line is the “latitude weighted” series, where I segregated the results of the unweighted series into latitude zones and weighted them by the cosine of latitude, so that records in the tropics will have a much greater impact than records in the polar regions.
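
    For what it's worth, here is a minimal sketch of the last two constructions as I understand them, with invented station data; the zone width, gap fraction and latitudes are arbitrary, not Roger's actual selection.

    ```python
    import numpy as np

    rng = np.random.default_rng(6)
    years = np.arange(1890, 2012)

    # Hypothetical station temperatures with gaps (NaN) and random latitudes.
    n_stations = 30
    temps = 15 + 0.006 * (years - years[0]) + rng.normal(0, 0.5, (n_stations, years.size))
    temps[rng.random(temps.shape) < 0.3] = np.nan
    station_lat = rng.uniform(-60.0, 60.0, n_stations)

    # "Unweighted" first-difference series: average the year-on-year changes
    # each station actually records, then integrate back to a curve.
    diffs = np.diff(temps, axis=1)                 # NaN where either year is missing
    unweighted = np.concatenate([[0.0], np.nancumsum(np.nanmean(diffs, axis=0))])

    # "Latitude weighted" series: 30-degree zonal means weighted by cos(latitude).
    zone_series, zone_weights = [], []
    for lo in range(-90, 90, 30):
        in_zone = (station_lat >= lo) & (station_lat < lo + 30)
        if in_zone.any():
            zone_series.append(np.nanmean(temps[in_zone], axis=0))
            zone_weights.append(np.cos(np.radians(lo + 15)))
    zone_series, w = np.array(zone_series), np.array(zone_weights)
    valid = ~np.isnan(zone_series)
    lat_weighted = np.nansum(zone_series * w[:, None], axis=0) / (valid * w[:, None]).sum(axis=0)
    ```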

        And all three series gave the same results. I haven’t drawn trend lines but they run almost exactly parallel. For what it’s worth they show 0.6C of warming since 1890.

        • A C Osborn says:

          “For what it’s worth they show 0.6C of warming since 1890. ”

          Shouldn’t that be 0.9C?

        • Euan Mearns says:

          So we need 64 and not 50 long continuous records, but I bet we would be lucky to have 32, i.e. a 50% hit rate in your 64 zones. Your chart has that very characteristic double-down cooling in the mid-70s, I’m guessing inherited from the Southern Hemisphere. In terms of ocean-atmosphere circulation I’d really like to know what caused that. What followed was different.


  7. Jeff says:

    You are not really applying tests in the way I understand SM to mean. To test a method of homogenization, you need to create a set of test station data with a known combined average trend and then modify that data by, for example adding random noise, adding steps up or down to represent changes in station position or changes to instrumentation, add biases to represent changes to the time of observation, etc. Then apply your method and see whether what you get matches the original known trend.

    Your problem in doing this is that you have no homogenization method. You assume that just adding up the data gives you the real trend and that if it doesn’t match what BEST/GHCN etc calculate then they are wrong. But they do have a method, good or bad, and they have done such tests. What you are assuming is, in effect, that none of the non-climate components of the “raw” data matters – that the trend should show through however badly the climate signal has been corrupted. Create a test data set and try your averaging – my bet is that it will be immediately evident that this is not true.

    • Euan Mearns says:

      Jeff, I’m off to bed. So brief reply:


      Globally however, the effect of adjustments is minor. It’s minor because on average the biases that require adjustments mostly cancel each other out.

      I will find you the same from GISS tomorrow.

    • Figure 2 is a reasonably robust test of BEST’s homogenization in S America. It manufactures about a degree of warming.

      • Jeff says:

        It is not a test, the graph just plots the results of using a method that has been tested on a set of inhomogeneous data. And the result is not “manufactured” warming but the warming inherent to that data – assuming that prior testing showed the method capable of resolving real trends from synthetic data (which I assume to be true, but have no knowledge of). You only need to look at the “raw” data for the example you used (Mariscal) to see that the data is not homogeneous.

        Whether just averaging inhomogeneous data can give a homogeneous result depends upon the data. If you have enough of it (say, data for the whole world) it might well – and indeed it seems to, according to Euan’s quotes. If you have insufficient data, say a local handful of stations, it almost certainly does not. So clearly progressing from the local to global level, simple averaging gets better – whether progressively or suddenly presumably depends on the data. Hence I see nothing wrong in principle with BEST having a substantial effect locally but not globally.

        You could clearly “test” your averaging with synthetic data if you wanted to. But it would not be worthwhile. Whereas the BEST method can reasonably be expected (if it works) to find the true trend whatever choice of artificial biases (station moves, TObs changes etc) you add to the synthetic data, averaging will only yield the true trend if you purposefully balance the biases (as many up as down). That in a nutshell is why averaging on much less than global scale (where biases do apparently cancel) is likely to be unreliable. So to take the results of averaging seriously you must make the implicit assumption that there is enough data that non-climate components cancel out over the area you are considering. Although that is not impossible, it seems unknowable and unwise to assume.

    • euanmearns says:

      Jeff, the point of this post is not to test homogenisation but the simple normalisation and averaging procedures I have used to create Figure 5, and to seek to understand why the gradient on that chart is totally different to BEST (Figure 1). I have no issue with meteorologists correcting their data for identifiable and measurable non-climate effects. But automated homogenisation disfigures data en masse, going way beyond any conceivable argument that it is non-climatic artefacts that are being removed. In the link I posted last night BEST acknowledge that the sum of non-climatic artefacts is close to zero. GISS also acknowledge this:

      Rather, these differences are dominated by the inclusion of appropriate homogeneity corrections for non-climatic discontinuities made in GHCN v3.2 which span a range of negative and positive values depending on the regional analysis. The impact of all the adjustments can be substantial for some stations and regions, but is small in the global means.


      So if non-climatic artefacts sum to zero why correct for them and disfigure all your data in the process and open the door for potential regional biases?

  8. Euan Mearns says:

    I have found a data example that proves my methodology to be wrong 😉

    Tennant Creek in Australia has two GHCN V2 records with 1 year overlap. One with subscript “mo” I assume to mean moved – they have the same GR.

    The gradient through Tennant original is +0.91˚C per century and Tennant mo is +0.40˚C per century. The average is clearly +0.66˚C per century (BEST have +0.86). However, my method of adding records together before determining the gradient yields +0.17˚C per century.

    Silly me 🙁

    PS the third chart simply splices the two records together, you can’t see the join.

    • Euan:

      If you calculated anomalies for 1926-1970 by subtracting the mean temperature for 1926-1970 (25.49C) and calculated anomalies for 1970-2011 by subtracting the mean temperature for 1970-2011 (25.76C), then you would have manufactured an artificial downward shift of 0.27C in the record in 1970, as shown in the graphic:

      I don’t know how much difference this would make to your 174-record series though, maybe not all that much (I think you ran a check in one of your earlier posts?) But it is a good example of why it’s desirable to use the same baseline to calculate anomalies for all records.
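
      A minimal sketch of that artefact with synthetic temperatures (not the actual Tennant Creek data), showing how per-segment baselines clip the trend while a common baseline preserves it:

      ```python
      import numpy as np

      rng = np.random.default_rng(7)
      years = np.arange(1926, 2012)

      # One continuous hypothetical station with a genuine warming trend (degC).
      temps = 25.3 + 0.006 * (years - years[0]) + rng.normal(0, 0.3, years.size)

      seg1 = years <= 1970
      seg2 = ~seg1

      # Wrong: each segment referenced to its own mean. The offset between the
      # two period means becomes a spurious downward step at the join.
      dT_own_base = np.where(seg1, temps - temps[seg1].mean(),
                                   temps - temps[seg2].mean())

      # Right: one common baseline for the whole record (here the full-record mean).
      dT_common = temps - temps.mean()

      slope = lambda y: np.polyfit(years, y, 1)[0] * 100.0
      print(f"own-base segments: {slope(dT_own_base):+.2f} degC/century")
      print(f"common baseline  : {slope(dT_common):+.2f} degC/century")
      ```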

      Incidentally. I believe “mo” is the Met Office station at the Tennant Creek airport. The original station was downtown at the Post Office.

    • Euan Mearns says:

      Roger, I agree entirely that using a fixed base period is superior. In this case adding the two separate dT records together produces a spurious result as you show. I think the chart above is doing this the correct way – I’ve joined the record first before calculating dT. This one uses the base period 1926 to 2011. Using 1963 to 1992 in this case gives the exact same result. This still leaves the issue that BEST have 0.90˚C per century for Tennant Creek and I now have about 12 charts 🙁

      I’ve tested station base against fixed base often enough now to be satisfied it is not introducing material bias. I imagine that for every instance of positive bias there is one of negative bias. You may recall that when I reduced all my data I used both methods and the fixed base gave a slightly lower gradient, which I did not expect.

      I got on to this by looking at the BEST Australian records the same way you looked at S America and note that they are driving regressions through many short records and getting warming trends often over 2˚C per century. Do you know how they combine the individual records to get a regional and global mean? In Australia they are cooling most records.

  9. Euan Mearns says:

    Tennant Creek has 2 GHCN V2 records. One runs from 1926 to 1970. The other (Tennant Creek MO) runs from 1970 to 2011. BEST has only a single record that runs from 1911 to 2013.

    Figure 1 The two segments of the GHCN V2 records. In the year of overlap (1970) the T difference is 0.07˚C. Note this chart is plotting T with annual averages that vary from 24 to 27˚C. Regressions through the individual segments show +0.91 and +0.40˚C per century.

    Figure 2 I have joined the records by taking the mean. A regression through all the data yields +0.52˚C per century.

    Figure 3 Converting to dT using the station average as the base yields the exact same gradient as the regression through the temperature stack.

    Figure 4 Converting to dT using the average of 1963 to 1992 as the base yields the exact same gradient as in Figures 2 and 3.

    BEST report +0.9˚C per century. How do they get there?

    Figure 5 Plotting the 1926 to 2011 segment of BEST raw data to be comparable with GHCN produces a gradient of 0.73˚C per century. The GHCN V2 and BEST raw are not exactly the same in this case.

    Figure 6 Plotting the 1926 to 2011 segment of BEST adjusted for the same time interval produces 1.01˚C per century.

    Looking at all the preceding charts I would judge that the trends are by and large flat with no clear sign of rising tops and bottoms. And it is very difficult to see differences between them. So this comes down to subtle distributions within range-bound data. BEST adjusted produces twice the warming of GHCN V2. I’m beginning to wonder if this is not a wild goose chase 🙁

    Figure 7 I want to check the XL regression against BEST. So this chart plots the full 1911 to 2013 BEST raw data and yields 0.83˚C per century. This compares with 0.86˚C quoted by BEST. That is close enough for me. There are blank months in the data and a small difference may be accounted for by how those are managed. I filled most but not all blank months with data from the prior year.

    Figure 8 Checking the regression again, this time with BEST adjusted produces +0.92˚C per century. This compares with what BEST quote to be +0.90˚C per century. Again, this is good enough for me. Note how 1914 was the warmest year.

    Figure 9 This chart shows the difference between GHCN V2 and BEST raw. The 1926 to 1970 part is quite similar though not identical. But there is clearly a problem post 1993. I suspect the issue may be with the GHCN V2 records that cease to be raw after that date. WHAT A MESS! The gradient through this dT is +0.2˚C per century, which accounts exactly for the difference between GHCN V2 and BEST raw 1926 to 2011.

    Figure 10 Finally subtracting BEST adjusted from BEST raw provides this picture that matches the scalpel slashes made to the data by BEST and accounts for the difference between the raw and adjusted gradients.


    So where does the truth lie? I suspect BEST raw (Figures 5 and 7) yield the most likely answer for now.

    • BEST report +0.9˚C per century. How do they get there?

      Euan, you have stumbled across the curious case of the revised records. Explanatory spreadsheet on its way.

  10. A C Osborn says:

    Someone else is going to be asking the same questions.

    With so many different people finding problems with the datasets over the last 10 years or so, GWPF have put together a group of different disciplines to look into it.

  11. firstdano says:

    It’ll matter if you publish, otherwise useless. Let us know when you publish.



