This post follows up on my recent Paraguayan temperature puzzle post on homogenization and temperatures in central South America. In it I offer some insights into how, specifically, the Berkeley Earth Surface Temperature (BEST) adjustment procedures contrive to show warming across the whole of South America while the BEST raw records show a mixture of warming and cooling trends.
Figure 1 of the Paraguay post showed a map of warming and cooling gradients since ~1950 at selected South American stations based on linear trends measured from GHCN v2 raw records. Figure 1 below shows the BEST version of this map based on linear trends measured from BEST’s raw records (all the data used in this post are downloaded from BEST’s station data site). It’s not directly comparable to the earlier Paraguay map because BEST measures warming from the beginning of the record rather than after 1950, but we still see the same concentration of blue dots in Paraguay and in parts of Chile. Overall two-thirds of the BEST raw records show warming and one-third show cooling:
Figure 1: Temperature gradients measured from BEST raw records at selected South American stations.
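For readers who want to reproduce the kind of gradient plotted in Figure 1, the calculation is an ordinary least-squares fit of temperature against time. The sketch below is a minimal illustration, not BEST's code. The function name and the assumption that a record is a list of (decimal year, temperature) pairs are mine, and the sample record is made up.

```python
def trend_per_century(record):
    """Least-squares slope of temperature against time, in degrees C per century.

    `record` is a list of (decimal_year, temperature_C) pairs. This layout is
    an assumption for illustration, not the format of the BEST data files.
    Missing months are simply left out of the list.
    """
    n = len(record)
    if n < 2:
        raise ValueError("need at least two points to fit a trend")
    mean_t = sum(t for t, _ in record) / n
    mean_y = sum(y for _, y in record) / n
    sxx = sum((t - mean_t) ** 2 for t, _ in record)
    if sxx == 0:
        raise ValueError("all points share the same date")
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in record)
    return 100.0 * sxy / sxx  # slope per year scaled to per century


# Made-up record that warms at exactly 0.01 C per year.
synthetic = [(1950 + i, 20.0 + 0.01 * i) for i in range(60)]
slope = trend_per_century(synthetic)
print(f"trend: {slope:+.2f} C/century")
```

A positive result corresponds to a warming (red) dot on the map and a negative result to a cooling (blue) dot.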
Figure 2 now shows what the map looks like after BEST adjusts the raw data. There is now no record in South America that shows cooling. All of them show variable amounts of warming.
Figure 2: Temperature gradients at selected South American stations after BEST homogeneity adjustments
These two graphics alone are enough to suggest that BEST’s adjustments to the South American records are warming-biased, but before we can confirm this we need to tie up the loose ends. To do this we will look first at a specific example that demonstrates a warming bias in BEST’s adjustment procedures, and then we will follow up with a discussion of how BEST’s procedures might have generated this bias.
There are far too many records in South America for me to deal with all at once, so again I will use Paraguay as an empirical example of the way BEST’s adjustment procedures work. Figure 3 shows BEST’s raw monthly record for Mariscal Estigarribia (henceforth Mariscal), a record in northwest Paraguay:
Figure 3: Mariscal raw record (reproduced from the BEST station data site)
The first steps in the adjustment process are to remove the obviously bogus readings (BEST does a reasonably good job of this) and, more importantly, to identify and quantify artificial shifts in the raw records. BEST identifies these shifts by comparing the raw records with a “temperature expectation” series, and comparing BEST’s Mariscal raw record with BEST’s Mariscal temperature expectation series identifies the four shifts shown by the black lines in Figure 4:
Figure 4: Shifts in the Mariscal raw record identified by BEST (reproduced from the BEST station data site)
Between them these four shifts define a net cooling bias of around 0.5C in the Mariscal record. (Note that all four are classified as “empirical breaks”; there is no record of any coincident station moves or time-of-observation changes.) And when the shifts are adjusted out the Mariscal series becomes, unsurprisingly, a close match to the Mariscal temperature expectation series:
Figure 5: Comparison of adjusted BEST Mariscal record with BEST temperature expectation series (reproduced from the BEST station data site)
But where did BEST get the temperature expectation series from? According to Rohde et al.’s 2013 description of BEST’s temperature averaging procedures, BEST obtained it from weather stations in the “local region” surrounding Mariscal:
… we have provided a “regional expectation” time series, based on the Berkeley Earth expected temperatures in the neighborhood of the station. This incorporates information from as many weather stations as are available for the local region surrounding this location
So let us now look at the weather stations in the local region surrounding Mariscal, which I have arbitrarily defined to be within a circle with a radius of 500km centered on Mariscal. Inside this circle there are 13 reasonably well-distributed records – Nueva Asuncion, Puerto Casado, Las Lomitas, Yacuiba, Bahia Negra, Rivadavia, Concepcion, Camiri, Robore, Tarija, Puerto Suarez, Corumba and Asuncion. A simple average of these records yields the “local” series shown in Figure 6. The trend line shows about two-tenths of a degree of cooling since 1950:
Figure 6: Mariscal “local” time series, constructed by averaging 13 records within 500km of Mariscal
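The "local" series is simple to construct: pick the stations inside the radius and take an unweighted year-by-year mean. The sketch below shows one way to do that under stated assumptions. The function names, the dictionary layouts and the coordinates in the example are hypothetical (they are not the real station locations), and the code averages whatever values it is given without converting them to anomalies first.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points given in decimal degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def stations_within(center, stations, radius_km=500.0):
    """Names of stations whose (lat, lon) lies within radius_km of center."""
    clat, clon = center
    return [name for name, (lat, lon) in stations.items()
            if haversine_km(clat, clon, lat, lon) <= radius_km]


def simple_average(series_by_station):
    """Unweighted mean, per year, of the records that report in that year."""
    totals = {}
    for series in series_by_station.values():
        for year, temp in series.items():
            running_sum, count = totals.get(year, (0.0, 0))
            totals[year] = (running_sum + temp, count + 1)
    return {year: s / c for year, (s, c) in sorted(totals.items())}


# Hypothetical coordinates and values, for illustration only.
center = (-22.0, -60.0)
stations = {"near": (-23.0, -60.0), "far": (-22.0, -50.0)}
selected = stations_within(center, stations)
local = simple_average({"near": {1950: 1.0, 1951: 2.0}, "other": {1951: 4.0}})
```

The 500km default mirrors the arbitrary radius chosen above; changing `radius_km` is how the more distant records would be brought in.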
Contrast this now with the BEST temperature expectation series for Mariscal. There’s a good peak-trough match, but the expectation series shows about a degree C more overall warming than the local series:
Figure 7: Mariscal local time series compared with BEST’s temperature expectation series for Mariscal
Where does this added warming come from? I can replicate it only by discarding all the local records that show cooling. The average of the remaining four records that show warming (Las Lomitas, Robore, Puerto Suarez and Corumba) is a respectably close match to the BEST expectation series:
Figure 8: Mariscal local time series with cooling records discarded compared with BEST’s temperature expectation series for Mariscal
And the match gets closer as we add more warming stations from outside the 500km “local” radius (Figure 9 adds the records from Corrientes, Resistencia, Formosa, Posadas, Ponta Pora and Pedro Juan Caballero):
Figure 9: Mariscal local time series with cooling records discarded and more distant warming records added compared with BEST’s temperature expectation series for Mariscal
Clearly BEST has based its temperature expectation series for Mariscal largely if not entirely on surrounding records that show warming. Records that show cooling are ignored or at least heavily de-weighted. How does BEST do this?
Now I don’t suppose for a moment that BEST went through the records one by one and deliberately threw out those that showed cooling. The culprit has to be BEST’s adjustment procedures, which sausage-machine large numbers of records into a homogeneous whole. (And the whole is very homogeneous. The BEST adjusted series for Paraguay is hardly distinguishable from the BEST adjusted series for Brazil and Argentina.) The question therefore becomes, exactly how do BEST’s adjustment procedures do this? Here is what I believe to be the sequence, although a lot of back-and-forthing goes on:
BEST assesses record reliability (the quotes are again from Rohde et al.):
… we assess the overall “reliability” of the record by measuring each record’s average level of agreement with the expected field at the same location.
Here’s our first indicator. The records around Mariscal that show warming agree with the Mariscal “expected field” (i.e. the temperature expectation series) and those that show cooling don’t. Hence the warming records will receive a higher reliability ranking than the cooling records.
BEST then uses the reliability rankings to weight individual records:
Another problem is unreliability of stations …… To reduce the effects of such stations, we apply an iterative weighting procedure.
How much difference does this weighting make?
The (reliability metric) is used as an additional deweighting factor for each station …. this metric has a range between 2 and 1/13, effectively allowing a “perfect” station to receive up to 26 times the score of a “terrible” station.
De-weighting the records around Mariscal that show cooling by factors of up to 26 would certainly explain why they disappear during the adjustment process.
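To put a number on this, here is a small worked example of the arithmetic. The weight limits of 2 and 1/13 come from the Rohde et al. quote, and the 4 warming and 9 cooling records match the count around Mariscal. Everything else is an assumption made purely for illustration: the trend values are made up, and giving every warming record the maximum weight and every cooling record the minimum is the most extreme case, not something the BEST documentation states.

```python
from fractions import Fraction

MAX_WEIGHT = Fraction(2)      # "perfect" station, per the quoted range
MIN_WEIGHT = Fraction(1, 13)  # "terrible" station, per the quoted range

ratio = MAX_WEIGHT / MIN_WEIGHT  # the factor of 26 mentioned in the quote


def weighted_mean(values, weights):
    """Weighted average of values."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Made-up trends in C/century: 4 warming records, 9 cooling records.
trends = [Fraction(1)] * 4 + [Fraction(-1, 2)] * 9

equal = weighted_mean(trends, [Fraction(1)] * 13)
extreme = weighted_mean(trends, [MAX_WEIGHT] * 4 + [MIN_WEIGHT] * 9)

print(f"equal weights:   {float(equal):+.2f} C/century")
print(f"extreme weights: {float(extreme):+.2f} C/century")
```

With equal weights the nine cooling records outvote the four warming ones. With the extreme weights they contribute almost nothing, and the average sits close to the warming records' trend.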
And, as noted in the second quote, final weights are assigned by an iterative weighting procedure:
The determination of the weighting factors is accomplished via an iterative process that seeks convergence. The iterative process generally requires between 10 and 60 iterations to reach the chosen convergence threshold of having no changes greater than 0.001°C in Tavg between consecutive iterations.
This iterative process is probably where the damage is done. Exactly how isn’t clear, but it may have to do with the fact that the iterations use every record within a 2,000 km radius (up to 300 are used to construct BEST’s final Paraguay series). Since about two-thirds of these records show warming and only one-third show cooling, we might expect the iteration process to progressively de-weight the cooling stations and converge on the warming stations.
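The mechanism suspected here can be illustrated with a toy model. This is emphatically not BEST's algorithm, which fits a spatial temperature field rather than a single number. It is a minimal sketch that assumes each station is reduced to one trend value and that "reliability" means closeness to the current weighted mean. The weight formula and the `scale` parameter are my inventions. Only the 2 to 1/13 weight range and the 0.001 convergence threshold are taken from the quoted text, and the threshold is applied here to the trend estimate rather than to Tavg.

```python
MAX_WEIGHT = 2.0
MIN_WEIGHT = 1.0 / 13.0


def reweight(trends, scale=0.5, tol=0.001, max_iter=100):
    """Toy iterative reweighting: stations far from the estimate lose weight.

    Returns (estimate, weights, iterations_used).
    """
    estimate = sum(trends) / len(trends)  # start from the unweighted mean
    weights = [1.0] * len(trends)
    for iteration in range(1, max_iter + 1):
        # Hypothetical reliability score, clamped to the quoted weight range.
        weights = [
            min(MAX_WEIGHT,
                max(MIN_WEIGHT, MAX_WEIGHT / (1.0 + ((t - estimate) / scale) ** 2)))
            for t in trends
        ]
        new_estimate = sum(w * t for w, t in zip(weights, trends)) / sum(weights)
        if abs(new_estimate - estimate) < tol:
            return new_estimate, weights, iteration
        estimate = new_estimate
    return estimate, weights, max_iter


# Made-up trends: two-thirds warming (+1.0), one-third cooling (-0.5).
trends = [1.0] * 8 + [-0.5] * 4
unweighted = sum(trends) / len(trends)
final, weights, used = reweight(trends)
print(f"unweighted mean {unweighted:+.2f}, after {used} iterations {final:+.2f}")
```

In this toy setup the majority pulls the first estimate toward itself, the minority then scores as less reliable, and each pass moves the estimate further toward the majority until the changes fall below the threshold.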
Another possible contributor is BEST’s “scalpel” approach, which divides individual raw records into separate records whenever a “break” in the record is identified. The problem here is whether the breaks BEST identifies are artificial. The downturns in the Paraguayan records that define the main period of cooling in the 1960s and 1970s are all identified as artificial breaks even when no break in the raw record is visible, which is most of the time.
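The cutting step itself is easy to picture. The sketch below covers only that step and assumes the break positions are already known. How BEST detects breaks and re-fits the fragments is not reproduced, and the function names and the sample record are hypothetical.

```python
def scalpel(record, break_indices):
    """Cut one record into independent segments at the given list indices."""
    cuts = sorted({i for i in break_indices if 0 < i < len(record)})
    edges = [0] + cuts + [len(record)]
    return [record[a:b] for a, b in zip(edges, edges[1:])]


def segment_means(segments):
    """Mean of each segment, to show the offset between the pieces."""
    return [sum(seg) / len(seg) for seg in segments]


# Made-up record with a 0.5 C step down after the third value.
record = [20.0, 20.0, 20.0, 19.5, 19.5, 19.5]
pieces = scalpel(record, [3])
means = segment_means(pieces)
print(pieces, means)
```

Once the record is cut, nothing ties the level of one piece to the level of the next. If the 0.5 C step in this example were a real climatic downturn rather than an instrument change, treating the two pieces as separate records would let later processing remove it.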
But regardless of its precise origin these results leave little doubt that BEST’s adjustment procedures have introduced artificial warming biases into the raw temperature records in and around Paraguay. (Yes, I know that in the Puzzle of Paraguay post I concluded that there really wasn’t enough information to say whether the Paraguayan records were cooling-biased or not, but the BEST adjustments would be the same in either case. They take no account of the potential impacts of international borders. The problem is structural and internal to BEST’s procedures.)
The warming bias is also continent-wide (and probably hemisphere-wide too, although I have not verified this). My tabulation of BEST’s raw and adjusted results for 86 South American stations is a little too large to fit in the post, but here’s a summary of it. (Note that the warming trends are calculated over the total length of the record):
BEST’s adjustments have lowered the range of warming/cooling observed in the raw records from 6.50°C/century to 2.95°C/century, showing that they have indeed gone some way towards homogenizing the data, but in doing so they have also roughly doubled the amount of warming shown by the raw records. And after adjustment every single record shows some level of overall warming. Not one shows a cooling trend.
The warming bias introduced by the BEST adjustments is illustrated graphically in Figure 10, which plots the trend-line gradients of each of the 86 raw records against the adjustment applied to them. The red line shows what an unbiased homogenization operator passing through 0.0 would look like. The trend line through the BEST adjustments shows an average warming bias of about a degree C relative to this line. (The two outlier records that show near-zero warming and zero adjustment are Bariloche and Bahia Blanca in Argentina. How these records managed to escape unscathed is not known.)
Figure 10: XY plot of raw record trend line gradients for 86 South American stations versus adjustments applied by BEST
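A plot of this kind can be summarized with two numbers: the mean adjustment and the least-squares line of adjustment against raw trend. The sketch below assumes that "adjustment" means adjusted trend minus raw trend, which is my reading of Figure 10 rather than something stated outright. The function name is hypothetical and the four data points are made up, not taken from the 86-station tabulation.

```python
def adjustment_summary(raw_trends, adjusted_trends):
    """Mean adjustment plus slope and intercept of adjustment vs. raw trend."""
    adjustments = [a - r for r, a in zip(raw_trends, adjusted_trends)]
    n = len(raw_trends)
    mean_raw = sum(raw_trends) / n
    mean_adj = sum(adjustments) / n
    sxx = sum((r - mean_raw) ** 2 for r in raw_trends)
    sxy = sum((r - mean_raw) * (d - mean_adj)
              for r, d in zip(raw_trends, adjustments))
    slope = sxy / sxx
    intercept = mean_adj - slope * mean_raw
    return mean_adj, slope, intercept


# Made-up trends in C/century, for illustration only.
raw = [-1.0, 0.0, 1.0, 2.0]
adjusted = [0.8, 1.0, 1.2, 1.4]
mean_adj, slope, intercept = adjustment_summary(raw, adjusted)
print(f"mean adjustment {mean_adj:+.2f}, slope {slope:+.2f}, intercept {intercept:+.2f}")
```

A negative slope means cooling records are adjusted upward more than warming records, which is what homogenization is expected to do. A mean adjustment well above zero means the adjustments have added net warming on top of that.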
Before closing I will briefly touch on two other problems with BEST’s adjustments. The first is that they homogenize the records so strongly that any variations in regional trends that might be present are frequently obliterated. As a result the BEST series are of limited use in studying regional trends.
The second is illustrated in Figure 11, which compares the raw record for the Amazonian city of Manaus with BEST’s adjusted series for the 1.6-degree grid block that Manaus is located in. There are some wild divergences between the raw record and BEST’s grid block series before 1940, but at least they both show similar amounts of warming after that.
Figure 11: Manaus raw record versus adjusted BEST series for the Manaus 1.6-degree grid block
The main feature of interest, however, is that the Manaus record goes back only as far as 1910 while the Manaus grid block record goes back to 1824. Where does the extra 86 years of data come from? From far away. According to BEST all of the data between 1886 and 1910 were projected into the grid block from stations more than 1000 km from Manaus, and almost all of the data before 1886 were projected in from three stations on islands in the Caribbean and Atlantic – St. Clair Trinidad, Codrington Barbados and Saint Vincent. Do we get meaningful results when we project temperatures from these three island records almost 2000 km south into the middle of the Amazon jungle? I suspect not.
Finally comes the question of the quality of the raw records BEST uses to extend temperatures in the Manaus grid block back before 1910. Figure 12 plots some of them. Words are superfluous.
Figure 12: Raw records used to extend the Manaus grid block record back before 1910.