Intercomparison of Present and Future Climates Simulated by Coupled Ocean-Atmosphere GCMs

PCMDI Report No. 66

Curt Covey*, Krishna M. AchutaRao*, Steven J. Lambert+, and Karl E. Taylor*
October 2000

*Program for Climate Model Diagnosis and Intercomparison, Lawrence Livermore National Laboratory, Livermore, California
+Canadian Centre for Climate Modelling and Analysis, Victoria, Canada

Corresponding author address: Dr. Curt Covey, Program for Climate Model Diagnosis and Intercomparison, Mail Code L-264, Lawrence Livermore National Laboratory, Livermore, CA 94550


We present an overview of results from the most recent phase of the Coupled Model Intercomparison Project (CMIP). This phase of CMIP has archived output from both unforced ("control run") and perturbed (1% per year increasing atmospheric carbon dioxide) simulations by 15 modern coupled ocean-atmosphere general circulation models. The models are about equally divided between those employing and those not employing ad hoc flux adjustments at the ocean-atmosphere interface. The new generation of non-flux-adjusted control runs is nearly as stable as the flux-adjusted runs and agrees with observations nearly as well. This development represents significant progress in the state of the art of climate modeling since the Second (1995) Scientific Assessment Report of the Intergovernmental Panel on Climate Change (IPCC; see Gates et al. 1996). From the increasing-CO2 runs, we find that differences between different models, while substantial, are not as great as would be expected from earlier assessments that relied on equilibrium climate sensitivity.

1. Introduction

This report summarizes several aspects of results from Phase 2 of the Coupled Model Intercomparison Project (CMIP2; see Meehl et al. 2000). CMIP was established in 1995 by the JSC/CLIVAR Working Group on Coupled Models, a part of the World Climate Research Program. CMIP may be regarded as an analog of the Atmospheric Model Intercomparison Project (AMIP; see Gates et al. 1999) for coupled atmosphere - ocean - sea ice general circulation models (coupled GCMs). These models simulate the physical climate system given external boundary conditions such as solar luminosity and concentrations of radiatively active gases and aerosols, in contrast to atmosphere-only GCMs (AMIP-type models) in which sea ice and sea surface temperature are prescribed to match observations. Coupled GCMs are the primary models used to assess possible global warming due to the anthropogenic greenhouse effect.

The first phase of CMIP (CMIP1) collected output from coupled GCM "control runs" in which external climatic forcings such as greenhouse gas concentrations are kept constant. Nearly all output fields were archived as seasonal (December-January-February and June-July-August) climatological means. The next phase, CMIP2, collected output from both model control runs and matching runs in which atmospheric carbon dioxide increases at the rate of 1% per year. Under this common scenario of radiative forcing, any differences among the models are due to differences in their responsiveness, e.g., their differing equilibrium climate sensitivities and rates of ocean heat uptake, which in turn arise from differences in resolution, other numerical aspects, and parameterizations of sub-gridscale processes. CMIP2 thus facilitates the study of intrinsic model differences at the price of idealizing the forcing scenario. No other anthropogenic climate forcing factors, such as anthropogenic aerosols (which have a net cooling effect), are included. Neither the control runs nor the increasing-CO2 runs in CMIP include natural variations in climate forcing, e.g., from volcanic eruptions or changing solar brightness. Each CMIP2 model simulation extends for 80 years. All CMIP2 model output variables were archived as four nonoverlapping 20-year means with the exception of surface air temperature, mean sea level pressure and precipitation, for which all 960 monthly means were archived. Details of the CMIP database, together with access information, may be found on the CMIP Web site.

The purpose of this report is to give an overview of the CMIP2 simulations with emphasis on common model successes and failures in simulating the present day climate, and on common features of the simulated changes due to increasing CO2. We pay extra attention to the three fields that CMIP provides at monthly mean time resolution. The other fields, lacking information about seasonal variations, are necessarily described in terms of annual mean quantities. Extensive analyses of seasonal variations in the CMIP1 control runs are given by Covey et al. (2000) and Lambert and Boer (2000), and more specialized studies of the CMIP database are summarized by Meehl et al. (2000) and on the CMIP Web site.

In this report we include 15 models from CMIP2: BMRC, CCCMA (CGCM1), CCSR, CERFACS (ARPEGE/OPA2), CSIRO (Mk2), DOE PCM, ECHAM3+LSG, GFDL (R15a), GISS (Russell), HadCM2, HadCM3, IAP/LASG, LMD/IPSL, MRI (Tokioka) and NCAR CSM. (Documentation of these models is available on the CMIP Web site.) We have not included the CMIP2 contributions from NRL or the older NCAR Washington and Meehl model. NRL's control run is only 3 years long, and Washington and Meehl regard their older model as superseded by the DOE PCM. On some figures we have also included a model that has not yet joined CMIP2: ECHAM4+OPYC3. This model has not yet been forced in a 1% per year CO2 increase scenario. We include its control run output in some of our figures at the request of the Intergovernmental Panel on Climate Change (IPCC), which will include this model's simulations of global warming in its upcoming Third Scientific Assessment Report. Some of the figures below may also appear in the IPCC Report.

The rate of radiative forcing increase implied by 1% per year increasing CO2 is nearly a factor of two greater than the actual anthropogenic forcing in recent decades, even if non-CO2 greenhouse gases are added in as part of an "equivalent CO2 forcing" and anthropogenic aerosols are ignored (see, e.g., Figure 3 of Hansen et al. 1997). Thus the CMIP2 increasing-CO2 scenario cannot be considered as realistic for purposes of comparing model-predicted and observed climate changes during the past century. It is also not a good estimate of future anthropogenic climate forcing, except perhaps as an extreme case in which the world accelerates its consumption of fossil fuels while reducing its production of anthropogenic aerosols. Nevertheless, this idealized scenario generates an easily discernible response in all the CMIP2 models and thus provides the opportunity to compare and possibly explain different responses arising from different model formulations.
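The timing implied by the scenario can be checked with a few lines of arithmetic. The sketch below assumes the 1% per year increase is compounded (each year's concentration is 1.01 times the previous year's); the function name is illustrative and not part of any CMIP software.

```python
import math

def years_to_multiply(factor, annual_rate=0.01):
    """Years of compound growth at `annual_rate` needed to scale CO2 by `factor`.

    Assumes compounding: concentration(t) = initial * (1 + annual_rate) ** t.
    """
    return math.log(factor) / math.log(1.0 + annual_rate)

# Time for CO2 to reach twice its initial value under the assumed compounding.
doubling_time = years_to_multiply(2.0)

# CO2 relative to its initial value at the end of an 80-year simulation.
ratio_after_80 = 1.01 ** 80
```

Under this assumption the doubling time comes out just under 70 years, so an 80-year run extends slightly past the point of doubling.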

2. Present-day climate

In this section we compare output from the model control run simulations with recent climate observations. It has become increasingly apparent that the detailed climate record of the past century (and indeed the past millennium) cannot be explained without considering changes in both natural and anthropogenic forcing (Tett et al. 1999; Santer et al. 2000; Crowley 2000). Since the CMIP control run boundary conditions lack these forcing variations, we focus on means and other statistics that we judge to be largely unaffected by them. In particular we do not discuss the climate variability simulated by the CMIP control runs. This topic has been addressed in more specialized studies (Barnett 1999; Bell et al. 2000; Duffy et al. 2000).

For our observational data base we use the most recent and reliable sources we are aware of, including Jones et al. (1999) for surface air temperature, Xie and Arkin (1997) for precipitation, and reanalysis of numerical weather prediction initial conditions for sea level pressure. We sometimes use multiple sources to provide a sense of observational uncertainty, e.g., reanalysis from both the European Center for Medium-Range Weather Forecasts (ERA15; Gibson et al. 1997) and the U.S. National Centers for Environmental Prediction (NCEP; Kalnay et al. 1996).

a. Global and annual means

Averaging over latitude and longitude to form global means reduces surface variables to one-dimensional time series. Additional averaging of monthly means to form annual means removes seasonal cycle variations (which can be substantial even for global means), providing a convenient entry point to three-dimensional model output. Figure 1 shows the resulting time series for CMIP2 control run surface air temperature and precipitation.
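A minimal sketch of the two averaging steps follows. It assumes a regular latitude-longitude grid with cos(latitude) area weights and equal-length months; the report does not state its weighting, and on a Gaussian grid the Gaussian quadrature weights would be the more accurate choice. All function names are illustrative.

```python
import math

def global_mean(field, lats_deg):
    """Area-weighted mean of field[lat][lon] on a regular lat-lon grid.

    Each latitude row is weighted by cos(latitude), which approximates the
    relative area of its grid cells (an assumption of this sketch).
    """
    weights = [math.cos(math.radians(lat)) for lat in lats_deg]
    row_means = [sum(row) / len(row) for row in field]
    return sum(w * m for w, m in zip(weights, row_means)) / sum(weights)

def annual_means(monthly_values):
    """Average consecutive groups of 12 monthly values into annual means.

    Months are weighted equally; differing month lengths are ignored.
    """
    n_years = len(monthly_values) // 12
    return [sum(monthly_values[12 * y:12 * y + 12]) / 12.0 for y in range(n_years)]
```

Applying `global_mean` to each monthly field and then `annual_means` to the resulting series gives one value per simulated year, which is the kind of time series plotted in Figure 1.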

The range among the models of global- and annual-mean surface air temperature is rather surprising at first sight. Jones et al. (1999) conclude that the average value for 1961-1990 was 14.0°C and point out that this value differs from earlier estimates by only 0.1°C. It therefore seems that several of the models (which simulate values from less than 12°C to over 16°C) are in significant disagreement with the observations of this fundamental quantity. Reasons for this situation are discussed briefly by Covey et al. (2000) in the context of the CMIP1 models. The models as a group also give a wide range of estimates for global- and annual-mean precipitation, compared with the best observed values from several sources (2.66-2.82 mm / day from Table 2 in Xie and Arkin 1997). Precipitation, however, is notoriously difficult to measure globally, and the observational uncertainty of its global and annual mean may not be smaller than the range of model-simulated values in Figure 1.

Perhaps the most striking aspect of Figure 1 is the stability of model-simulated temperature and precipitation. The stability occurs despite the fact that 6 of the 15 CMIP2 models refrain from employing ad hoc flux adjustments at the air-sea interface. Until a few years ago, conventional wisdom held that in order to suppress unrealistic climate drift, coupled ocean-atmosphere general circulation models must add such unphysical flux "corrections" to their governing equations. The 1995 IPCC assessment (Gates et al. 1996) diplomatically expressed the concern that "[f]lux adjustments are relatively large in the models that use them, but their absence affects the realism of the control climate and the associated feedback processes". The CMIP1 experiments were conducted at about the same time as this assessment was written. Covey et al. (2000) note that averaging the magnitudes of linear trends of global- and annual-mean surface air temperature gives 0.24 and 1.1 °C / century, respectively, for flux-adjusted and non-flux-adjusted CMIP1 models. For the CMIP2 models shown in Figure 1, however, the corresponding numbers for the average ± 1 standard deviation over each class of model are 0.12 ± 0.14 °C / century for the flux-adjusted models and 0.31 ± 0.31 °C / century for the non-flux-adjusted models. Nevertheless, it must be kept in mind that a small rate of global mean climate drift does not preclude strong local drifts at the surface and problematic long term drift in the deep ocean.
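The drift statistic quoted above (the average magnitude of linear trends, plus or minus one standard deviation over a class of models) can be sketched as follows. This is an illustration under stated assumptions, not the procedure actually used: it fits an ordinary least-squares line to each annual series and uses the sample (n-1) standard deviation, neither of which the report specifies. The model labels in the usage note are hypothetical.

```python
import statistics

def trend_per_century(annual_series):
    """Least-squares linear trend of an annual series, in units per 100 years."""
    n = len(annual_series)
    t_mean = (n - 1) / 2.0
    y_mean = sum(annual_series) / n
    cov = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(annual_series))
    var = sum((t - t_mean) ** 2 for t in range(n))
    return 100.0 * cov / var

def drift_summary(series_by_model):
    """Mean and standard deviation of |trend| over one class of models.

    `series_by_model` maps a model label to its annual-mean series.
    The sample (n-1) standard deviation is an assumption of this sketch.
    """
    magnitudes = [abs(trend_per_century(s)) for s in series_by_model.values()]
    return statistics.mean(magnitudes), statistics.stdev(magnitudes)
```

For example, `drift_summary({"model_a": series_a, "model_b": series_b})` would be called once for the flux-adjusted class and once for the non-flux-adjusted class. Taking magnitudes before averaging keeps warming and cooling drifts from cancelling.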

b. Long-term time means

Atmosphere: geographically distributed quantities

As noted above, most of the CMIP2 output variables are present in the database as 20-year means that average out the seasonal cycle. In this subsection we examine surface variables and other two-dimensional (latitude-longitude) quantities. To summarize the performance of the models, we interpolate their output to a common Gaussian grid with 128 longitudes and 64 latitudes and show both the model mean (the average over all the models) and the intermodel standard deviation (sdm). Where possible, we compare the model means for the control simulations with observations. Lambert and Boer (2000) demonstrate that the model mean exhibits good agreement with observations, often better than any of the individual models. High values of sdm indicate areas where the models have difficulty in reaching a consensus, implying reduced levels of confidence in the model results.
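The model mean and intermodel standard deviation can be sketched as below, assuming every model field has already been interpolated to the common grid. Whether sdm uses the sample (n-1) or population (n) form is not stated in the report, so the sample form here is an assumption, and the function name is illustrative.

```python
import statistics

def model_mean_and_sdm(fields):
    """Grid-point mean and intermodel standard deviation (sdm).

    `fields` is a list with one entry per model, each indexed as
    field[lat][lon] on the same grid.
    """
    n_lat, n_lon = len(fields[0]), len(fields[0][0])
    mean = [[0.0] * n_lon for _ in range(n_lat)]
    sdm = [[0.0] * n_lon for _ in range(n_lat)]
    for j in range(n_lat):
        for i in range(n_lon):
            values = [f[j][i] for f in fields]
            mean[j][i] = statistics.mean(values)
            sdm[j][i] = statistics.stdev(values)  # sample (n-1) form: an assumption
    return mean, sdm
```

Because both statistics are computed independently at each grid point, regions of large sdm can be read directly off a map as places where the models disagree.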

Results for which observations are available are presented as four-panel displays. The upper-left panel shows the model mean and sdm, the lower-left panel shows the observed field and the departure of the model mean from this observed field, and the lower-right panel shows zonal averages for the individual models and the observations. These three panels contain only output from model control runs. The upper-right panel gives the differences between the model mean for years 61-80 and years 1-20 for the enhanced greenhouse warming simulations, together with these differences normalized by their standard deviation among the models. Results in the upper-right panels will be discussed in Section 3.

Figure 2 displays results for annual mean surface air temperature (also known as screen temperature). Over most of the globe, the model mean differs from the Jones observations by less than two K, although larger differences are evident in polar regions. These annual departures are much less than the winter and summer season errors reported by Lambert and Boer (2000). The zonally averaged results for the individual models show that all are quite successful in reproducing the observed structure, except in the polar regions. sdm values show that the models tend to disagree in the polar regions and over high terrain but produce consistent simulations over ice-free oceans. This consistency may occur because the ocean components of coupled models tend to be more similar than their atmospheric components, or it may simply be due to the lack of terrain effects and strong horizontal gradients over open oceans.

Figure 3 displays results for annual mean sea level pressure. As demonstrated by sdm, the models are very consistent in their simulations. The largest variances occur in south polar regions and much of this results from extrapolation below ground. Comparison with the ECMWF/ERA reanalysis (Gibson et al. 1997) shows that the model mean is within 2 hPa of the observed field over most of the globe. The largest departures occur near Antarctica with lesser departures north of Scandinavia, Russia and western North America. The zonally averaged results demonstrate the agreement among the models. With the exception of one model and in the southern polar regions, the models agree with each other to within ~5 hPa. Also evident from the zonally averaged results, however, is the difficulty that models have in simulating both the position and depth of the Antarctic trough.

Figure 4 displays results for annual mean precipitation. It is evident from the relatively large sdm that the models have difficulty in producing consistent simulations. This result is expected because precipitation is a small scale process. Likely contributors to inconsistency among models include differences in horizontal resolution and sub-gridscale parameterization schemes. Precipitation is a difficult field to observe and thus one must be somewhat cautious in using it for evaluation purposes. (Comparison of surface air temperature, sea level pressure and precipitation with alternate observational datasets is given in Subsection (c) below.) Using the Xie and Arkin (1997) observations, we find that in general the models simulate ~1 mm / day too much precipitation in mid-latitudes and somewhat too little in the tropics. The models correctly simulate the position of the annual mean ITCZ slightly north of the Equator, but a striking disagreement with observations occurs in the South Pacific. Here the model mean has a second maximum band parallel to the Equator, but the observations have a maximum with a northwest-southeast orientation north of New Zealand. The zonally averaged results show that this "double ITCZ" problem is shared by several of the models.

Figure 5 displays the annual net heat flux into the oceans. For the model mean, the largest oceanic heat gain is in the eastern tropical Pacific and the largest losses occur over the Kuroshio, the Gulf Stream and the southern Indian Ocean. sdm is highest in the loss areas. Observations of this field, and of its components to be discussed shortly, are best considered as estimates, and this fact should be kept in mind when using them to evaluate models. Comparison with the observational estimates from da Silva et al. (1994) indicates that the model mean underestimates the heat losses from the oceans (as indicated by positive values of the differences between the model mean and observations) in regions where the ocean loses heat. The zonally averaged results show a large scatter among the models and a tendency for most of them to underestimate the heat loss from the oceans in the Northern Hemisphere and subtropical Southern Hemisphere and to overestimate the heat loss in the mid-latitudes of the Southern Hemisphere. There is also an indication that the models generally place the equatorial maximum in heat uptake slightly too far to the north.

It is useful to examine separately each of the components of the annual mean surface energy balance. These components -- net shortwave radiative flux, net longwave radiative flux, latent heat flux and sensible heat flux -- are presented in Figures 6-9 respectively for both land and ocean areas. Although the models all show the well known equator-pole contrast in annual mean shortwave flux, comparison of the model mean with the GEWEX Surface Radiation Budget (SRB) observational estimates (Whitlock et al. 1995) indicates that the shortwave flux is ~10% too high in the tropics and ~20% too low in temperate and polar latitudes, an error that is apparent in the zonally averaged results for most of the models. In addition, sdm values for shortwave flux reveal substantial inter-model variation within tropical latitudes in regions of strong convective activity. For longwave flux, both sdm and zonally averaged results show a very large scatter among the individual models, especially at high latitudes. The GEWEX SRB observations indicate that the model mean longwave flux is too large in the tropics and sub-tropics and too small in mid-latitudes and polar regions. The model mean errors in the downward shortwave flux and upward longwave flux will tend to oppose one another when the net surface radiative balance is computed. Latent heat flux exhibits the expected patterns including (for the model mean) large fluxes over warm cloud-free areas in the subtropical high pressure belts and low values over cloudy areas. Comparison with the NCEP reanalysis (Kalnay et al. 1996) indicates that the model mean has, with the exception of deserts, too little evaporation over land and too much over the oceans. The final and smallest component of the surface heat balance is sensible heat flux. Over land there is substantial inter-model variation of this quantity (sdm is nearly as large as the model mean flux) and, except for deserts, the models simulate values that are higher than those obtained by the NCEP reanalysis. Over ice-free oceans the model mean agrees much better with the reanalysis.

Finally we consider two additional surface fluxes that drive ocean circulation. For nearly all models the annual mean fresh water flux (Figure 10) shows large positive fluxes (i.e., precipitation exceeding evaporation) near the Equator with smaller positive values associated with mid-latitude storm tracks, and large negative values in the subtropical high pressure belts. The largest values of sdm for this quantity tend to be found in regions where the fresh water flux is into the oceans. Comparison with a crude observational estimate obtained by combining Xie-Arkin precipitation and NCEP reanalysis evaporation shows an error pattern very similar to that of precipitation, suggesting that deficiencies in the fresh water flux arise primarily from simulation of precipitation. For zonal annual mean surface wind stress (Figure 11) the largest values of the sdm over the oceans are found in the vicinity of Antarctica, the area where the model mean differs most from the NCEP reanalysis. These problems apparently result from the difficulties the models have in simulating the depth and position of the Antarctic trough in mean sea level pressure.

Atmosphere: zonally averaged quantities

We now turn to three-dimensional quantities, presented here (after zonal averaging) as latitude-height cross sections. Figure 12 shows zonally averaged annual mean air temperature. The pattern of model mean isotherms is qualitatively close to observations, but compared with the ECMWF/ERA reanalysis, the model mean is generally too cold in the troposphere and polar stratosphere and too warm at lower latitudes in the stratosphere. The magnitude of these errors is comparable to sdm, implying that the models produce fairly consistent simulations of temperature and that the errors are common to most of the models. Results for the individual models at 925 hPa confirm this situation for the cold bias at low levels. Corresponding results for specific humidity (Figure 13) display a fairly systematic underestimate in the low latitude troposphere, although the departure of the model mean from ECMWF/ERA reanalysis is rather small (~1 g / kg) and the pattern of the model mean in latitude-height space is again quite similar to observations.

Figure 14 shows zonally averaged annual mean zonal wind. In the lower troposphere the models agree rather well with each other and with the ECMWF/ERA reanalysis (to within ~2 m / s) except in the vicinity of the Antarctic Trough. Both sdm and the difference between the model mean and observations exhibit a noticeable increase with height. As a result the model simulations at upper levels depart qualitatively as well as quantitatively from observations, e.g., the double jet structure of the Southern Hemisphere is not well captured by the model mean. Results for the individual models at 200 hPa show the large scatter and further illustrate the problems in the Southern Hemisphere. The mass streamfunction (Figure 15) exhibits moderate inter-model variance at most latitudes and altitudes, with sdm ~ 20% of model mean values, and results for the individual models at 600 hPa show qualitative consistency in the size and position of the annual mean Hadley and Ferrel cells. Comparison of the model mean with ECMWF/ERA reanalysis indicates that the model-simulated Hadley circulation is in general too weak.

Ocean quantities

Turning to ocean variables, we examine the annual mean temperature (Figure 16) and salinity (Figure 17) at 1000 meters depth. (Sea surface temperature is closely coupled to surface air temperature over the oceans and is not explicitly discussed in this report.) For temperature at this level the models are generally consistent in their simulations (sdm < 1 K) except in the North Atlantic, subtropical Pacific and Indian Oceans, and in the Arabian Sea. Available observations (Levitus and Boyer 1994) indicate that the model mean is too warm over most of the ocean. The zonally averaged results show that outside the polar regions, all but one of the models simulate 1000 meter temperatures that are at or above (by up to ~2 K) the observations. An overly diffusive thermocline may be the root of this problem. The corresponding results for salinity exhibit relatively large sdm values and some spectacular departures from observations for individual models. Compared with observations (Levitus et al. 1994) the model mean is too saline in the Atlantic and Southern Oceans, and not saline enough in the Arabian Sea and Eastern Atlantic.

For the annual means of barotropic streamfunction (Figure 18), global overturning streamfunction (Figure 19) and Atlantic overturning streamfunction (Figure 20) we use three-panel displays because there are no complete observations of these quantities. Nevertheless it is noteworthy that the model means for all three agree qualitatively with conventional wisdom among oceanographers. Quantitative disagreement among models is most striking for the barotropic streamfunction off Antarctica, the global overturning streamfunction in the Southern Hemisphere and the Atlantic overturning streamfunction at ~1000 meters depth.

Poleward heat transport by the global ocean is given in Figure 21. In the upper left-hand panel, the upper dashed line is the model mean plus one sdm and the lower dashed line is the model mean minus one sdm. The model mean, which is not plotted, is half way between the two dashed lines. Observations of Trenberth and Solomon (1994) are shown as a bold solid line in both the upper-left and bottom panels. From these observations, it appears that over most of the ocean the model-simulated transport is generally too weak. The observations are uncertain, however. For example, an update (Trenberth 1998) of the Trenberth and Solomon data reduces the peak ocean heat transport in the Southern Hemisphere by nearly a factor of 2.

c. Global statistics

To begin to obtain a more quantitative picture of how well (or how poorly) the models agree with observations, we use a diagram developed by Taylor (2000). This technique, and others exhibited in this section, are part of the climate diagnostic software developed at the Program for Climate Model Diagnosis and Intercomparison (PCMDI). Selected PCMDI software tools and their documentation can be downloaded from the PCMDI Web site. We intend to make the software tools that produced Figures 22, 24, etc., public via this Web site.

Figure 22 is a Taylor diagram of the total spatial and temporal variability of three fields: surface air temperature, sea level pressure and precipitation. The variability shown in the figure includes the seasonal cycle but excludes the global mean. The radial coordinate is the ratio of modeled to observed standard deviation. The cosine of the angle of the model point from the horizontal axis is the spatio-temporal correlation between model and observation. When plotted in these coordinates, the diagram also indicates the root-mean-square difference between model and observation: this difference is proportional to the linear distance between the model point and the "observed" point lying on the horizontal axis at unit distance from the origin. Thus the diagram enables visualization of three quantities -- standard deviation normalized by observation, correlation with observation, and r.m.s. difference from observation -- in a two-dimensional space. This is possible because the three quantities are not independent of each other (Taylor 2000). Loosely speaking, the angular coordinate of the diagram gives the correlation between model and observation for space-time variations but contains no information about the amplitude of the variations; the radial coordinate compares the modeled and observed amplitude of the variations; and the distance between each model point and the "observed" point gives the r.m.s. model error.
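The geometric relation described above can be checked numerically. The sketch below computes the three statistics from paired model and observed values and confirms that the normalized r.m.s. difference equals the distance between the model point and the "observed" point at (1, 0). It is unweighted for brevity; an actual analysis of gridded data would presumably apply area weights, and the function names are illustrative rather than taken from the PCMDI software.

```python
import math

def taylor_statistics(model, obs):
    """Return (normalized std, correlation, normalized centered RMS difference).

    `model` and `obs` are equal-length sequences of values at matching
    space-time points. Means are removed first, so the overall bias is
    excluded from all three statistics.
    """
    n = len(obs)
    m_mean, o_mean = sum(model) / n, sum(obs) / n
    m_anom = [m - m_mean for m in model]
    o_anom = [o - o_mean for o in obs]
    sd_m = math.sqrt(sum(a * a for a in m_anom) / n)
    sd_o = math.sqrt(sum(a * a for a in o_anom) / n)
    corr = sum(a * b for a, b in zip(m_anom, o_anom)) / (n * sd_m * sd_o)
    rms = math.sqrt(sum((a - b) ** 2 for a, b in zip(m_anom, o_anom)) / n)
    return sd_m / sd_o, corr, rms / sd_o

def diagram_distance(ratio, corr):
    """Distance from the model point (radius=ratio, cos(angle)=corr) to (1, 0)."""
    x = ratio * corr
    y = ratio * math.sqrt(max(0.0, 1.0 - corr * corr))  # guard against rounding
    return math.hypot(x - 1.0, y)
```

The agreement follows from the law of cosines: with r the normalized standard deviation and R the correlation, the squared normalized r.m.s. difference is r^2 + 1 - 2 r R, which is also the squared distance returned by `diagram_distance`.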

The most striking aspect of the figure is the way it separates the three fields into distinct groups. This separation agrees with the familiar qualitative statement that models simulate temperature best, sea level pressure less well, and precipitation worst (e.g., Gates et al. 1996). For surface air temperature, all models achieve a correlation with observation > 0.93, and the standard deviation of space-time variations is within ± 15% of the observed value in nearly all models. (This achievement is especially noteworthy for the non-flux-adjusted models, which have no explicit constraints requiring surface temperatures to match observations.) For modeled sea level pressure, the correlation with observation falls mainly in the range 0.7-0.9; for modeled precipitation it falls in the range 0.4-0.7. The standard deviation of space-time variations is also modeled less well for precipitation and sea level pressure than it is for surface air temperature.

To provide a sense of observational uncertainty, we include two alternate observed data sets in Figure 22: ECMWF/ERA reanalysis ("e") and NCEP reanalysis ("n"). These data sets are plotted as if they were model output. For all three fields, the alternate observed data sets fall closer to the baseline "observed" point than any model does -- but not much closer than the closest model. For precipitation and surface air temperature, the r.m.s. difference between either of the reanalysis data sets and the baseline observations is more than half the smallest r.m.s. model error. Whether this result says something positive about the models or negative about reanalysis is unclear. More comparison between alternate sets of observations is provided in the following figures.

Figure 22 displays the total space-time variance of the model runs. It is also useful to examine individual components of the variance. Figure 23 shows how we divide a surface field (either model-simulated or observed) into components. Our procedure follows the usual practice in climatology, obtaining representations of increasingly detailed space-time behavior:

  1. the global and annual mean (not included in Figure 22)
  2. the zonal and annual mean, giving variations with latitude
  3. the annual mean deviations from the zonal mean, giving variations with longitude (mainly land-sea contrast)
  4. the annual cycle of the zonal mean, giving seasonal variations as a function of latitude
  5. the annual cycle of deviations from the zonal mean, giving the remaining variance (apart from interannual variations, which are not considered here)
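The five components listed above can be sketched as a straightforward sequence of averages and residuals. The version below assumes a monthly climatology stored as field[month][lat][lon], equal weighting of months, and cos(latitude) weights for the global mean; none of these choices is specified in the text, and the function name is illustrative.

```python
import math

def decompose(field, lats_deg):
    """Split field[month][lat][lon] into the five components of the list above.

    Returns (c1, c2, c3, c4, c5); at every point the components sum back
    to the original field value.
    """
    n_mon, n_lat, n_lon = len(field), len(field[0]), len(field[0][0])
    w = [math.cos(math.radians(lat)) for lat in lats_deg]

    # Annual mean at each grid point.
    ann = [[sum(field[m][j][i] for m in range(n_mon)) / n_mon
            for i in range(n_lon)] for j in range(n_lat)]
    # Zonal mean for each month and latitude, and its annual mean.
    zon = [[sum(field[m][j]) / n_lon for j in range(n_lat)] for m in range(n_mon)]
    zon_ann = [sum(ann[j]) / n_lon for j in range(n_lat)]

    # 1. global and annual mean (a single number)
    c1 = sum(w[j] * zon_ann[j] for j in range(n_lat)) / sum(w)
    # 2. zonal and annual mean, relative to the global mean
    c2 = [zon_ann[j] - c1 for j in range(n_lat)]
    # 3. annual mean deviations from the zonal mean
    c3 = [[ann[j][i] - zon_ann[j] for i in range(n_lon)] for j in range(n_lat)]
    # 4. annual cycle of the zonal mean
    c4 = [[zon[m][j] - zon_ann[j] for j in range(n_lat)] for m in range(n_mon)]
    # 5. annual cycle of deviations from the zonal mean (the remainder)
    c5 = [[[field[m][j][i] - zon[m][j] - c3[j][i] for i in range(n_lon)]
           for j in range(n_lat)] for m in range(n_mon)]
    return c1, c2, c3, c4, c5
```

Applying the same decomposition to a model field and to the observed field allows each component to be compared separately.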

In Figures 24-26 we divide the r.m.s. difference between each model and observation ("total error" of the model) into these components. The error component associated with the global and annual mean is called the bias, and the remaining error (the sum of components 2-5) is called the pattern error. The figures give -- from top to bottom -- the total error, the bias, the pattern error, and the remaining error components. For each component, errors are normalized by that component's observed standard deviation. The error amounts are color-coded so that blue indicates a small error compared with the observed standard deviation and red indicates a large error compared with the observed standard deviation.

Applying this metric to surface air temperature (Figure 24), we find that nearly all error components in nearly all models are small, particularly the annual and zonal mean components. For three of the models -- ECHAM4+OPYC3, HadCM2 and HadCM3 -- all of the error components are about as small as for ERA and NCEP reanalyses when the latter are included as extra "models". Turning to sea level pressure (Figure 25), we find that nearly all models have small errors for global and zonal means, but several of the models have large errors for more detailed space-time patterns. Surprisingly, even the NCEP reanalysis has a large "error" in one component (annual cycle of the zonal mean) when compared with the baseline observations from ERA. Turning to precipitation (Figure 26), we find that model errors are concentrated in the annual cycle of deviations from the zonal mean. Large errors in this component appear for all models except HadCM2 and the two reanalyses. Errors in the global and zonal means (including the seasonal cycle of the zonal mean) are small for all models. This situation is an improvement over earlier models in which even the global and annual mean precipitation value could be substantially erroneous, e.g., ~30% greater than observed in Version 1 of the NCAR Community Climate Model (Covey and Thompson 1989, Table I).

Figures 24-26 can also be used to sort models into flux-adjusted and non-flux-adjusted classes, as explained in the figure captions. Differences between these two classes of models are not obvious from the figures. This result reinforces the inferences made above that in modern coupled GCMs the performance differences between flux-adjusted and non-flux-adjusted models are relatively small (see also Duffy et al. 2000). Evidently, for at least the century-timescale integrations used to detect and predict anthropogenic climate change, several modeling groups now find it possible to dispense with flux adjustments. This development represents an improvement over the situation a decade ago, when most groups felt that coupled models could not satisfactorily reproduce the observed climate without including arbitrary (and often nonphysical) adjustment terms in their equations.

3. Increasing-CO2 climate

To begin our discussion of model responses to 1% per year increasing atmospheric CO2, Figure 27 shows global and annual mean changes in surface air temperature and precipitation under this scenario, i.e., differences between the increasing-CO2 and control runs. The surface air temperature results are similar to those shown in the 1995 IPCC report (Kattenberg et al. 1996, Figure 6.4). The models reach about 2 °C global mean surface warming by the time CO2 doubles around year 70, and the range of model results stays within roughly ± 25% of the average model result throughout the experiments. This rather narrow range contrasts with a greater spread of model output for experiments in which the models are allowed to reach equilibrium. The typical statement for the equilibrium results (from IPCC reports and similar sources) is that the surface warms by 3.0 ± 1.5 °C under doubled CO2. While it is understandable that the ultimate equilibrium warming is greater than the warming at the moment that CO2 reaches twice its initial value, it may seem surprising that the dispersion of results from different models -- a factor of 3 in the equilibrium experiments -- is reduced to ± 25% in the time-evolving (or "transient") experiments considered here.
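The arithmetic behind "year 70" and the two measures of spread quoted above can be checked with a short Python sketch. The helper names are ours, introduced only for illustration.

```python
import math

RATE = 0.01  # 1% per year compound increase in atmospheric CO2

def co2_ratio(years, rate=RATE):
    """CO2 concentration relative to its initial value after `years`."""
    return (1.0 + rate) ** years

def max_to_min_ratio(center, half_width):
    """Ratio of the largest to the smallest value in center +/- half_width."""
    return (center + half_width) / (center - half_width)

# Time at which compound 1% per year growth doubles the concentration.
doubling_time = math.log(2.0) / math.log(1.0 + RATE)
print(round(doubling_time, 1))  # 69.7

# Equilibrium statement: 3.0 +/- 1.5 degrees C, i.e. a factor of 3.
equilibrium_spread = max_to_min_ratio(3.0, 1.5)

# Transient statement: +/- 25% about the model average.
transient_spread = max_to_min_ratio(1.0, 0.25)
```

Expressed the same way as the equilibrium "factor of 3", the ± 25% transient range corresponds to a max-to-min ratio of about 1.7.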

The precipitation responses of the models span a much wider range than the temperature responses. As shown in Figure 27, the increase in global and annual mean precipitation at the time of CO2 doubling varies from about 0.03 to 0.15 mm / day, a factor of 5. Although the global means of both precipitation and surface air temperature increase in all of the enhanced-CO2 simulations, the correlation between precipitation increases and temperature increases is weak, as is the correlation between precipitation increases and the control run temperatures shown in the top panel of Figure 1.

Turning to geographical and latitude-height distributions, we recall that the upper-right panels of Figures 2-21 display changes simulated by the perturbation experiments. Contour lines give the model-mean difference between the first 20-year time mean and the last 20-year time mean of the 80-year simulations. This difference is the change over roughly 60 years, during which time atmospheric CO2 nearly doubles. The intermodel standard deviation (sdm) of these 60-year differences is used to normalize the model-mean differences. Absolute values of the normalized difference greater than one are shaded; they indicate that the changes simulated by the models have a reasonable degree of consistency, and therefore that one might have increased confidence in the results.
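The normalization just described can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the processing used for Figures 2-21: the array layout (model, year, latitude, longitude), the function name, the synthetic input, and the choice of the sample standard deviation (ddof=1) are all ours. The midpoints of the two averaging periods are 60 years apart, over which 1% per year growth multiplies CO2 by about 1.8.

```python
import numpy as np

def normalized_change(perturbed):
    """Model-mean change and its ratio to the intermodel standard deviation.

    perturbed: array of shape (model, year, lat, lon) holding 80 annual
    means from each model's increasing-CO2 run.
    """
    first = perturbed[:, :20].mean(axis=1)    # first 20-year mean
    last = perturbed[:, -20:].mean(axis=1)    # last 20-year mean
    diff = last - first                       # (model, lat, lon)
    model_mean = diff.mean(axis=0)
    sdm = diff.std(axis=0, ddof=1)            # assumption: sample std dev
    return model_mean, sdm, model_mean / sdm

# Synthetic stand-in for five model runs: a common trend plus noise.
rng = np.random.default_rng(0)
trend = np.linspace(0.0, 2.0, 80)[None, :, None, None]
runs = trend + rng.normal(size=(5, 80, 4, 8))

mean_change, sdm, normalized = normalized_change(runs)
shaded = np.abs(normalized) > 1.0  # grid points that would be shaded
```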

For surface air temperature (Figure 2) there is a globally averaged model mean increase of 1.73°C. The largest changes occur in the polar regions and over land areas. The increases exceed sdm by a factor of two over most of the globe. For mean sea level pressure (Figure 3) the polar regions and land areas exhibit a decrease and the oceans tend to exhibit an increase, an indicator of monsoon-like circulations developing as a result of land areas warming faster than ocean areas. The largest values of normalized sea level pressure difference are generally found in polar areas. Changes in precipitation (Figure 4) show an increase over most of the globe. The globally averaged model mean increase is 0.07 mm / day. Only a few areas -- generally in the sub-tropics -- exhibit a decrease. The largest values of normalized difference occur in high mid-latitudes and probably have an association with storm tracks. Changes in net heat flux (Figure 5) are generally positive, indicating a slight gain of heat by the oceans. The largest values are present off the east coast of North America and around Antarctica. The mean model change exceeds sdm over only a small area off the coast of Antarctica, indicating that although models generally transport heat into the oceans in global warming scenarios, the locations at which they do so vary. For the radiation components of the heat balance (Figures 6-7), the short wave flux exhibits little change while the upward long wave flux is generally reduced by amounts that exceed the sdm. Latent heat flux (Figure 8) exhibits a general increase in accord with the increase in surface temperature while sensible heat flux (Figure 9) decreases. The models simulate an increase in the fresh water flux in the tropics and high mid-latitudes and a decrease in sub-tropics (Figure 10). The changes are similar in sign to the control run results, indicating that dry areas will become drier and wet areas wetter. Large normalized differences occur only off Antarctica and in the Southern Hemisphere high pressure belt, however. For zonal wind stress (Figure 11) the model-mean change is small compared with sdm nearly everywhere.
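Globally averaged values such as those quoted above depend on how grid points are weighted. The report does not spell out the averaging procedure, so the following Python sketch simply assumes the common practice of weighting a regular latitude-longitude grid by the cosine of latitude. The function name and the synthetic field are hypothetical.

```python
import numpy as np

def global_mean(field, lat_deg):
    """Area-weighted mean of a (lat, lon) field on a regular lat-lon grid.

    Assumption: grid-cell area is proportional to cos(latitude).
    """
    weights = np.cos(np.deg2rad(lat_deg))
    zonal = field.mean(axis=1)                # zonal mean at each latitude
    return float(np.sum(weights * zonal) / np.sum(weights))

# Synthetic warming pattern that is largest near the poles.
lat = np.linspace(-87.5, 87.5, 36)
lon = np.arange(0.0, 360.0, 5.0)
pattern = (1.0 + 2.0 * np.abs(lat) / 90.0)[:, None] * np.ones((1, lon.size))

weighted = global_mean(pattern, lat)
unweighted = float(pattern.mean())
```

Because polar grid cells cover little area, a change that is largest in the polar regions contributes less to the area-weighted global mean than a simple average over grid points would suggest.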

Changes in model mean zonally averaged temperature as a function of height (Figure 12) show the expected pattern of warming in the troposphere and lower stratosphere and cooling in the remainder of the stratosphere. Changes in large areas of the troposphere and the stratosphere are more than twice sdm. Model mean zonally averaged specific humidity (Figure 13) increases everywhere and its changes are also large compared with sdm, consistent with the temperature changes. For mean zonal wind (Figure 14) the main change is an increase in speed of the mid-latitude jets; the models tend to agree on this change as indicated by the high levels of normalized difference in the jet cores. For the mass transport streamfunction (Figure 15) the model mean shows a poleward shift of the Ferrel cells and a slight strengthening of the Hadley Circulation.

Changes in model mean ocean temperature and salinity at 1000 meters depth are shown in Figures 16 and 17 respectively. In general, these changes are small. The models do produce consistent simulations of slightly increased temperature and salinity off the coast of Antarctica. The model mean barotropic streamfunction (Figure 18) decreases off Antarctica, indicating a slower Antarctic Circumpolar Current. As a result of the large scatter among models, however, the normalized differences are generally small. Model mean global overturning streamfunction (Figure 19) and Atlantic overturning streamfunction (Figure 20) decrease in magnitude, and there is a reasonable degree of agreement among the models. Results for ocean heat transport (Figure 21) are displayed differently: the solid line represents the model mean difference and the dashed lines are one sdm above and below the model mean. The enhanced greenhouse effect acts to reduce the ocean heat transport, consistent with the general slowdown in ocean circulation depicted in Figures 18-20.

4. Conclusions

Comparison of the CMIP2 control run output with observations of the present day climate reveals improvements in coupled model performance since the IPCC's mid-1990s assessment (Gates et al. 1996). The most prominent of these is a diminishing need for arbitrary flux adjustments at the air-sea interface. About half of the newer generation of coupled models omit flux adjustments, yet the rates of "climate drift" they exhibit (Figure 1) are within the bounds required for useful model simulations on time scales of a century or more. The flux-adjusted models exhibit less drift on average, however, and thus agree better with the limited information we possess on climate variations before the Industrial Revolution (e.g., Jones et al. 1998; Mann et al. 1999). Both flux-adjusted and non-flux-adjusted models produce a surprising variety of time-averaged global mean temperatures, from less than 12°C to over 16°C. Perhaps this quantity has not been the subject of as much attention as it deserves in model development and evaluation.

The spatial patterns of model control run output variables display numerous areas of agreement and disagreement with observations (Figures 2-21). As always, it is difficult to determine whether or not the models are "good enough" to be trusted when used to study climate in the distant past or to make predictions of the future. The global statistics shown in Figures 22-26 provide some encouragement. They indicate that the difference between a typical model simulation and a baseline set of observations is not much greater than the difference between different sets of observations. To the extent that different sets of observations (including model-based reanalyses) are equally reliable, this result implies that coupled GCM control runs are nearly as accurate as observational uncertainty allows them to be -- at least for the quantities highlighted by our global statistics.
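The global statistics of Figure 22 rest on a geometric relation: when a model is plotted at a radius equal to its normalized standard deviation and at an angle whose cosine is its correlation with observations, its distance from the OBSERVED point equals its normalized r.m.s. error. The Python sketch below checks this numerically for synthetic data. It assumes centered statistics (means removed) and unweighted averages, and the function names are ours.

```python
import numpy as np

def taylor_statistics(model, obs):
    """Normalized standard deviation, correlation, and normalized centered
    r.m.s. difference between two fields (unweighted, means removed)."""
    m = model - model.mean()
    o = obs - obs.mean()
    sigma_m, sigma_o = m.std(), o.std()
    corr = float(np.mean(m * o) / (sigma_m * sigma_o))
    rms = float(np.sqrt(np.mean((m - o) ** 2)))
    return float(sigma_m / sigma_o), corr, rms / float(sigma_o)

def distance_on_diagram(norm_std, corr):
    """Distance from the model point to the OBSERVED point at (1, 0)."""
    theta = np.arccos(np.clip(corr, -1.0, 1.0))
    x = norm_std * np.cos(theta)
    y = norm_std * np.sin(theta)
    return float(np.hypot(x - 1.0, y))

# Synthetic stand-ins for an observed and a simulated field.
rng = np.random.default_rng(1)
obs_field = rng.normal(size=(18, 36))
model_field = 0.8 * obs_field + 0.5 * rng.normal(size=obs_field.shape)

norm_std, corr, norm_rms = taylor_statistics(model_field, obs_field)
distance = distance_on_diagram(norm_std, corr)
```

The agreement between `distance` and `norm_rms` is the law of cosines applied to the two standard deviations and the correlation.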

The CMIP2 models do not yield the same simulation of climate change when they are all subjected to an identical scenario of 1% per year increasing CO2. The range of model-simulated global mean warmings, however, is less than the factor of 3 (1.5 - 4.5°C) uncertainty commonly cited for equilibrium warming under doubled CO2. Part of the explanation could involve the behavior of models not included in this report, which may give more extreme results than the CMIP2 models. The main reason for the narrower range, however, is that the response time of the climate system increases with increasing climate sensitivity (Hansen et al. 1984, 1985; Wigley and Schlesinger 1985). This relationship has been investigated in detail for the CMIP2 models by Raper et al. (2000). It introduces a partial cancellation of effects: models with larger sensitivity (greater equilibrium warming to doubled CO2) are farther from equilibrium than less-sensitive models at any given time during the increasing-CO2 scenario. As a result, the uncertainty in model predictions of future climate response may be less than the uncertainty in future anthropogenic forcing (Hansen et al. 1997). On the other hand, simulated precipitation increases differ greatly among the CMIP2 models and appear to have no simple relationship with simulated temperatures.
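The partial cancellation described above can be illustrated with a deliberately simple one-box energy balance model, C dT/dt = F(t) - (F2x / S) T, in which S is the equilibrium warming for doubled CO2. This Python sketch is not a substitute for the analyses cited above. The forcing value F2X, the effective heat capacity HEAT_CAP, and the assumption that forcing rises linearly in time to F2X at year 70 are illustrative choices that do not come from this report.

```python
import math

F2X = 3.7         # W m^-2, assumed forcing for doubled CO2 (illustrative)
HEAT_CAP = 40.0   # W yr m^-2 K^-1, assumed effective heat capacity
T_DOUBLE = 70.0   # years to CO2 doubling at 1% per year

def response_time(sensitivity):
    """E-folding time (years) of the one-box model for sensitivity S."""
    feedback = F2X / sensitivity              # W m^-2 K^-1
    return HEAT_CAP / feedback

def transient_warming(sensitivity, t=T_DOUBLE):
    """Analytic solution of C dT/dt = F2X * t / T_DOUBLE - (F2X / S) * T
    with T(0) = 0, evaluated at time t (years)."""
    tau = response_time(sensitivity)
    return (sensitivity / T_DOUBLE) * (t - tau * (1.0 - math.exp(-t / tau)))

sensitivities = [1.5, 3.0, 4.5]               # equilibrium warming, deg C
transient = [transient_warming(s) for s in sensitivities]
realized = [w / s for w, s in zip(transient, sensitivities)]

equilibrium_ratio = sensitivities[-1] / sensitivities[0]
transient_ratio = transient[-1] / transient[0]
print(round(equilibrium_ratio, 2), round(transient_ratio, 2))
```

In this sketch the response time is proportional to sensitivity, so the more sensitive cases have realized a smaller fraction of their equilibrium warming at year 70, and the max-to-min ratio of transient warmings comes out smaller than the factor of 3 in equilibrium.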

Expansion of the CMIP model output set has begun under auspices of the JSC/CLIVAR Working Group on Coupled Models, and analysis of the existing database is continuing. We encourage all interested scientists to contribute to this ongoing effort.

Acknowledgments. We owe thanks to Clyde Dease and Anna McCravy of the PCMDI computations staff for assistance with data processing and Web publication respectively, and of course to the modelers whose contributions have made CMIP possible. CC also thanks his fellow IPCC Lead Authors for extensive discussions of climate model evaluation. This work was performed under auspices of the U.S. Department of Energy by University of California Lawrence Livermore National Laboratory under contract No. W-7405-Eng-48.

References


Barnett, T. P., 1999: Comparison of near-surface air temperature variability in 11 coupled global climate models. J. Climate, 12, 511-518.

Bell, J., P. B. Duffy, C. Covey, L. Sloan, and CMIP investigators, 2000: Comparison of temperature variability in observations and sixteen climate model simulations. Geophys. Res. Lett., 27, 261-264.

Covey, C., and S. L. Thompson, 1989: Testing the effects of ocean heat transport on climate. Global and Planetary Change, 75, 331-341.

Covey, C., and Coauthors, 2000: The seasonal cycle in coupled ocean-atmosphere general circulation models. Climate Dynamics, in press [also available as PCMDI Report No. 51].

Crowley, T. J., 2000: Causes of climate change over the past 1000 years. Science, 289, 270-277.

da Silva, A. M., C. C. Young, and S. Levitus, 1994: Atlas of surface marine data 1994, Volume 1: Algorithms and procedures. NOAA Atlas NESDIS 6, U. S. Department of Commerce.

Duffy, P. B., J. Bell, C. Covey, L. Sloan, and CMIP investigators, 2000: Effect of flux adjustments on temperature variability in climate models. Geophys. Res. Lett., 27, 763-766.

Gates, W. L., and Coauthors, 1996: Climate models - Evaluation. Climate Change 1995: The Science of Climate Change, J. T. Houghton et al., Eds., Cambridge University Press, 229-284.

Gates, W. L., and Coauthors, 1999: An overview of the results of the Atmospheric Model Intercomparison Project (AMIP I). Bull. Amer. Meteor. Soc., 80, 29-55.

Gibson, J. K., P. Kallberg, S. Uppala, A. Hernandez, A. Nomura, and E. Serrano, 1997: ERA description. ECMWF Reanalysis Project Report Series No. 1, European Centre for Medium-range Weather Forecasts, Reading, UK, 66 pp.

Hansen, J., A. Lacis, D. Rind, G. Russell, P. Stone, I. Fung, R. Ruedy, and J. Lerner, 1984: Climate sensitivity: Analysis of feedback mechanisms. Climate Processes and Climate Sensitivity, Geophysical Monograph Series, Vol. 29, J. E. Hansen and T. Takahashi, Eds., American Geophysical Union, Washington, DC, 130-163.

Hansen, J., G. Russell, A. Lacis, I. Fung, D. Rind, and P. Stone, 1985: Climate response times: Dependence on climate sensitivity and ocean mixing. Science, 229, 857-859.

Hansen, J., M. Sato, A. Lacis, and R. Ruedy, 1997: The missing climate forcing. Phil. Trans. R. Soc. Lond. B, 352, 231-240.

Jones, P. D., K. R. Briffa, T. P. Barnett, and S. F. B. Tett, 1998: High-resolution palaeoclimatic records for the last millennium: Interpretation, integration and comparison with general circulation model control-run temperatures. The Holocene, 8, 455-471.

Jones, P. D., M. New, D. E. Parker, S. Martin, and I. G. Rigor, 1999: Surface air temperature and its changes over the past 150 years. Rev. Geophys., 37, 173-199.

Kalnay, E., and Coauthors, 1996: The NCEP/NCAR 40-year reanalysis project. Bull. Amer. Meteor. Soc., 77, 437-471.

Kattenberg, A., and Coauthors, 1996: Climate models - Projections of future climate. Climate Change 1995: The Science of Climate Change, J. T. Houghton et al., Eds., Cambridge University Press, 285-357.

Lambert, S. J., and G. J. Boer, 2000: CMIP: Evaluation and intercomparison of coupled climate models. Climate Dynamics, in press.

Levitus, S., R. Burgett, and T. P. Boyer, 1994: World Ocean Atlas 1994 Volume 3: Salinity. NOAA Atlas NESDIS 3, 99 pp.

Levitus, S., and T. P. Boyer, 1994: World Ocean Atlas 1994 Volume 4: Temperature. NOAA Atlas NESDIS 4, 117 pp.

Mann, M. E., R. S. Bradley, and M. K. Hughes, 1999: Northern Hemisphere temperatures during the past millennium: Inferences, uncertainties, and limitations. Geophys. Res. Lett., 26, 759-762.

Meehl, G. A., G. J. Boer, C. Covey, M. Latif, and R. J. Stouffer, 2000: The Coupled Model Intercomparison Project (CMIP). Bull. Amer. Meteor. Soc., 81, 313-318.

Raper, S. C. B., J. M. Gregory, and R. J. Stouffer, 2000: The role of climate sensitivity and ocean heat uptake on AOGCM transient temperature and thermal expansion response. J. Climate, submitted.

Santer, B. D., and Coauthors, 2000: Interpreting differential temperature trends at the surface and in the lower troposphere. Science, 287, 1227-1232.

Taylor, K. E., 2000: Summarizing in a single diagram multiple aspects of model performance. J. Geophys. Res., submitted [also available as PCMDI Report No. 55].

Tett, S. F. B., P. A. Stott, M. R. Allen, W. J. Ingram, and J. F. B. Mitchell, 1999: Causes of twentieth-century temperature change near the Earth's surface. Nature, 399, 569-572.

Trenberth, K. E., and A. Solomon, 1994: The global heat balance: Heat transport in the atmosphere and ocean. Climate Dynamics, 10, 107-134.

Trenberth, K. E., 1998: The heat budget of the atmosphere and ocean, in Proceedings of the First WCRP International Conference on Reanalysis, WCRP-104, WMO/TD-NO. 876, pp. 17-20.

Whitlock, C. H., and Coauthors, 1995: The first global WCRP shortwave radiation budget data set. Bull. Amer. Meteor. Soc., 76, 905-922.

Wigley, T. M. L., and M. E. Schlesinger, 1985: Analytical solution for the effect of increasing CO2 on global mean temperature. Nature, 315, 649-652.

Xie, P., and P. Arkin, 1997: Global precipitation: a 17-year monthly analysis based on gauge observations, satellite estimates, and numerical model outputs. Bull. Amer. Meteor. Soc., 78, 2539-2558.

Figure Captions

Fig. 1. Globally averaged annual mean surface air temperature (top) and precipitation (bottom) from the CMIP2 control runs.

Fig. 2. Summary of long-term time means for surface air temperature (K). The upper-left panel gives the control run 80-year mean averaged over all models (contours) and the intermodel standard deviation (color shading). The lower-left panel gives observed values (contours) and the difference between the control run model mean and the observations (color shading). The lower-right panel gives zonal averages for the individual model control runs and the observations. The upper-right panel gives the average over all models of the difference between the last 20-year mean and the first 20-year mean from the 80-year perturbation simulations, in which atmospheric carbon dioxide increases at a rate of 1% per year (contours), together with this difference normalized by the corresponding intermodel standard deviation (color shading).

Fig. 3. Same as Fig. 2 for mean sea level pressure (hPa).

Fig. 4. Same as Fig. 2 for precipitation (mm / day).

Fig. 5. Same as Fig. 2 for net surface heat flux (W / m2).

Fig. 6. Same as Fig. 2 for short wave radiative flux (W / m2).

Fig. 7. Same as Fig. 2 for long wave radiative flux (W / m2).

Fig. 8. Same as Fig. 2 for latent heat flux (W / m2).

Fig. 9. Same as Fig. 2 for sensible heat flux (W / m2).

Fig. 10. Same as Fig. 2 for fresh water flux (mm / day).

Fig. 11. Same as Fig. 2 for zonal wind stress (100 N / m2).

Fig. 12. Same as Fig. 2 for zonally averaged temperature (K).

Fig. 13. Same as Fig. 2 for zonally averaged specific humidity (g / kg).

Fig. 14. Same as Fig. 2 for zonally averaged zonal wind (m / s).

Fig. 15. Same as Fig. 2 for mass streamfunction (10^10 kg / s).

Fig. 16. Same as Fig. 2 for ocean temperature at 1000 meters depth (K).

Fig. 17. Same as Fig. 2 for ocean salinity at 1000 meters depth (psu).

Fig. 18. Summary of long-term time means for the barotropic streamfunction (Sv). The upper-left panel gives the control run 80-year mean averaged over all models (contours) and the intermodel standard deviation (color shading). The bottom panel gives zonal averages for the individual model control runs and the model mean. The upper-right panel gives the average over all models of the difference between the last 20-year mean and the first 20-year mean from the 80-year perturbation simulations, in which atmospheric carbon dioxide increases at a rate of 1% per year (contours), and this difference normalized by the corresponding intermodel standard deviation (color shading).

Fig. 19. Same as Fig. 18 for global overturning streamfunction (Sv).

Fig. 20. Same as Fig. 18 for Atlantic overturning streamfunction (Sv).

Fig. 21. Summary of long-term time means for northward global ocean heat transport (PW). The upper-left panel gives the observed values as a solid line; the dashed lines are the model mean plus and minus one intermodel standard deviation. The bottom panel gives zonal averages for the individual model control runs and the model mean. The upper-right panel gives the average over all models of the difference between the last 20-year mean and the first 20-year mean from the 80-year perturbation simulations, in which atmospheric carbon dioxide increases at a rate of 1% per year (solid line), and this difference plus and minus one corresponding intermodel standard deviation (dashed lines).

Fig. 22. Error statistics of surface air temperature, sea level pressure and precipitation. The radial coordinate gives the magnitude of total standard deviation, normalized by the observed value, and the angular coordinate gives the correlation with observations. It follows that the distance between the OBSERVED point and any model's point is proportional to the r.m.s. model error (Taylor 2000). Numbers indicate models counting from left to right in Figures 24-26. Letters indicate alternate observational data sets compared with the baseline observations: e = 15-year ECMWF/ERA reanalysis ("ERA15"); n = NCEP reanalysis.

Fig. 23. Example showing division of a model output field into space and time components.

Fig. 24. Components of space-time errors in the climatological annual cycle of surface air temperature. Shown are the total error, the global and annual mean error ("bias"), the total r.m.s. ("pattern") error, and the following components (explained in Figure 23): zonal and annual mean; annual mean deviations from the zonal mean; seasonal cycle of the zonal mean; and seasonal cycle of deviations from the zonal mean. For each component, errors are normalized by the component's observed standard deviation. The two left-most columns represent alternate observationally based data sets, ECMWF/ERA and NCEP reanalyses, compared with the baseline observations (Jones et al. 1999). Remaining columns give model results: the 10 models to the left of the second thick vertical line are flux adjusted and the 6 models to the right are not.

Fig. 25. Same as Figure 24 for mean sea level pressure. Baseline observations are from ECMWF/ERA reanalysis.

Fig. 26. Same as Figure 24 for precipitation. Baseline observations are from Xie and Arkin (1997).

Fig. 27. Globally averaged difference between increasing-CO2 and control run values of annual mean surface air temperature (top) and precipitation (bottom) for the CMIP2 models. Compare with Figure 1, which gives control run values.