Coupled ocean-atmosphere climate simulations compared with simulations using prescribed sea surface temperature:  Effect of a "perfect ocean"

Curt Covey [1] , Krishna M. Achutarao, Peter J. Gleckler, Thomas J. Phillips, Karl E. Taylor, Michael F. Wehner [2]

Program for Climate Model Diagnosis and Intercomparison, L-103, Lawrence Livermore National Laboratory, Livermore, CA 94550, USA

Global and Planetary Change 41: 1-14 (2004)



Results from atmospheric general circulation models, run with sea surface temperatures (SSTs) and sea ice amounts set to observed values for the period 1979-1994, can be compared with "control run" simulations by the same atmosphere models coupled to interactive ocean and sea ice models.  The simulations with prescribed SSTs and sea ice are available from the Atmospheric Model Intercomparison Project (AMIP), and coupled ocean-atmosphere simulations are available from the Coupled Model Intercomparison Project (CMIP).  We compare CMIP runs from two coupled models sharing a common atmosphere component (but different ocean components) with the atmosphere component's AMIP run.  All three simulations have similar errors that presumably originate in the atmosphere component.  Replacing the observed SSTs and sea ice amounts in the AMIP simulation with the interactive sub-models used in the CMIP simulations tends to degrade the level of agreement with climate observations.  Increases in root-mean-square errors, however, are mostly less than 30% and often less than 10% of the magnitude of natural climate variations.  Exceptions to this rule occur mainly in the tropics, most notably for geopotential height at 500 hPa and for temperature near the tropopause.  These variables show increases in RMS error that are comparable to observational standard deviations.  The coupled model simulations are taken from the end of 300-year control runs without flux "corrections" at the ocean-atmosphere interface.  Their similarity to results from the prescribed-SST atmosphere model implies that modern coupled models can maintain stable multi-century simulations without flux adjustments.

1.  Introduction

Models of the global climate system are used to interpolate and explain paleodata related to past climate states, to study climate processes acting in the present and the recent past (including the effects of human-produced greenhouse gases and aerosols), and to forecast or "project" future climate.  Typical models include a three-dimensional representation of atmospheric motions, i.e., an atmosphere general circulation model (GCM), as one component.  Often an atmosphere GCM is run with lower boundary conditions that set sea surface temperatures and sea ice amounts equal to observed values.  For many purposes, however, including the simulation of global warming due to human-produced greenhouse gases, climate models must include interactive sub-models of the oceans and sea ice in addition to an atmosphere component.

These coupled climate models have exhibited a problem not evident in the behavior of atmospheric GCMs run with prescribed SSTs and sea ice.  Errors in fluxes of heat, momentum and water across the ocean-atmosphere interface can lead to "climate drift" away from observations.  Nonphysical, ad hoc flux adjustments were initially regarded as necessary to correct the problem.  The Second Assessment Report of the Intergovernmental Panel on Climate Change (IPCC) stated, "Flux adjustments are relatively large in the models that use them, but their absence affects the realism of the control climate and the associated feedback processes" (Gates et al., 1996).  The most recent IPCC Assessment Report, however, states, "Confidence in model projections is increased by the improved performance of several models that do not use flux adjustment" (McAvaney et al., 2001).

In this report we examine the performance of two recently developed coupled models that do not use flux adjustment.  We compare their results with output from their common atmosphere sub-model when it is driven by prescribed SSTs and sea ice.  From the difference between this output (with the atmosphere model constrained by surface observations) and each coupled model's simulation at the end of a 300-year control run, we quantify the extent of climate drift in the coupled models.

Boville and Hurrell (1998) compared simulations of two of the three models considered here: (1) the coupled Climate System Model and (2) its atmospheric component driven by prescribed SSTs and sea ice.  They examined traditional diagnostics of the climatological mean December-January-February and June-July-August states of the models.  In this report we focus on the root-mean-square errors of the models over broad latitude bands.  This approach complements the traditional diagnostics, providing more quantitative metrics of model performance but forgoing the benefits of expert pattern recognition.  Our conclusions (see below) reinforce those of Boville and Hurrell, that "[d]ifferences between the simulations are remarkably small".

2.  Model simulations and diagnostic procedure

We take model output from databases established by the Atmospheric Model Intercomparison Project (Gates et al., 1999) and the Coupled Model Intercomparison Project (Meehl et al., 2000; Covey et al., 2003).  A recent expansion of the CMIP database allows detailed study of model output.  The Climate System Model (CSM; Boville and Gent, 1998) and the Parallel Climate Model (PCM; Washington et al., 2000) were the first contributors to the expanded database.  Here we compare CSM and PCM control run atmospheric output with the AMIP output of their common atmospheric component, the Community Climate Model version 3 (CCM3).  Detailed documentation of all three models can be found on the AMIP and CMIP Web sites (which contain links to the CSM and PCM Web sites).  There is only one nontrivial difference between the version of the CCM3 contributed to AMIP and the earlier versions of the atmosphere model used in the CSM and PCM: the correction of a one-line coding error in the calculation of convective available potential energy (Byron Boville and Thomas Bettge, personal communication).  The substantial differences between the CSM and the PCM arise instead from their use of independently developed ocean and sea ice models.

The CCM3 AMIP simulation has SSTs and sea ice amounts prescribed to monthly mean observed values for the period 1979-1994.  CMIP control run simulations have external climate forcing (solar irradiance at the top of the atmosphere, atmospheric carbon dioxide concentration, etc.) held constant, as in AMIP, but SSTs and sea ice amounts calculated as part of the simulation.  Perfect agreement with observations could not occur in either AMIP or CMIP even if the models were perfect.  In AMIP, the absence of time-varying aerosol amounts means that the effects of major volcanic eruptions during the simulated time period (El Chichón in 1982 and Pinatubo in 1991) are missing.  In CMIP control runs, simulations represent long-term climate equilibrium states that cannot be matched with specific calendar years.  For example, major El Niño/Southern Oscillation (ENSO) events in 1982 and in the early 1990s should appear in AMIP output as a consequence of the specified SSTs, but the CMIP control runs at best can only simulate the general statistics of ENSO events.  In addition, neither AMIP nor CMIP included changes in natural climate forcing (such as varying solar brightness and aerosols from volcanic eruptions) or anthropogenic climate forcing (such as human-produced greenhouse gases and aerosols).  The SST and sea ice response to both types of forcing is implicitly included in the AMIP experiments.  For coupled ocean-atmosphere model control runs, however, we expect some divergence between simulations and the actual time-evolving climate of the late 20th century, as represented in both the observations and the AMIP runs.  The extent to which this divergence affects our results is addressed in Section 3.4 below.

The diagnostics in this study focus on root-mean-square (RMS) errors for about two dozen atmospheric variables over large space and time scales.  Details of our procedure are given by Covey et al. (2002), from which Table 1 is taken.  This table gives the variables and the sources of observational data from which model errors are calculated.  RMS errors are calculated for combined land and ocean areas except for surface fluxes, for which only ocean areas are considered due to a lack of reliable data over land.  For each variable, we use two different observational datasets to give a crude estimate of observational error.  Model errors are calculated using primary observations.  Secondary observations are processed as additional "models" in some of the results presented below.  Observations of different variables span different times within the 1979-1994 AMIP period.  Comparison with the AMIP simulation of each variable is done by matching the time period for which observations exist.  Comparisons with CSM and PCM control runs are done for the last n years of the 300-year run, where n is the number of years spanned by the observations.  In some cases we also examine the last n - 50 years of the CSM control run to evaluate the robustness of our results to the arbitrary choice of control run time segment.

For each variable, we calculate both a global RMS error over all seasons and a set of RMS errors over broad latitude bands and separate seasons.  To allow comparable presentation of variables with different units, we normalize the errors by the standard deviations of the primary observations.  Each observational standard deviation is calculated over the same space-time region (i.e., the same latitude band and season(s)) as the quantity it normalizes.  We sometimes additionally separate the total RMS error into three components: a "bias" representing the spatial average (and in the all-seasons case, a time average also), a climatological pattern error component, and an interannual component.
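The normalization and the bias/pattern split described above can be illustrated with a minimal sketch. It is not the procedure of Covey et al. (2002), whose equations are not reproduced here; it assumes a single time-mean latitude-longitude field, cosine-of-latitude area weights, and synthetic data, and the function names `weighted_mean` and `rms_components` are hypothetical.

```python
import numpy as np

def weighted_mean(field, weights):
    """Area-weighted mean of a 2-D (lat, lon) field."""
    return np.sum(field * weights) / np.sum(weights)

def rms_components(model, obs, lat_deg):
    """Return (total, bias, pattern) errors, each divided by the
    area-weighted standard deviation of the observations.

    model, obs : arrays of shape (nlat, nlon) on the same grid
    lat_deg    : latitudes in degrees, shape (nlat,)
    """
    # Assumption: cos(latitude) weights stand in for grid-cell area.
    w = np.cos(np.deg2rad(lat_deg))[:, None] * np.ones_like(obs)
    m_bar = weighted_mean(model, w)
    o_bar = weighted_mean(obs, w)
    bias = m_bar - o_bar
    # Centered pattern error: RMS difference after removing each mean.
    pattern = np.sqrt(weighted_mean(((model - m_bar) - (obs - o_bar)) ** 2, w))
    total = np.sqrt(weighted_mean((model - obs) ** 2, w))
    obs_std = np.sqrt(weighted_mean((obs - o_bar) ** 2, w))
    return total / obs_std, bias / obs_std, pattern / obs_std

# Synthetic tropical band (20S - 20N); values are made up for illustration.
lat = np.linspace(-20.0, 20.0, 9)
lon = np.linspace(0.0, 350.0, 36)
obs = 300.0 + 2.0 * np.cos(np.deg2rad(lon))[None, :] - 0.1 * np.abs(lat)[:, None]
model = obs + 0.5 + 0.3 * np.sin(np.deg2rad(2 * lon))[None, :]

total, bias, pattern = rms_components(model, obs, lat)
print(total, bias, pattern)
```

With this definition the squared total error equals the squared bias plus the squared centered pattern error, which is why the two components can be displayed separately. Extending the sketch to the all-seasons case would add a time dimension to the averages.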

3.  Results

3.1  RMS errors

Fig. 1 is a color-coded table or "portrait diagram" showing total RMS errors for the CCM3 AMIP run, the CSM CMIP control run, and the PCM CMIP control run.  The most prominent errors in all three simulations appear in temperature near the tropopause (ta at 200 hPa), especially in the Southern Hemisphere summer.  It should be noted that since the RMS errors are normalized by different factors for each space-time region, Fig. 1 cannot be used to determine the relative sizes of the absolute RMS errors.  For example, the figure leaves open the possibility that temperature near the tropopause has smaller RMS errors than temperature near the surface, but observed temperature near the surface has large standard deviations.  Examination of the numbers used in Fig. 1 reveals larger standard deviations in temperature near the surface; nevertheless, Fig. 2, which displays the situation in a more traditional plot, shows that the CCM3's largest zonal mean temperature errors occur at about the 200 hPa level at high southern latitudes, where the simulation is more than 10°C too cold.  This conclusion is independent of the set of observations (ECMWF or NCEP reanalysis) selected to compute the errors.  The CSM and PCM exhibit a similar temperature error structure to the CCM3, as we discuss below in connection with Fig. 4. 

Errors in wind and temperature are connected by the principle of geostrophic balance.  It might therefore seem surprising that 200 hPa temperature errors stand out in Fig. 1 while zonal wind errors at the same level appear to be small.  Fig. 3 addresses this apparent paradox by locating the CCM3's main wind errors above the 200 hPa level at which RMS errors were calculated for Fig. 1.  The largest errors in zonal wind (again, independent of which observational data set is used) occur at about the 100 hPa level in the Southern Hemisphere.  The fact that wind errors occur above temperature errors is a consequence of the thermal wind relationship, which gives wind in terms of the vertical integral of the horizontal temperature gradient.
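For reference, the thermal wind relationship invoked here can be written in pressure coordinates as below. This is the standard textbook form rather than an equation taken from the analysis itself, and the symbols are introduced only for illustration: $u_g$ is the geostrophic zonal wind, $T$ temperature, $R$ the gas constant for dry air, $f$ the Coriolis parameter, $y$ northward distance, and $p$ pressure, with $p_0$ a lower reference level and $p_1 < p_0$ a level above it.

```latex
\frac{\partial u_g}{\partial \ln p} = \frac{R}{f}\left(\frac{\partial T}{\partial y}\right)_p
\quad\Longrightarrow\quad
u_g(p_1) - u_g(p_0) = \frac{R}{f}\int_{p_0}^{p_1}\left(\frac{\partial T}{\partial y}\right)_p \, d\ln p
```

Because the relation is linear, it applies equally to the errors in the geostrophic wind and in temperature. An error in the meridional temperature gradient centered near 200 hPa continues to accumulate in the integral through the top of the erroneous layer, so the associated wind error can peak at a higher level.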

Other notable errors appearing in Fig. 1 (and in AMIP and CMIP runs in general, in addition to the simulations examined for this study) include total cloudiness (clt), relative humidity in the mid-troposphere (hur at 500 hPa), and meridional wind (va).  Somewhat smaller though still noteworthy are normalized errors in surface sensible heat flux (hfss) and meridional wind stress (tauv).  All of these errors are common to the three simulations and presumably originate in the atmosphere component of the models.  As we have noted, normalization can mask some information in Fig. 1.  In addition to the case of air temperature discussed above, errors in the meridional wind stress appear more important than those in zonal wind stress even though RMS error magnitudes are similar, because a larger observed standard deviation in the zonal component means a larger normalization factor.  An analogous point can be made about ocean latent and sensible heat fluxes.

In contrast with temperature near the tropopause, temperature near the surface (ta at 850 hPa, and tas) exhibits only small errors in comparison with its observed standard deviation.  This result is expected for the CCM3, for which observed sea surface temperatures are supplied as a lower boundary condition, but outside the tropics (20S - 20N) it also holds for the CSM and the PCM.  Within the tropics, however, CSM and PCM errors are greater than the observed standard deviation.  Examination of the tropics in each of the three parts of Fig. 1 shows that the coupled models also have generally increased errors over the full range of variables examined.

One would of course expect that replacement of the "perfect ocean" in the CCM3 AMIP run with either the CSM's or the PCM's ocean and sea ice models would degrade the level of agreement between simulations and observations.  What surprises us is how small the amount of degradation is.  For example, Fig. 4 shows that in the CCM3 simulation, the errors in 200 hPa temperatures at high Southern latitudes evident in Fig. 2 are nearly uniform in longitude; smaller errors with the same sign appear at other latitudes with more evident zonal structure.  The CSM and PCM display error patterns remarkably similar to those of the CCM3.  In the PCM simulation, however, additional small errors of opposite sign (overly warm temperatures) appear at a few locations.

3.2  Climatological error differences and components

To better distinguish results from the coupled model control runs and the CCM3 AMIP run, we present differences of the normalized RMS errors (coupled model minus CCM3) in the middle and lower panels of Figs. 5-6. These figures also split the seasonal cycle climatology portion of the total RMS error shown in Fig. 1 into two components.  (The remaining portion, the interannual component, is shown in Figs. 8-9, discussed below.)  Fig. 5 shows the "bias" or area- and time-average component.  Fig. 6 shows the centered pattern error, which is computed after first removing the bias.  The pattern error may include both space and time dimensions.  The time dimension, however, contributes to the pattern error only in the top (All Seasons) row in Fig. 6.  This row includes amplitude and phase errors in the climatological mean seasonal cycle.  The other rows of Fig. 6 include only spatial pattern errors for the particular seasons shown.  Interannual variations, not included in Fig. 5-6, provide an additional small component of RMS error that will be discussed later.  Details of our component splitting, including the relevant equations, are given by Covey et al. (2002).

With the exception of temperature near the tropopause (ta at 200 hPa), bias errors in the CCM3 simulation are nearly always small compared with observational standard deviations.  Bias errors typically increase in the coupled model simulations, as expected.  Nevertheless, for most variables, seasons and latitude bands, either the increase in normalized error is less than 0.1 (white boxes in Fig. 5(b)-(d)) or the error actually decreases (blue boxes).  This means that the degradation caused by replacing the "perfect ocean" in the CCM3 simulation by interactive ocean and sea ice models is generally less than 10% of the observational standard deviations.  Exceptions to this rule occur more often in the tropics than at higher latitudes, and most notably for geopotential height (zg) at 500 hPa and for temperature near the tropopause.  These variables show increases in bias error that are comparable to observational standard deviations, i.e., normalized bias differences ~ 1.  Comparison of Fig. 5(b) with Fig. 5(c) indicates that these results do not depend on the (arbitrary) choice of time segment from the long coupled model control runs.

Comparison of Fig. 6(a) with Fig. 5(a) and Fig. 1(a) indicates that most of the CCM3 simulation errors are dominated by the pattern component rather than the bias.  The main exception to this rule is temperature near the tropopause, for which bias errors and pattern errors are comparable for most latitude bands and seasons.  Increases in the pattern error exhibited by the coupled models (Figs. 6(b)-(d)) are generally larger than increases in the bias error (Figs. 5(b)-(d)).  These increases, however, are mostly less than 30% and often less than 10% of the magnitude of observational standard deviations, except for the PCM's simulation of the tropics in December - January - February.  There is even a suggestion that the coupled model simulations of sea level pressure (psl) and near-tropopause temperature are improved somewhat over the CCM3 prescribed-SST simulation for most latitude bands and seasons.  Comparison of Fig. 6(b) with Fig. 6(c) indicates that these results do not depend on the choice of time segment from the long coupled model control runs.

Space-time pattern errors could arise in a simulated field that is perfectly correlated with observed patterns but contains erroneous variance magnitudes, or vice versa.  Thus, climatological pattern errors may be subdivided into a component due to lack of correlation and a component due to errors in the magnitude of variability.  Because they are interrelated, the three quantities (standard deviation, correlation, and RMS error) may be plotted in a two-dimensional graphic.  Fig. 7 shows such a plot with a subset of the variables included in the RMS error portraits.  The display, with the ratio of simulated to observed standard deviation as the radial coordinate and the arc-cosine of the correlation coefficient as the angular coordinate, is mandated by the equation relating the three quantities (Taylor, 2001).
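The relation in question can be stated compactly. The notation below is chosen here for illustration and may differ from that of Taylor (2001): $E'$ is the centered pattern RMS error, $\sigma_m$ and $\sigma_o$ are the simulated and observed standard deviations, and $R$ is the pattern correlation between the two fields.

```latex
E'^2 = \sigma_m^2 + \sigma_o^2 - 2\,\sigma_m \sigma_o R
\qquad\Longleftrightarrow\qquad
\left(\frac{E'}{\sigma_o}\right)^2 = 1 + \hat{\sigma}^2 - 2\,\hat{\sigma} R,
\quad \hat{\sigma} = \frac{\sigma_m}{\sigma_o}
```

This has the form of the law of cosines for a triangle with sides $1$ and $\hat{\sigma}$ enclosing the angle $\arccos R$. The normalized centered RMS error is therefore the distance on the diagram between a model's point and the observational reference point at unit radius and zero angle.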

Among the variables included in the diagram, 500 hPa geopotential heights are simulated best, with standard deviations within 20% of observed values and pattern correlations with observations exceeding 0.99.  Near-surface temperatures are simulated nearly as well, with model-observed correlations exceeding 0.97.  It is noteworthy that these statements apply to the coupled model control runs almost as well as they do to the CCM3 AMIP run.  For example, the CCM3 surface air temperature field's correlation with observation is about 0.99 while those of the CSM and PCM are about 0.98.  For precipitation and surface heat fluxes, however, the CCM3 prescribed-SST simulation clearly outperforms the coupled model runs.  Note also for these variables that observational uncertainty (as indicated by differences between the alternate observationally based data sets given in Table 1) is significantly less than the difference between models and observations.  This implies that the model errors are real and are greater for the coupled models.  A different situation applies to total cloudiness, for which model errors are essentially identical for all three models in terms of this diagram.  Finally, consistent with Fig. 6, coupled model simulations of sea level pressure and temperature near the tropopause are improved somewhat over the CCM3 prescribed-SST simulation.  The coupled models' sea level pressures show improved pattern correlations, and the coupled models' tropopause temperatures show improved standard deviations (although near-tropopause temperatures remain poorly simulated for all the models, especially considering the relatively small observational uncertainty in this quantity).

3.3  Interannual errors

We have noted that coupled model control runs represent long-term climate behavior without reference to particular calendar years.  The space-time pattern correlation between interannual components of coupled model control run output and observations is therefore close to zero, and we expect interannual variations in the CSM and PCM to be substantially more erroneous than in the CCM3 AMIP run.  Fig. 8 shows this is indeed so, but it contains some surprises.  Nearly all parts of Fig. 8(a) show interannual RMS errors for the CCM3 AMIP run greater than observational standard deviations, in contrast to Figs. 5(a) and 6(a).  It is important to note that the interannual errors shown here are normalized to the interannual component of observational standard deviations.  (Including other components of observational variability in the normalization would produce essentially invisible errors, because interannual variations are small compared with climatological variations.)  The only normalized interannual errors < 1 in Fig. 8(a) occur in the tropics.  There the CCM3, given observed interannual variations of sea surface temperature, is able to produce interannual variations of atmospheric temperature and pressure whose disagreements with observation are relatively small.  The like statement does not apply to other variables and in particular not to the components of energy balance at the top of the atmosphere (rlut and rsut).
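A minimal sketch of an interannual error normalized in this way is given below. It is an illustration under stated assumptions rather than the procedure used for Fig. 8: it treats a single monthly time series instead of a space-time field, defines the interannual component as the departure from the mean seasonal cycle, and uses synthetic data. The function names are hypothetical.

```python
import numpy as np

def interannual_anomalies(monthly):
    """Remove the mean seasonal cycle from a monthly series of whole years."""
    by_year = np.asarray(monthly, dtype=float).reshape(-1, 12)
    climatology = by_year.mean(axis=0)  # one value per calendar month
    return (by_year - climatology).ravel()

def normalized_interannual_rms(model_monthly, obs_monthly):
    """RMS difference of the anomalies, divided by the standard
    deviation of the observed anomalies (not of the full series)."""
    m_anom = interannual_anomalies(model_monthly)
    o_anom = interannual_anomalies(obs_monthly)
    rms = np.sqrt(np.mean((m_anom - o_anom) ** 2))
    return rms / np.std(o_anom)

# Synthetic data: a large seasonal cycle plus small year-to-year noise.
rng = np.random.default_rng(0)
n_years = 15
seasonal = 10.0 * np.cos(2 * np.pi * np.arange(12) / 12)
obs = np.tile(seasonal, n_years) + rng.normal(0.0, 1.0, 12 * n_years)
# A "control run" with the right seasonal cycle and the right amount of
# variability, but with anomalies unrelated to the observed sequence.
uncorrelated = np.tile(seasonal, n_years) + rng.normal(0.0, 1.0, 12 * n_years)

print(normalized_interannual_rms(obs, obs))
print(normalized_interannual_rms(uncorrelated, obs))
```

In this toy setting a series with realistic variability but no correlation with the observed year-to-year sequence scores a normalized error well above 1 (about the square root of 2 in expectation), even though its climatology is perfect. Dividing instead by the standard deviation of the full series, seasonal cycle included, would shrink the same error to a small fraction.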

Figs. 8(b)-(d) show a general degradation of the simulations when the constraint of observed SST is lifted, most notably in the case of surface air temperature.  Figs. 8(b)-(c), however, indicate a robust apparent improvement in the CSM's simulation of interannual surface latent heat flux (hfls) compared with the CCM3 AMIP run.  This result is explained in Fig. 9, which splits the interannual errors into correlation and variability magnitude components as in Fig. 7.  All coupled model output variables including hfls have approximately zero correlation with the observed year to year sequence of interannual variations.  The CCM3's interannual correlation with observations, while greater than zero, is still fairly low for all variables shown.  For hfls the CCM3 attains a correlation of only 0.2 and also overestimates the magnitude of variability by 30%.  The CSM simulation greatly reduces the variability magnitude of hfls, to about a 20% underestimate.  The net effect is to decrease the RMS error.  This occurs because, when simulations are poorly correlated with observations, RMS error is minimized by reducing model variability to zero.  In short, the "improvement" in the coupled model's interannual hfls implied by RMS error scores simply reflects a limitation of using RMS errors alone to quantify model performance.
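The arithmetic behind this explanation can be checked with a short sketch. It assumes the relation between centered RMS error, standard deviation ratio and correlation given by Taylor (2001), takes the approximate values quoted above at face value (correlation 0.2 and a 30% overestimate for the CCM3; zero correlation and a 20% underestimate for the CSM), and ignores any bias contribution. The function names are hypothetical.

```python
import math

def normalized_centered_rms(sigma_ratio, correlation):
    """Centered RMS error divided by the observed standard deviation:
    sqrt(1 + s**2 - 2*s*R), with s the simulated/observed std ratio."""
    return math.sqrt(1.0 + sigma_ratio ** 2 - 2.0 * sigma_ratio * correlation)

def best_sigma_ratio(correlation):
    """Std ratio that minimizes the error for a fixed correlation.
    Setting the derivative of 1 + s**2 - 2*s*R to zero gives s = R."""
    return correlation

ccm3 = normalized_centered_rms(1.3, 0.2)  # 30% too much variability, R = 0.2
csm = normalized_centered_rms(0.8, 0.0)   # 20% too little variability, R = 0

print(round(ccm3, 2), round(csm, 2))  # 1.47 1.28
print(best_sigma_ratio(0.0))          # 0.0
```

Under these assumptions the uncorrelated CSM field scores a lower normalized error (about 1.28) than the weakly correlated CCM3 field (about 1.47), solely because its variability is smaller. For zero correlation the error is smallest, and equal to 1, when the simulated variability vanishes altogether.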

Fig. 9 also indicates that observational errors are large for interannual variations.  Correlations between alternate observationally based data sets fall between 0.43 (for clt) and 0.95 (for zg at 500 hPa).  Nevertheless, for all variables shown, the separation between the models and primary observations in the diagram is appreciably larger than the separation between secondary and primary observations.  This implies that interannual model errors are significant compared with observational uncertainty.

3.4  Effects of more realistic climate forcing

Finally, we address the question of how the coupled model control runs' lack of time-varying climate forcing affects our results.  As noted above, we expect control runs to diverge from both AMIP results and late 20th century observations due to both variations in natural climate forcing and anthropogenic effects.  This divergence (together with improved agreement with observations when coupled models include the appropriate additional climate forcing) has become increasingly evident in studies detecting recent climate changes and attributing them to various causes (Mitchell et al., 2001).  At the same time, the traditional practice of evaluating coupled ocean-atmosphere models by comparing their control run output with observations collected over the past few decades has continued (McAvaney et al., 2001).

In the present study, any reduction of coupled model errors that would result from incorporating time-evolving climate forcing would only reinforce the conclusion that coupled model simulations agree surprisingly well with AMIP runs.  Nevertheless, we have used available PCM 20th century simulations to investigate this issue.  Fig. 10 compares the PCM control run output with an ensemble of the model's 20th century simulations (runs B06.57-B06.61), which include changes in well mixed greenhouse gases, direct sulfate aerosol effects (both anthropogenic and volcanic), tropospheric and stratospheric ozone, and solar irradiance (Meehl et al., 2003b).  The diagram shows that both the correlation between simulated and observed data and the simulated / observed ratio of standard deviations are generally identical for the control run and the 20th century simulations.  The only exception is tropopause temperature (ta at 200 hPa), for which the 20th century simulations exhibit (counter-intuitively) slight decreases in correlation and increases in standard deviation.  Small differences between the control run and the 20th century simulations are also evident in the "bias" components of the fields (not shown).  Evidently, substituting 20th century simulations for coupled model control runs would not make an appreciable difference in our results.  In a more general sense, Fig. 10 provides justification for comparing coupled ocean-atmosphere model control runs with present day climate observations, provided the comparisons involve large scale, time averaged climate statistics that are not appreciably affected by changing climate forcing.

4.  Conclusions

In their comparison of a CCM3 AMIP run and the CSM coupled control run, Boville and Hurrell (1998) found that "differences between CCM3 and CSM1 are quite small in most measures of the atmospheric circulation", with two major exceptions.  One exception was found near the poles, where differences in sea ice lead to "substantial temperature differences near the surface".  Such differences would not be easily detected in the graphics presented above, which do not focus on high-latitude areas.  The second exception involves tropical precipitation.  Boville and Hurrell found that in the CSM's December - January - February climatology, "the southern ITCZ [intertropical convergence zone] in the Pacific is much stronger than observational estimates . . . while the South Pacific convergence zone (SPCZ) is notably suppressed compared to both the observations and CCM3".  The CSM thus produces a double ITCZ placed symmetrically about the Equator rather than the asymmetric rainfall pattern that is observed.  This problem is also evident in the PCM (Covey, 2002, Fig. 2) and in coupled models generally (Covey et al., 2003, Fig. 4).  In Figs. 6(b)-(d) of the present report, the problem appears as an increased RMS pattern error for precipitation (pr) in the December - January - February season in the tropics (20S - 20N latitude).  These diagrams also indicate even greater coupled model precipitation errors in the March - April - May season, which was not examined by Boville and Hurrell.

Despite these degradations in simulation quality on removing the constraint of observed sea surface temperature and sea ice, our overall conclusion matches that of Boville and Hurrell: "Differences between the simulations are remarkably small".  It may be premature to draw sweeping conclusions about the state of the art of climate simulation from just one family of models.  Still, the results presented above and by Boville and Hurrell provide at least an "existence proof" supporting the IPCC's statement quoted earlier.  We have shown that modern coupled ocean-atmosphere models without flux adjustment can provide simulations of comparable quality to those provided by atmosphere GCMs driven by observed SST and sea ice boundary conditions.  Of course qualifications must be attached to this statement.  Most notably, we have not concerned ourselves with the stability of climate simulations for time periods exceeding 300 years.

Our future work will examine other pairs of AMIP and CMIP simulations as they become available.  These will include the new NCAR Community Climate System Model (CCSM), the successor to the CSM.  Preliminary results comparing the CSM, CCSM, PCM and other models indicate that relevant global-scale feedbacks involved with climate model sensitivity are controlled mostly by the atmosphere component, with ocean, sea ice and land surface processes secondary to first approximation (Meehl et al., 2003a).  This inference is consistent with the results presented above.

Finally, we note that the present work is a first attempt to synthesize a large amount of information comparing coupled (CMIP) and un-coupled (AMIP) climate simulations.  How best to make use of a complex set of observational products for this purpose is an area where more work is clearly needed.


Acknowledgments

We are grateful to the CSM and PCM modelers for contributing their results and to Michael Fiorino and Benjamin D. Santer for useful discussions.  This work was performed under the auspices of the U.S. Department of Energy by University of California Lawrence Livermore National Laboratory under contract No. W-7405-Eng-48.


References

Boville, B. A., Gent, P. R., 1998.  The NCAR Climate System Model, version one.  J. Climate 11, 1115-1130.

Boville, B. A., Hurrell, J. W., 1998.  A comparison of the atmospheric circulations simulated by the CCM3 and CSM1.  J. Climate 11, 1327-1341.

Covey, C., 2002.  Model simulations of present and historical climates.  In: Munn, T. (Ed.) Encyclopedia of Global Environmental Change, Vol. 1, John Wiley and Sons, 114-125.

Covey, C., AchutaRao, K. M., Fiorino, M., Gleckler, P. J., Taylor, K. E., Wehner, M. F., 2002.  Intercomparison of climate data sets as a measure of observational uncertainty.  Report No. 69, Program for Climate Model Diagnosis and Intercomparison, Livermore, CA 94550, USA, 38 pp.

Covey, C., AchutaRao, K. M., Cubasch, U., Jones, P., Lambert, S. J., Mann, M. E., Phillips, T. J., Taylor, K. E., 2003.  An overview of results from the Coupled Model Intercomparison Project (CMIP).  Global and Planetary Change 37, 103-133.

Gates, W. L., and 15 co-authors, 1999.  An overview of the results of the Atmospheric Model Intercomparison Project (AMIP I). Bull. Amer. Meteor. Soc. 80, 29-55.

Gates, W. L., Henderson-Sellers, A., Boer, G. J., Folland, C. K., Kitoh, A., McAvaney, B. J., Semazzi, F., Smith, N., Weaver, A. J., Zeng, Q.-C., 1996.  Climate models-evaluation.  In: Houghton, J. T. (Ed.) Climate Change 1995: The Science of Climate Change, Cambridge Univ. Press, 229-284.

McAvaney, B. J., Covey, C., Joussaume, S., Kattsov, V., Kitoh, A., Ogana, W., Pitman, A. J., Weaver, A. J., Wood, R. A., Zhao, Z.-C., 2001.  Model evaluation.  In: Houghton, J. T. (Ed.) Climate Change 2001: The Scientific Basis, Cambridge Univ. Press, 471-523.

Meehl, G. A., Boer, G. J., Covey, C., Latif, M., Stouffer, R. J., 2000.  The Coupled Model Intercomparison Project (CMIP).  Bull. Amer. Meteor. Soc. 81, 313-318.

Meehl, G. A., Washington, W. M., Arblaster, J. M., 2003a.  Factors affecting climate sensitivity in global coupled climate models.  Paper presented at the American Meteorological Society 83rd Annual Meeting, Long Beach, CA.

Meehl, G. A., Washington, W. M., Wigley, T. M. L., Arblaster, J. M., Dai, A., 2003b.  Solar and greenhouse gas forcing and climate response in the twentieth century.  J. Climate 16, 426-444.

Mitchell, J. F. B., Karoly, D. J., Hegerl, G. C., Zwiers, F. W., Allen, M. R., Marengo, M., 2001.  Detection of climate change and attribution of causes.  In: Houghton, J. T. (Ed.) Climate Change 2001: The Scientific Basis, Cambridge Univ. Press, 695-738.

Taylor, K. E., 2001.  Summarizing multiple aspects of model performance in a single diagram.  J. Geophys. Res. 106, 7183-7192.

Washington, W. M., Weatherly, J. W., Meehl, G. A., Semtner, Jr., A. J., Bettge, T. W., Craig, A. P., Strand, Jr., W. G., Arblaster, J., Wayland, V. B., James, R., Zhang, Y., 2000.  Parallel Climate Model (PCM) control and transient simulations.  Climate Dynamics 16, 755-774.



Variable                                      Primary Observations  Secondary Observations  Time Period
--------------------------------------------  --------------------  ----------------------  -----------
total column cloud amount [percent]                                 NCEP reanalysis         1984 - 1990
surface latent heat flux [W m^-2]             NCEP reanalysis       ECMWF reanalysis        1979 - 1993
surface sensible heat flux [W m^-2]           NCEP reanalysis       ECMWF reanalysis        1979 - 1993
relative humidity [percent]                   NCEP reanalysis       ECMWF reanalysis        1979 - 1993
specific humidity [g kg^-1]                   NCEP reanalysis       ECMWF reanalysis        1979 - 1993
precipitation [mm d^-1]                                             NCEP reanalysis         1979 - 1993
sea level pressure [hPa]                      NCEP reanalysis       ECMWF reanalysis        1979 - 1993
outgoing longwave at TOA [W m^-2]                                                           1986 - 1988
clear sky outgoing longwave at TOA [W m^-2]                         NCEP reanalysis         1986 - 1988
upward solar at TOA [W m^-2]                                        NCEP reanalysis         1986 - 1988
air temperature [K]                           NCEP reanalysis       ECMWF reanalysis        1979 - 1993
surface air temperature [K]                   IPCC / Jones          NCEP reanalysis         1979 - 1993
east-west surface wind stress [N m^-2]        NCEP reanalysis       ECMWF reanalysis        1979 - 1993
north-south surface wind stress [N m^-2]      NCEP reanalysis       ECMWF reanalysis        1979 - 1993
east-west wind [m s^-1]                       NCEP reanalysis       ECMWF reanalysis        1979 - 1993
north-south wind [m s^-1]                     NCEP reanalysis       ECMWF reanalysis        1979 - 1993
geopotential height [m]                       NCEP reanalysis       ECMWF reanalysis        1979 - 1993

Table 1: Atmospheric variables (see Covey et al., 2002, for references to observations)


Fig. 1.   Normalized root-mean-square errors in atmospheric quantities simulated by (a) the NCAR Community Climate Model, version 3, run with sea surface temperature and sea ice prescribed to match the monthly mean of observations, (b) the same atmospheric model coupled with interactive ocean and sea ice models to form the NCAR Climate System Model, and (c) the same atmospheric model coupled with a different set of interactive ocean and sea ice models to form the Parallel Climate Model.  Column labels designate the variables examined (see Table 1 for definitions).  Row labels designate latitude bands and seasons (December - January - February, March - April - May, June - July - August, September - October - November).

Fig. 2.   Climatological mean January temperature in the CCM3 simulation and its errors, averaged over longitude, as a function of altitude (pressure in hPa) and latitude.  The top panel gives the temperature.  The lower panels give simulation errors evaluated using two different sets of observationally based "reanalysis" data, from the U.S. National Centers for Environmental Prediction (middle panel) and from the European Centre for Medium-Range Weather Forecasts (bottom panel).

Fig. 3.   As in Fig. 2 for zonal (eastward) wind.

Fig. 4.   Errors (simulation minus NCEP reanalysis data) in climatological mean January temperature at the 200 hPa pressure level for the CCM3 (top panel), the CSM (middle panel), and the PCM (bottom panel).

Fig. 5.   The area average (or "bias") component of the normalized RMS errors shown in Fig. 1: (a) bias component for the CCM3 simulation, (b) difference between the end of the CSM simulation and the CCM3, (c) difference between a period near the end of the CSM simulation and the CCM3, (d) difference between the end of the PCM simulation and the CCM3.

Fig. 6.   As in Fig. 5 for the climatological mean pattern error component of normalized RMS error.

Fig. 7.   The components of climatological mean pattern error: simulated / observed ratio of standard deviations (radial coordinate) and the correlation of simulation and observation (angular coordinate).  Green circles concentric with the "observed" point indicate isolines of RMS model error.  Different symbols identify different variables according to the figure legend at upper right (see Table 1 for definitions).  Different colors identify different models according to the figure legend at lower left.  The "models" include secondary observations (see Table 1) compared with primary observations for a crude measure of observational uncertainty.

Fig. 8.   As in Fig. 5 for the interannual component of normalized RMS error.

Fig. 9.   As in Fig. 7 for interannual error.

Fig. 10.   The components of climatological mean pattern error for the PCM control run and 20th century climate simulations.  Control run error components, shown in black, are identical to the results depicted by red symbols in Fig. 7.  Colored symbols show error components for four 20th century simulations including time-evolving climate forcing from greenhouse gases, aerosols, ozone and the Sun.  The four 20th century simulations differ only in their initial conditions.
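The error components named in the captions of Figs. 5-7 can be illustrated with a minimal sketch.  It assumes the standard decomposition behind the diagram of Taylor (2001): total mean-square error is the squared area-average bias plus the squared centered (pattern) error, and the pattern error follows from the two standard deviations and the correlation.  The function name error_components, the cos(latitude) weighting, and the synthetic input fields are illustrative assumptions, not the processing actually used for the figures.

```python
import numpy as np

def error_components(model, obs, weights=None):
    """Split model-minus-observation error into bias and pattern parts.

    model, obs : 1-D arrays holding the same field on the same grid points.
    weights    : optional area weights (e.g. cos(latitude)); uniform if None.
    """
    model = np.asarray(model, dtype=float)
    obs = np.asarray(obs, dtype=float)
    w = np.ones_like(obs) if weights is None else np.asarray(weights, dtype=float)
    w = w / w.sum()                       # normalize weights to sum to 1

    bias = np.sum(w * (model - obs))      # area-average ("bias") error
    m_anom = model - np.sum(w * model)    # deviations from the area mean
    o_anom = obs - np.sum(w * obs)

    sigma_m = np.sqrt(np.sum(w * m_anom**2))
    sigma_o = np.sqrt(np.sum(w * o_anom**2))
    corr = np.sum(w * m_anom * o_anom) / (sigma_m * sigma_o)

    pattern_rms = np.sqrt(np.sum(w * (m_anom - o_anom)**2))  # centered RMS
    total_rms = np.sqrt(np.sum(w * (model - obs)**2))

    return {
        "bias": bias,
        "pattern_rms": pattern_rms,
        "total_rms": total_rms,
        "sigma_m": sigma_m,
        "sigma_o": sigma_o,
        "std_ratio": sigma_m / sigma_o,   # radial coordinate of the diagram
        "corr": corr,                     # angular coordinate of the diagram
    }

# Synthetic zonal-mean example (made-up numbers, not model output).
lat = np.linspace(-87.5, 87.5, 36)
weights = np.cos(np.deg2rad(lat))
obs = 288.0 - 40.0 * np.sin(np.deg2rad(lat))**2
model = obs + 1.5 + 3.0 * np.sin(np.deg2rad(2.0 * lat))

stats = error_components(model, obs, weights)
for name, value in stats.items():
    print(f"{name:12s} {value: .4f}")
```

Dividing pattern_rms by sigma_o gives a dimensionless error of the kind that can be read off a diagram whose radial coordinate is the simulated / observed ratio of standard deviations; whether the normalization in Fig. 1 is exactly this one is not stated here, so treat that step as an assumption.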

[1] Corresponding author.  Tel.: +1-925-422-1828; fax: +1-925-422-7675.

E-mail address:

[2] Present address: National Energy Research Scientific Computing Center, Lawrence Berkeley National Laboratory, Berkeley, CA, USA