Model predicted low-level cloud parameters. Part I: Comparison with observations from the BALTEX Bridge Campaigns


The BALTEX Bridge Campaigns (BBC), which were held in the Netherlands in 2001 and 2003 around the Cabauw Experimental Site for Atmospheric Research (CESAR), have provided detailed information on clouds. This paper is an illustration of how these measurements can be used to investigate whether 'state-of-the-art' atmospheric models are capable of adequately representing clouds. Here, we focus on shallow low-level clouds with a substantial amount of liquid water. In situ, ground-based and satellite remote sensing measurements were compared with the output of three non-hydrostatic regional models (Lokal-Modell, LM; Méso-NH; fifth-generation Mesoscale Model, MM5) and two hydrostatic regional climate models (Regional Atmospheric Climate Model version 2, RACMO2; Rossby Centre Atmospheric Model, RCA). For the two selected days, Méso-NH and MM5 reproduce the measured vertical extent of the shallow clouds, but the liquid water content of the clouds is generally overestimated. In LM and the climate models the inversion is too weak and located at a level too close to the surface, resulting in an overestimation of the vertical extent of the clouds. A sensitivity integration with RACMO2 shows that the correspondence between model output and measurements can be improved by a doubling of the vertical resolution; this induces an increase in the modelled inversion strength and cloud top pressure. LM and Méso-NH underestimate the lifetime of clouds. A comparison between model output and cloud cover derived from the Moderate Resolution Imaging Spectrometer (MODIS) indicates that this deficiency is not due to advection of too small cloud systems; it is rather due to an overestimation of the variability in the vertical velocity. All models overestimate the specific humidity near the surface and underestimate it at higher atmospheric levels, indicating that the models underestimate the mixing of moisture in the boundary layer. This deficiency is slightly reduced by inclusion of parameterised shallow convection in the non-hydrostatic models, which enhances the mixing of heat and moisture in the boundary layer.

Introduction
In order to correctly calculate the radiative fluxes in atmospheric models, a good representation of clouds is essential. This is relevant for Numerical Weather Prediction (NWP), but even more necessary for the estimate of the climate sensitivity to natural or anthropogenic changes (Houghton et al., 2001), which relies mostly on computed scenarios using atmospheric models. Moreover, clouds are of importance for an adequate representation of the hydrological cycle. Precipitation and cloud liquid water are coupled and models are found incapable of sustaining high amounts of liquid water without producing precipitation (van Meijgaard and Crewell, 2005). Given the relatively frequent occurrence of precipitation, typical for the European climate, a careful treatment of clouds and their liquid water content is essential for adequate predictions of weather and climate.
Clouds are highly variable in time and space, which hampers their treatment in atmospheric models. Attempts have been made to improve the representation of clouds by decreasing the model horizontal grid spacing (Δx). However, to produce accurate fields of precipitation and clouds, improving the resolution needs to go together with improving the assimilation techniques and the scale-dependent parameterisation of physical processes (e.g. Stoelinga et al., 2003). Currently, high-resolution models with a value for Δx of about 3 km are under development for NWP at several meteorological institutes (e.g. the German Weather Service, Météo-France and the UK Met Office). Models operating at this resolution are sometimes referred to as Cloud Resolving Models (CRM), although it is clear that their grid spacing is too coarse to resolve the abundance of small clouds. In present-day regional climate simulations, Δx is typically of the order of 20 km, at which convection must still be parameterised. Given these recent advances, evaluation of clouds in models with these typical values for Δx is of importance.
CRM, Large Eddy Simulations (LES) and Single Column Models (SCM) are important tools to develop new parameterisations of cloud-related processes for large-scale models, one of the objectives of the Global Energy and Water-cycle Experiment (GEWEX) Cloud System Studies (GCSS) (Randall et al., 2000). In such studies surface turbulent fluxes, radiative heating profiles and moisture and heat tendencies are often prescribed (e.g. Xu et al., 2002, 2005; Stevens et al., 2005). Therefore, feedbacks between, for example, the land and the atmosphere or the clouds and radiation are ignored. In addition, it is not straightforward to initialise the idealised integrations. For instance, Xu et al. (2002) discussed that the causes for the model deficiencies they identified are related to oversimplification in the initialisation procedure and not due to model shortcomings. Complementary to these idealised CRM and LES simulations, studies taking into account the complicated interactions described above are of relevance. In addition, it is important that models are tested in the same framework in which they are operated at the meteorological or research institutes, which is never the idealised mode. Our study addresses these points.
Only sparse information on the vertical distribution of hydrometeors has been available from measurements for the modelling community. The cloud and radiation interaction depends on several cloud characteristics, namely the 3-dimensional cloud boundaries and water content. To derive this information, sites with new sophisticated active and passive ground-based remote sensing instruments have emerged, like the European meteorological sites (Chilbolton, UK; Palaiseau, France; Lindenberg, Germany; Cabauw, The Netherlands) or the Atmospheric Radiation Measurement (ARM) experiment (Stokes and Schwartz, 1994). Data from the ARM Clouds and Radiation Testbed (CART) domain, located in the Southern Great Plains in the United States of America, have been used to evaluate different types of models (e.g. Williamson et al., 2005). From CART data, Sengupta et al. (2003) demonstrated the importance of accurate Liquid Water Path (LWP) retrieval for calculating radiative fluxes that correspond closely to the observed fluxes. The information on the drop size as given by the effective radius is of minor importance, at least for the warm boundary-layer clouds considered in their study.
The Baltic Sea Experiment (BALTEX) Bridge Campaigns (BBC; Crewell et al., 2003, 2004) were held in The Netherlands, around the Cabauw Experimental Site for Atmospheric Research (CESAR) during August and September 2001 and May 2003. The BBC data set consists of ground-based and airborne in situ and ground- and satellite-based remote sensing measurements. The latter can offer high spatial resolution and high spatial coverage (e.g. the Moderate Resolution Imaging Spectrometer; MODIS), whereas ground-based remote sensing offers high temporal resolution.
Two days with shallow low-level water clouds were selected from the BBC data set for the World Meteorological Organization (WMO) cloud-modelling workshop, held in Hamburg (Germany) in July 2004. For these two cases, we have investigated the performance of five atmospheric models, namely three non-hydrostatic models using Δx = 3 km (Lokal-Modell, LM; Méso-NH; fifth-generation Mesoscale Model, MM5) and two regional climate models using Δx = 18 km (Regional Atmospheric Climate Model version 2, RACMO2; Rossby Centre Atmospheric Model, RCA). This paper is the first paper in a sequence of two and introduces the WMO-BBC cases.
In addition, both spatial information from MODIS and ground-based remote sensing and in situ measurements from CESAR are used to evaluate the models. The second paper (Schröder et al., 2006) is devoted to the evaluation of several cloud parameters derived from satellite remote sensing by introducing new quantitative measures. In addition to cloud cover, cloud top pressure and cloud optical thickness are also considered. These two papers are an illustration of how extensive data sets like BBC can be used for model evaluation.
Previously, information from single remote sensing instruments has been used for model evaluation, e.g. the LWP derived from radiances measured by microwave radiometers. It was found that differences between models are much larger for LWP than for other variables like temperature and humidity (Xu et al., 2002). Zhu et al. (2005) found that in SCM simulations for the nocturnal stratocumulus-topped marine boundary layer, modelled LWP varies by a factor of 10. Also van Meijgaard and Crewell (2005) found large differences in LWP between various models. Curry et al. (2000) found a large underestimation of LWP for Arctic clouds in the European Centre for Medium-range Weather Forecast (ECMWF) model and in the SCM versions of the Colorado State University General Circulation Model and the Arctic Regional Climate System Model. Radar measurements in combination with a lidar-ceilometer have been used frequently to study the vertical structure of clouds, but with these instruments information on the quantity of cloud water is missing. Willén et al. (2005) found that in the RCA model the cloud base height is underestimated for lower clouds and that the occurrence of clouds from 400 m to 2 km height is overestimated. Guichard et al. (2003) identified a very good correspondence between model clouds and radar observations, especially for deeper long-lasting systems in Oklahoma. Shallow clouds were underestimated in their study. Also Jakob et al. (2004) found deficiencies in the representation of shallow cumulus in the ECMWF model.
By combining information from different remote sensing instruments, vertical profiles of meteorological variables can be derived. Here we apply the Integrated Profiling Technique (IPT; Löhnert et al., 2004), which gives optimal estimates of profiles of Liquid Water Content (LWC), specific humidity (q) and temperature (T). The IPT uses information from various sources (cloud radar, microwave radiometer, lidar-ceilometer, radiosonde, ground-level measurements) and provides error estimates for the derived quantities. It is the first time that the IPT method is used for the purpose of model evaluation.
Two model deficiencies previously identified are studied by sensitivity integrations. It is clear that a grid spacing of 3 km is too coarse to resolve the shallow convective clouds. Therefore, we have tested how the implementation of a shallow convection scheme in the non-hydrostatic models affects the representation of low-level clouds. In addition, we have tested the sensitivity of the representation of clouds to the vertical resolution used in the climate models.
The paper is structured as follows: Section 2 describes the CESAR measurements.Section 3 gives an overview of the five models that are used in this study.In Section 4, a description of the cases is given and the models are evaluated by discussing (i) the synoptic situation and the evolution of the wind vector at CESAR, (ii) the spatial distribution of clouds over the Netherlands at MODIS overpass times, (iii) the vertical profiles of the LWC as derived from IPT, (iv) LWP from the microwave radiometer, (v) precipitation from rain gauges and radar, (vi) Integrated Water Vapour (IWV) from the microwave radiometer and (vii) the temperature and humidity profiles from radiosonde ascents.In Section 5, the sensitivity integrations are discussed and in the last section, the results are summarized.

Observational data
Within the framework of BALTEX, two intensive cloud measurement campaigns were held in The Netherlands, namely BBC (August and September 2001) and BBC2 (May 2003).The BBC campaigns were performed around the central experimental facility CESAR at Cabauw (51°58′N, 4°55′E) in The Netherlands.This site is located in a flat and rural region about 50 km south of Amsterdam and is the central measurement facility of the KNMI.Advanced remote sensing instruments were operated at CESAR.Below, a description is given of the measurements that are used in this paper.For a complete overview of all measurements that were made during BBC, we refer to Crewell et al. (2004).
For basic cloud structure measurements the vertically pointing KNMI radar, operating at 35 GHz, was used. This instrument has the capability to identify the vertical structure of clouds in the atmosphere, including the detection of cloud top and number of cloud layers. There is also some limited information on LWC available. However, since the radar reflectivity is proportional to the sixth power of the droplet diameter, the radar backscatter signal is dominated by the reflectance of large droplets. It is therefore difficult to quantify the LWC in clouds that contain small droplets. In addition, reflectance from non-meteorological targets (e.g. insects, plant debris) or drizzle with negligible LWC below the cloud base can obscure the signal.
Backscatter profiles from lidars are proportional to the square of the droplet diameter and are therefore more sensitive to smaller droplets than the radar. In addition, these instruments are less sensitive to drizzle or insects than the radar. The lidar-ceilometer is suitable to detect the cloud base with an accuracy of 30 m. The backscatter profiles from a near-infrared standard Vaisala CT-75K lidar-ceilometer with a temporal resolution of 15 s were used for determining the cloud base.
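The contrast between the sixth-power (radar) and second-power (lidar) weighting can be made concrete with a small sketch. The droplet population below is hypothetical and chosen only for round numbers; the function names are illustrative and not part of any instrument software. The third moment is included because liquid water content scales with droplet volume.

```python
def moment(diameters_um, counts, power):
    """Sum of count * diameter**power over a droplet size distribution."""
    return sum(n * d ** power for d, n in zip(diameters_um, counts))


def large_droplet_share(diameters_um, counts, power, threshold_um):
    """Fraction of the given moment contributed by droplets >= threshold_um."""
    total = moment(diameters_um, counts, power)
    large = sum(
        n * d ** power
        for d, n in zip(diameters_um, counts)
        if d >= threshold_um
    )
    return large / total


# Hypothetical population (assumption): 1000 cloud droplets of 10 um
# and a single drizzle-sized drop of 100 um.
diameters = [10.0, 100.0]
counts = [1000, 1]

radar_share = large_droplet_share(diameters, counts, 6, 100.0)  # ~ D**6
lidar_share = large_droplet_share(diameters, counts, 2, 100.0)  # ~ D**2
water_share = large_droplet_share(diameters, counts, 3, 100.0)  # ~ D**3
```

In this made-up case the single large drop carries half of the liquid water, almost all of the sixth moment and less than a tenth of the second moment, which is the sense in which the radar signal is dominated by large droplets while the lidar remains sensitive to small ones.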
Emission of microwave radiation by clouds depends on LWC and temperature. Therefore, microwave radiometers are one of the most accurate methods to derive the vertical integral of LWC, namely the Liquid Water Path (LWP) (Westwater, 1978). By comparing atmospheric brightness temperatures at two frequencies, LWP and Integrated Water Vapour (IWV) can be derived simultaneously from the microwave radiometer measurements. In this study, a 'state-of-the-art' microwave profiler with 22 channels was used (the Microwave Radiometer for Cloud Cartography MICCY; Crewell et al., 2001), with a temporal resolution of 1 s. Due to channel optimization, the LWP accuracy (15 g m⁻²) is superior to the two-channel approach (Crewell and Löhnert, 2003). During rainfall, the radiometer signal is inaccurate due to wetting of the antenna or radome. Therefore, periods with precipitation are excluded from our analysis.
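Since LWP is defined here as the vertical integral of LWC, a model or IPT profile of LWC can be reduced to an LWP for comparison with the radiometer. The following is a minimal sketch using trapezoidal integration; the profile values are hypothetical and the function name is illustrative.

```python
def liquid_water_path(heights_m, lwc_g_m3):
    """Trapezoidal vertical integral of LWC (g m^-3) over height (m).

    Returns the liquid water path in g m^-2.
    """
    if len(heights_m) != len(lwc_g_m3):
        raise ValueError("heights and LWC must have the same length")
    lwp = 0.0
    for i in range(1, len(heights_m)):
        dz = heights_m[i] - heights_m[i - 1]
        lwp += 0.5 * (lwc_g_m3[i] + lwc_g_m3[i - 1]) * dz
    return lwp


# Hypothetical profile (assumption): a 400 m thick cloud with LWC
# increasing with height up to just below cloud top.
heights = [600.0, 700.0, 800.0, 900.0, 1000.0]
lwc = [0.0, 0.1, 0.2, 0.3, 0.0]

lwp = liquid_water_path(heights, lwc)  # about 60 g m^-2
```

For this made-up profile the result is about 60 g m⁻², four times the 15 g m⁻² retrieval accuracy quoted above.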
Profiles of wind speed, wind direction, temperature and humidity are available from the CESAR meteorological tower (213 m) and from radiosondes, which were launched at CESAR four times a day (at approximately 3 UTC, 9 UTC, 15 UTC and 21 UTC). For BBC, a RS80 radiosonde was operated at CESAR, using a radiotheodolite system for measuring the components of the wind vector with an accuracy of 1 m s⁻¹. For BBC2, a RS90 system with LORAN-C wind finding was used, with an accuracy of 1 m s⁻¹ (Nash, 1994). The accuracy of the temperature measurements is of the order of 0.5 °C. The accuracy of the humidity measurements is estimated to be of the order of 5% (Turner et al., 2003), but larger errors have been reported during cold conditions, especially for the RS80 system.
Measurements of radiances by the Moderate Resolution Imaging Spectrometer (MODIS) onboard the Terra satellite are used for retrieval of the cloud cover (b).For a description of how different cloud parameters are derived using MODIS, we refer to Schröder et al. (2006).
Instantaneous model profiles, available every 15 min for the grid box closest to the CESAR site, are compared with the measurements. The model values represent the area of a model grid box, whereas measurements were made at a specific site. We use a method that has been used in many other studies (e.g. Willén et al., 2005; van Meijgaard and Crewell, 2005); we leave the model predictions untouched and average the observations in time. The averaging time is chosen to match the model grid size based on the cloud field propagation speed. A fixed wind speed value of 10 m s⁻¹ has been applied, which is observed to be a typical value for Cabauw. By using this wind speed, a time scale for averaging of 5 min for the non-hydrostatic models (Δx = 3 km) and of 30 min for the climate models (Δx = 18 km) was obtained. For the non-hydrostatic models, this timescale is shorter than the temporal resolution at which the model output was made available. Therefore we average the measurements over a time interval of 15 min (30 min) for comparison with the non-hydrostatic (climate) models throughout the paper.
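The averaging times quoted above follow from dividing the grid spacing by the assumed propagation speed. A minimal sketch of that arithmetic is given below; the function names are illustrative, and taking the larger of the advective time scale and the 15 min output interval is our reading of the procedure described in the text.

```python
def advective_averaging_time_min(grid_spacing_km, wind_speed_m_s=10.0):
    """Time (min) for a cloud field moving at wind_speed_m_s to cross one grid box."""
    return grid_spacing_km * 1000.0 / wind_speed_m_s / 60.0


def applied_averaging_time_min(grid_spacing_km, output_interval_min=15.0,
                               wind_speed_m_s=10.0):
    """Averaging time actually used: never shorter than the model output interval."""
    advective = advective_averaging_time_min(grid_spacing_km, wind_speed_m_s)
    return max(advective, output_interval_min)


# Grid spacings taken from the text: 3 km (non-hydrostatic), 18 km (climate).
t_nonhydrostatic = advective_averaging_time_min(3.0)   # 5 min
t_climate = advective_averaging_time_min(18.0)         # 30 min

used_nonhydrostatic = applied_averaging_time_min(3.0)  # 15 min
used_climate = applied_averaging_time_min(18.0)        # 30 min
```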

Description of the models
Integrations were performed with five different regional models. The Lokal-Modell (LM) was developed by the Deutscher Wetterdienst (DWD) for operational weather forecasting in Germany (Doms and Schättler, 1999). The Méso-NH was jointly developed by Météo-France and the Centre National de la Recherche Scientifique (CNRS) for research purposes (Lafore et al., 1998). At GKSS Research Centre, the MM5 is used as a meteorological pre-processor for a chemistry transport model (Community Multiscale Air Quality, CMAQ). MM5 was developed by Pennsylvania State University and National Center for Atmospheric Research (NCAR). It was first documented by Anthes and Warner (1978) but has undergone many changes since. The model can be operated in many different configurations. The Royal Netherlands Meteorological Institute (KNMI) developed the Regional Atmospheric Climate Model version 2 (RACMO2; Lenderink et al., 2003; de Bruyn and van Meijgaard, 2005) and the Swedish Meteorological and Hydrological Institute (SMHI) developed the Rossby Centre Atmospheric Model (RCA; Jones et al., 2004) for regional climate modelling. These latter two models are hydrostatic models. A summary of the characteristics of the models can be found in Table 1.

Table 1. Given are the horizontal grid spacing (Δx), time step (Δt), time step for calculation of the full radiative fluxes (Δt_radiation), number of points in the horizontal (N_horizontal), number of vertical layers (N_vertical), initial conditions from which the model is started, fields used to drive the model from the lateral boundaries, use of the hydrostatic assumption in the model, method which is used to split or eliminate acoustic waves (split method), information on how the integration is nested, number of points at the lateral boundary that are used for relaxation to the host model (N_boundary zone), prognostic hydrometeors in the model (liquid water (LW), ice water (IW), rain (RA), snow (SN), and graupel (GR)) and information on the parameterisation of deep convection, shallow convection, stratiform processes and turbulence (Lmix is the physical mixing length). For the non-hydrostatic models, the scheme used in the run with parameterised shallow convection scheme (SHC) is given. Note that in the control integrations, shallow convection is assumed to be resolved. The autoconversion (S_au) from cloud water to rain is parameterised using a time constant defining the speed of the conversion mechanism (k) and a threshold above which autoconversion can take place (q_crit). In RCA, S_au also depends on the presence of condensation nuclei or aerosols and on the height, so a value for k and q_crit cannot be given. To make a direct comparison possible between the models, the values for k and q_crit do not include the collection of cloud droplets by raindrops falling through the cloud and the Bergeron-Findeisen mechanism.

Surviving table entries (column assignment not recoverable). Turbulence: Hong and Pan (1996); Based on relations by Businger et al. (1971), Mellor and Yamada (1982); First-order closure (Louis, 1979) plus many modifications; Cuxart et al. (2000). k: 10³ s; 10³ s; 10⁴ s; 5 × 10³ s. q_crit: 0.5 g kg⁻¹; 0.5 g kg⁻¹; 0 g kg⁻¹; 0.5 g kg⁻¹.
Model resolution and domain size are tailored to the purpose for which the models are used at the different institutes. The LM, Méso-NH and MM5 (non-hydrostatic models) use a horizontal grid spacing (Δx) of 2.8 km, 2.5 km and 3 km, respectively. The current operational version of LM uses Δx = 7 km, so our study will be one of the first tests with the finer resolution that is planned to become operational in 2 years. The prognostic variables in LM are directly forced from the lateral boundaries by LM (Δx = 7 km) operational analyses. Méso-NH uses three nested models in two-way interaction and MM5 uses four nested models in one-way interaction. At the lateral boundaries of the outermost domain, the model prognostic variables are forced from ECMWF analyses in Méso-NH and from National Center for Environmental Prediction (NCEP) global analysis in MM5. The high-resolution inner grid of all three models covers an area of about (400 km)² and is centred over CESAR.
The two climate models (RACMO2 and RCA) are set up in an identical way. Their model domain (2400 × 2500 km²) covers a large part of Europe and the Atlantic Ocean, using a grid spacing of 19 km. Forcing from lateral boundaries and initial conditions are from ECMWF analyses for RACMO2 and from ERA40 at 2° resolution for RCA. RCA soil moisture is initialised from climatological fields.
The vertical resolution differs largely among the models, especially above 500m (Fig. 1a).In the layer where the shallow clouds exist (between 500 and 2000 m), Méso-NH and MM5 have the highest resolution.Generally (but not everywhere), the nonhydrostatic models have a higher resolution than the climate models.All five models are grid point models with a terrain following vertical coordinate system near the surface and pressure or height levels higher in the atmosphere.
All three non-hydrostatic models use a three-level Eulerian time differencing scheme; the well-known leapfrog method.RACMO2 (RCA) uses a two time level semi-Lagrangian semi-implicit finite difference scheme with 4th (6th) order horizontal diffusion.RACMO2 was developed by porting the physics package of the ECMWF-NWP, release CY23R4, into the forecast component HIRLAM NWP, version 5.0.6.CY23R4 also served as the basis for the ERA40 project.RCA, which is based on the NWP model HIRLAM, uses the same dynamical core as RACMO2.
For the study of low level clouds, the turbulence and cloud parameterisations are particularly relevant.All models use Monin-Obukhov similarity theory, in which profile stability functions are used to relate the surface fluxes to gradients in wind, dry static energy and specific humidity.Most models (LM, Méso-NH, RACMO2 and RCA) base their scheme on Louis (1979) but MM5 is based on Deardorff (1972).The models differ among others in the coefficients that they use in the profile stability functions and model characteristics like the value of the roughness length.
All models use one-dimensional turbulence: Only the vertical turbulent fluxes are taken into account and horizontal turbulent fluxes are neglected.The firstorder closure approach, in which the relation between fluxes and gradients is defined by vertical eddy diffusion coefficients, is used by MM5 and RACMO2.In both models, a non-local term is added to the local gradient that incorporates the contribution of the large-scale eddies to the total flux (Troen and Mahrt, 1986).The non-local term is only included in calculating the flux in the mixed layer.The remaining three models use a one-and-a-half order closure in which a prognostic equation for turbulent kinetic energy together with a diagnostic length scale is used to calculate the vertical eddy diffusion coefficients.The models differ in the formulations for the mixing length; the Méso-NH and RCA formulation is based on Bougeault and Lacarrère (1989) and the LM formulation is based on Blackadar (1962).None of the turbulence schemes described above aim to represent mixing from unresolved clouds.
In the non-hydrostatic models, deep convection is assumed to be resolved.In the control integrations, no parameterisation of shallow clouds is used either, but in Section 5.1, the sensitivity to the inclusion of parameterised shallow convection is tested.The climate models use a convection parameterisation in which clouds are represented by a single pair of entraining/detraining plumes which describe updraught and downdraught processes.RACMO2 is based on Tiedtke (1989) and RCA uses the Kain and Fritsch (1993) convection scheme.This latter scheme assumes that meso-scale circulations are resolved in the model and that only cloud-scale fluxes have to be parameterised.In RCA, the cloud fractions are determined diagnostically from the relative humidity, vertical motion, static stability and convective properties according to Slingo (1987).
The non-hydrostatic models on one hand and the climate models on the other hand differ in their representation of the stratiform cloud processes.In the non-hydrostatic models, most dynamical processes are assumed to be explicitly resolved, whereas cloud microphysical processes are parameterised.In the climate models, all cloud-related processes are considered sub-grid scale and their contributions to the resolved scale are computed from parameterisations.
Since we focus on boundary-layer clouds, we will discuss the warm cloud and rain processes here.
The non-hydrostatic models use an all-or-nothing scheme: A grid cell is considered to be either cloudy (presence of condensed water) or cloud free (no presence of condensed water).The saturation adjustment technique is used: If a grid box becomes supersaturated during a time step, the temperature and the concentration of the water vapour and cloud water are isobarically adjusted to a saturated state, taking the release of latent heat into account.Subsequently, autoconversion from cloud water to rain and accretion of cloud water by raindrops can take place.The models assume a Marshall and Palmer (1948) distribution of raindrops with size and an empirical formula for the terminal fall velocity of raindrops.Although the formalism in all the models is similar (based on Kessler, 1969), different coefficients are used in several equations, for example for the autoconversion from cloud water to rain (Table 1).The models use prognostic variables for five hydrometeors, namely cloud water, cloud ice, rain, snow and graupel (except for LM which does not have a graupel category).
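A threshold-type autoconversion term of the kind referred to above can be sketched as follows. This is a minimal illustration and not the formulation of any of the five models: it assumes that the conversion rate is the cloud water excess above q_crit divided by a time constant, and the numerical values in the example are hypothetical round numbers.

```python
def autoconversion_rate(q_cloud, q_crit, time_constant_s):
    """Threshold-type autoconversion rate from cloud water to rain.

    q_cloud, q_crit: cloud water mixing ratio and threshold (kg kg^-1).
    time_constant_s: conversion time scale (s); assumed form, see lead-in.
    Returns the rate in kg kg^-1 s^-1; zero below the threshold.
    """
    if time_constant_s <= 0.0:
        raise ValueError("time constant must be positive")
    return max(q_cloud - q_crit, 0.0) / time_constant_s


# Hypothetical values (assumption): 1.0 g kg^-1 of cloud water,
# a 0.5 g kg^-1 threshold and a 1000 s conversion time scale.
rate_above = autoconversion_rate(1.0e-3, 0.5e-3, 1.0e3)
rate_below = autoconversion_rate(0.3e-3, 0.5e-3, 1.0e3)
```

With such a scheme no rain forms until the cloud water exceeds the threshold, so both the threshold and the time scale control how much liquid water a model cloud can sustain before it starts to precipitate.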
In the climate models, a fractional cloud cover (b) is allowed. Falling hydrometeors are not calculated prognostically, but leave the grid box within the time step of formation, taking into account evaporation in the atmospheric column below formation. In RACMO2, cloud ice is a prognostic variable, but in RCA it is diagnosed as a function of temperature. In RACMO2, clouds are generated when a threshold for relative humidity is exceeded and b is calculated prognostically. RCA relates b diagnostically to the relative humidity, vertical motion, static stability and convective updrafts. RACMO2 uses a constant liquid water mixing ratio for the threshold for autoconversion (q_crit) (Table 1), whereas RCA uses a parameterisation of Rasch and Kristjánsson (1998) based upon Chen and Cotton (1987). In RCA, q_crit is parameterised as a function of a critical effective radius of the droplets (r_eff,crit), the density of air and the mean cloud droplet concentration, which differs for maritime and continental air and the height above the land surface. The critical radius at which autoconversion begins is 11 μm, which gives a q_crit over land varying from 2.2 g kg⁻¹ near the surface to 0.8 g kg⁻¹ above 3 km and a constant value of 0.8 g kg⁻¹ over the sea. In Section 4.4, the sensitivity to this parameter will be tested.
In this paper, the setup (horizontal resolution, domain size, forcing from the lateral boundaries) of RACMO2 and RCA are identical.For the remaining models, no effort is made to conform them to each other.Since there are many aspects that affect the representation of clouds i.e. initial fields, forcing from the lateral boundaries, parameterisation of physical processes and model resolution, such an approach is less efficient to identify deficiencies in one of the above-mentioned parts of the model system.However, it is important that the models are tested in the framework in which they are used in the research institutes, and that different parameterisations are tested in interaction with each other.Moreover, when the same deficiencies were found in a group of models, we have tried to generalize the findings.
The models were initialised with their analyses on the previous day at 12:00 UTC and integrated for a period of 36 h. The first 12 h were discarded and only the period from 12 to 36 h is considered throughout this paper.

Description of the cases and evaluation of the atmospheric models
From previous studies, it is known that atmospheric models often fail in representing warm low-level water clouds (see Section 1).Therefore, 2 days, with warm low-level clouds and a substantial amount of liquid water, were selected from the BBC data set, namely the 23rd of September 2001 (D1) and the 21st of May 2003 (D2).These days represent two different meteorological situations.On D1, there was a stratocumulus field in the morning that broke up in the early afternoon, with the development of shallow cumuli.No high-level clouds were present over the observation site.Backward trajectories, going from Cabauw over Denmark backwards into Germany, showed that the air was of continental origin and therefore likely to contain an above average value of condensation nuclei or aerosols, which might have affected the cloud microphysical processes.
On D2, a more complex situation occurred with different cloud layers at different heights and a cirrus shield appearing at the site at about 16 UTC, as a precursor of a frontal system. On average the lower layer of cumulus and stratocumulus had a base at about 0.6 km, and the second layer had a base at about 2 km height. In addition, there were several strong convective cells, with the cloud top at about 2.5 km overshooting the zero degree isotherm at about 1.6 km. The occurrence of convective structures is apparent from the LWP time series (see also Fig. 7). The autocorrelation function (Section 4.4) drops much faster with time delay for D2 than for D1. The correlation time, when the autocorrelation coefficient has decreased to e⁻¹, is reached after about 1 h for D1 and about 10 min for D2. On D2, strong westerly winds caused a more maritime origin of the air than on D1.
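The correlation time used above, the lag at which the autocorrelation coefficient first drops to e⁻¹, can be estimated from a regularly sampled LWP time series as sketched below. This is a minimal illustration under stated assumptions: the estimator is the standard biased sample autocorrelation, the function names are illustrative, and the two test series are synthetic sinusoids rather than BBC data.

```python
import math


def autocorrelation(series, lag):
    """Biased sample autocorrelation coefficient of a series at a given lag."""
    n = len(series)
    mean = sum(series) / n
    variance = sum((x - mean) ** 2 for x in series)
    if variance == 0.0:
        raise ValueError("series has zero variance")
    covariance = sum(
        (series[i] - mean) * (series[i + lag] - mean) for i in range(n - lag)
    )
    return covariance / variance


def correlation_time(series, dt_min):
    """First lag (in minutes) at which the autocorrelation falls to e**-1.

    Returns None if the threshold is never reached.
    """
    threshold = math.exp(-1.0)
    for lag in range(1, len(series)):
        if autocorrelation(series, lag) <= threshold:
            return lag * dt_min
    return None


# Synthetic series (assumption): slowly and rapidly varying signals
# sampled once per minute, standing in for stratiform and convective LWP.
slow = [math.sin(2.0 * math.pi * i / 120.0) for i in range(240)]
fast = [math.sin(2.0 * math.pi * i / 12.0) for i in range(240)]

tau_slow = correlation_time(slow, dt_min=1.0)
tau_fast = correlation_time(fast, dt_min=1.0)
```

The slowly varying series yields the longer correlation time, mirroring the contrast between the persistent stratocumulus on D1 and the convective structures on D2.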

Synoptic situation
From ECMWF operational analyses (Fig. 2), it appears that on D1 the large-scale flow was dominated by a low pressure system situated over northern Spain and a high pressure system over northwestern Scandinavia.These two systems moved slowly eastward during the day.Just northwest of Cabauw, a secondary low was present during the night, developing into a small trough in the morning.This caused a deviation of the flow from the east (which would have been expected without the secondary low) towards the northeast.
From a comparison between model output and radiosonde data at Cabauw (Table 2), it is found that all models are able to represent the wind direction (φ) of the observed northeasterly winds at 8:58 UTC on D1 within 25°. The modelled wind speeds (v) vary substantially between about 4 m s⁻¹ and 8 m s⁻¹, with MM5 and RACMO2 having the lowest wind speed and LM having the highest wind speed. The large variation among the models points to substantial differences in the model representation of turbulent exchange processes of momentum. In the course of the day, the secondary low pressure system northwest of Cabauw and the associated trough disappeared and the wind turned from northeast (43°) to southeast (113°). This turning resulted in a decrease in moisture in the atmosphere as identified from the time series of the IWV derived from microwave radiometer. Turning of the wind also occurred in the model integrations, but is underestimated by all models, especially by RCA. The boundary fields are at much coarser resolution for RCA than for the other models, which could explain some errors in the wind fields.

Table caption: Given are the mean (v and φ), the modelled minus radiosonde value (Δv and Δφ) and root mean square error (v_rms and φ_rms) over the lowest 700 hPa of the atmosphere.
The bias and root mean square error (rms) in the modelled wind direction in comparison to the radiosonde measurements are in the range of 20° to 80° at 14:43 UTC. At this time, v is overestimated by all models by up to 2.5 m s⁻¹.
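The bias and rms error of a wind direction need some care because angles wrap around at 360°. The following is a minimal sketch of one way to compute both statistics over a set of levels; the wrapping convention and the function names are assumptions for illustration, since the text does not state how the direction differences were formed.

```python
import numpy as np

def direction_difference(model_deg, obs_deg):
    """Signed difference (model minus observation), wrapped to [-180, 180) degrees."""
    diff = np.asarray(model_deg, dtype=float) - np.asarray(obs_deg, dtype=float)
    return (diff + 180.0) % 360.0 - 180.0

def bias_and_rms(model, obs, circular=False):
    """Return (bias, rms) of model minus observation over all levels.

    Set circular=True for wind direction so that 350 deg versus 10 deg
    counts as a 20 deg error rather than a 340 deg error.
    """
    if circular:
        d = direction_difference(model, obs)
    else:
        d = np.asarray(model, dtype=float) - np.asarray(obs, dtype=float)
    return float(d.mean()), float(np.sqrt(np.mean(d ** 2)))
```

For wind speed the same function can be called with `circular=False`, which reduces to the ordinary mean difference and root mean square difference.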
During the second BBC case (D2), a different large-scale flow pattern prevailed compared to D1, with a low pressure system in between Iceland and the Azores and a high pressure system centred over the Bay of Biscay (Fig. 2b). These systems were steady during the day and induced westerly winds at Cabauw with φ varying between about 280° at 11:21 UTC and 240° at 21:53 UTC (Table 3). The wind speed is observed to be larger on D2 than on D1, which is reproduced by the models. Again, the differences in wind speed between the models are substantial. Around noon (11:21 UTC), the RCA wind speed profile corresponds most closely to the observations with low wind speed near the surface and a maximum wind speed between 930 hPa and 800 hPa.
The variations in wind direction throughout the day were smaller on D2 than on D1. The models represent these steady westerlies well at 11:21 UTC within 15°, except RCA, which overestimates the southerly component of the wind vector. The slight anticlockwise turning at the end of the day is underestimated by all models. At 17:57 UTC, Méso-NH and MM5 show the largest deviation from the observed value. These two models are not directly forced by operational analyses at the lateral boundaries, but instead they use two or three outer domains at coarser horizontal grid spacing. Some model deficiencies, like the underestimation of the turning of the wind and the associated underestimation of the decrease in IWV on D1, are probably related to the forcing from the lateral boundaries. However, generally the large-scale flow dynamics and the contrast in atmospheric circulation regimes between D1 and D2 are well represented by the models.

Cloud cover
The spatial distribution of clouds over the Netherlands is derived from measurements of radiances and reflectivities by the Moderate Resolution Imaging Spectrometer (MODIS) onboard the Terra satellite. On D1, the secondary low pressure system northwest of Cabauw caused advection of moist air over the site. Consequently, a stratiform cloud deck was present over the largest part of the Netherlands at the MODIS overpass at 10:45 UTC (Fig. 3a). Cabauw was located at the western side of this cloud deck. The southwestern part of the Netherlands was mostly free of clouds, with diagonal cloud bands orientated from south-west towards north-east. These bands were aligned with the prevailing wind direction. The ground-based measurements show that clouds were covering Cabauw during the MODIS overpass time (see Sections 4.3 and 4.4). This cloud deck was relatively stable throughout the entire morning, but in the afternoon it broke up and shallow cumulus clouds developed.
LM is able to represent the stratiform cloud deck over the Netherlands to some extent, although the cloud free region is too small and its orientation is tilted towards the east-west direction (Fig. 3b). The model simulates cloudy conditions at Cabauw at 11 UTC, but the LWP of these clouds is very small (20 g m⁻²) (Section 4.4). In Méso-NH and MM5, more isolated cloud free regions exist compared to MODIS (Fig. 3c and d). In the climate models, on the other hand, the cloud deck is more homogeneous than the MODIS image indicates (Fig. 3e and f), which is not surprising given their horizontal grid spacing. Over eastern Belgium, in the southeastern corner of the domain presented in Fig. 3, a second cloud free area was observed, which is represented by all models that cover this region.
On D2, the MODIS image indicates that at 10:05 UTC the Netherlands was mostly cloud covered with some small cloud-free areas existing (Fig. 4a). At this time, cloudy conditions prevailed in the Cabauw region. A similar structure over land is represented by LM and Méso-NH, but MM5 simulates a more homogeneous cloud cover than the satellite image indicates (Fig. 4b to d). All non-hydrostatic models simulate cloudy conditions at Cabauw at 10 UTC, with LWP in the range from 27 g m⁻² (LM) to 144 g m⁻² (MM5). At Cabauw, the climate models simulate a cloud cover of about 0.5 and an LWP in the range from 96 g m⁻² (RACMO2) to 241 g m⁻² (RCA). Generally, these models underestimate the cloud cover over land on D2. In MODIS, LM and MM5 a clear distinction between land and sea can be identified. This is not the case in Méso-NH and the climate models. For a more quantitative description of the difference between the MODIS derived cloud properties and the model output, and for an evaluation of the cloud top pressure and the cloud optical thickness, we refer to Schröder et al. (2006).

Liquid water content
A combination of measurements (Section 2), interpreted with the IPT method (Löhnert et al., 2004), was used to study the time evolution of the LWC profile on D1 (Fig. 5a). The IPT method produces the LWC together with an error estimate. For the period considered, an average error of 20% was found. For larger values of LWC, above 0.2 g kg⁻¹, the error is smaller, namely 10% on average.
The IPT-inferred LWC record contains several interruptions, which are indicated in Fig. 5a [...] determination of the LWC profile prone to errors. In addition, even though no precipitation was detected at the surface, the Doppler velocity signal from the radar indicated that drizzle or precipitation occurred at higher atmospheric levels, and therefore IPT was not applied.
Later that day, from 11 UTC onwards, interruptions occurred because (i) the ceilometer measurements were not consistent with the radar measurements (i.e. no cloud base was detected by the ceilometer but LWC was detected by the radar or microwave radiometer, or vice versa) or (ii) no measurements were available from the microwave radiometer. For a comparison between IPT and aircraft observations for D1, we refer to Crewell et al. (2004). They showed that the agreement between these two independent methods of deriving LWC is within the range of uncertainty of this highly variable quantity.
[Fig. 5 caption, partially recovered: ... (d, f) are plotted. The lower panels show (g) IPT-derived LWC using an averaging time of 30 min together with output of the climate models (h) RACMO2, (i) RCA and (j) RACMO2 employing double vertical resolution. Liquid water below a threshold of 10⁻² g kg⁻¹ is ignored. The grey bar in the IPT plots indicates the time when no information was available during more than 90% of the interval considered due to failure of one of the instruments or due to inconsistencies between the different instruments. The black bar in the upper part of the graphs indicates the occurrence of precipitation. For this, use is made of the rain shutter, mounted on the radiometer. In the models, a threshold of 0.4 mm h⁻¹ is used, below which the precipitation is not indicated.]
Méso-NH and MM5 represent the vertical extent of the shallow cloud layer adequately (Fig. 5c and e). The phasing of clouds is not so well represented; the maximum LWC is not at 06 UTC (as observed), but around 15 UTC for MM5 and strongly intermittent for Méso-NH. LM and Méso-NH show a patchy structure in Fig. 5b and c, indicating that the lifetime of clouds is underestimated. Moreover, instantaneous in-cloud values of LWC are much larger than observed.
For the comparison with the climate models, IPT is averaged over a longer time period of 30 min (Section 2) as shown in Fig. 5g. The climate models overestimate the vertical extent of the clouds: LWC values larger than 0.1 g kg⁻¹ are calculated up to a height of 2.5 to 3 km (Fig. 5h and i). Since these models are designed for the purpose of regional climate simulations, they are typically operated in multi-annual integrations. This is only feasible at horizontal resolutions far coarser than the resolution of the non-hydrostatic models. To compare the climate models with the non-hydrostatic models, the LWC from Méso-NH and MM5 has been aggregated into mean values for an area of 18 km × 18 km (Fig. 5d and f). Also at this scale the vertical extent of clouds in the non-hydrostatic models is found to be smaller than in the climate models, indicating that the overestimation of the geometrical thickness of the clouds in the climate models is not due to scale representativeness related to the horizontal grid spacing that is used. Note that LWC is estimated to be detected by the radar for values higher than about 10⁻² g kg⁻¹ and therefore modelled LWC below this threshold is ignored.
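The aggregation of high-resolution LWC onto an 18 km × 18 km area can be sketched as a simple block average. The example below assumes a regular grid whose spacing divides 18 km evenly (for instance a 3 km grid, giving 6 × 6 blocks); the function name and the choice of an unweighted mean are assumptions for illustration, not a description of the procedure actually used.

```python
import numpy as np

def block_mean(field, factor):
    """Average a 2-D field over non-overlapping factor x factor blocks.

    field  : 2-D array on the fine grid (e.g. LWC at one model level)
    factor : number of fine grid cells per coarse cell in each direction
             (6 for a 3 km grid aggregated to 18 km; assumed value)
    """
    field = np.asarray(field, dtype=float)
    ny, nx = field.shape
    if ny % factor or nx % factor:
        raise ValueError("grid dimensions must be multiples of the block size")
    # Split each axis into (coarse index, index within block) and average the blocks.
    blocks = field.reshape(ny // factor, factor, nx // factor, factor)
    return blocks.mean(axis=(1, 3))
```

Applying the 10⁻² g kg⁻¹ detection threshold after the averaging, rather than before, keeps the treatment of the aggregated model output comparable to that of the time-averaged observations.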
On D2, a complex situation existed with clouds present at two levels, namely at about 1 km height and at about 2.2 km height (Fig. 6a and e). LM is the only model which is able to clearly represent this two-layer structure in the clouds (Fig. 6b), but RCA also shows a hint of the two-layer structure (Fig. 6g). Méso-NH (Fig. 6c) and MM5 (Fig. 6d) simulate only one single thin cloud layer, located in between the two observed cloud layers. As for D1, Méso-NH and LM show a patchy structure compared to the measurements, whereas MM5 shows a homogeneous structure. Also for this day, the climate models overestimate the thickness of the cloud (Fig. 6f and g). The high-level frontal ice clouds, coming in at the end of the day as a precursor of a frontal system, are represented very well by all models (not shown). Such frontal clouds are to a large extent driven by the forcing from the lateral boundaries. Generally, such clouds are represented better than the clouds which are more locally generated. This agreement in representing frontal ice clouds indicates that the discrepancy in the representation of the shallow clouds, seen earlier, is not likely to be dominated by the forcing from the lateral boundaries.

Liquid water path
The time series of LWP measured by the microwave radiometer were used to look in more detail into the time evolution of the clouds occurring at Cabauw (Fig. 7). Many interruptions were present in the time series of LWC, indicated by grey bars in Figs. 5 and 6. The time series of LWP is more continuous: apart from the periods with precipitation, there were about four 15-min time intervals on D1 during which the instrument was not operating. The most remarkable differences between modelled and measured LWP are (i) the overestimation of instantaneous in-cloud LWP and (ii) the large intermittency in the LWP time series in Méso-NH and LM.
Variability in the time series of LWP, as expressed by the normalized standard deviation (σ_N), is overestimated in LM (σ_N = 2.6) and MM5 (σ_N = 2.7) and underestimated in the climate models (σ_N = 0.6) compared to the measurements (σ_N = 1.2) on D1 (Table 4). Méso-NH represents σ_N adequately. On D2, all models represent σ_N within 0.2, except for LM and Méso-NH, which overestimate the variability in LWP. The standard deviation is a good measure of the temporal variability of a time series, but it is not a good measure of the intermittency of the signal. A better measure for this is the autocorrelation function (R_a) as defined by Lumley and Panofsky (1964): [...] clouds is underestimated in Méso-NH; already at Δt = 1 h, the autocorrelation is reduced to a value of 0.08. RACMO2 overestimates the lifetime of the clouds. At a separation of 0.5 h, the modelled R_a is overestimated by 50%. At separations of 1.5 h or less, the autocorrelation function in RACMO2 is larger than the measurements indicate.
To obtain a quantitative measure for the intermittency of LWP and the lifetime of clouds, the autocorrelation function was integrated over time separations (Δt) up to 1.25 h, as schematically indicated by the grey area in Fig. 8. We refer to this integral as R_i. Méso-NH and LM underestimate the value of R_i on D1 (Table 4). A comparison between modelled vertical velocity and LWC in LM (not shown) indicates that the temporal evolution of LWC is closely related to updrafts and downdrafts in the model. The model might alias sub-grid scale convective motions to the resolved scale. A quantitative evaluation of the vertical velocity is unfortunately not possible since no measurements are available. Note that with the radar the fall velocity of the droplets can be measured, but not the vertical velocity of the air. The value of R_i in MM5 corresponds closely to the climate models, which have a more realistic representation of the lifetime of clouds for this shallow convection case D1.
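The integral of the autocorrelation function and the e-folding correlation time can be illustrated with a short sketch. It assumes the usual normalised form, in which the lagged product of anomalies is divided by the variance, an evenly sampled LWP series without gaps, and trapezoidal integration; these choices and all function names are assumptions for illustration and are not taken from the text.

```python
import numpy as np

def autocorrelation(x, max_lag):
    """Normalised autocorrelation of an evenly sampled series for lags 0..max_lag."""
    x = np.asarray(x, dtype=float)
    anomalies = x - x.mean()
    variance = np.mean(anomalies ** 2)
    r = np.empty(max_lag + 1)
    for lag in range(max_lag + 1):
        # Mean product of all pairs of samples separated by `lag` time steps.
        r[lag] = np.mean(anomalies[: x.size - lag] * anomalies[lag:]) / variance
    return r

def integral_time_scale(r, dt_minutes, max_separation_minutes=75.0):
    """Trapezoidal integral of r from zero lag up to max_separation (1.25 h = 75 min)."""
    n = int(round(max_separation_minutes / dt_minutes))
    if n + 1 > len(r):
        raise ValueError("autocorrelation does not extend to the requested separation")
    segment = np.asarray(r[: n + 1], dtype=float)
    return float(np.sum(0.5 * (segment[1:] + segment[:-1])) * dt_minutes)

def e_folding_time(r, dt_minutes):
    """First lag (in minutes) at which r drops below 1/e, or None if it never does."""
    below = np.nonzero(np.asarray(r) < np.exp(-1.0))[0]
    return None if below.size == 0 else float(below[0] * dt_minutes)
```

With this definition, a perfectly persistent signal (r equal to 1 at all lags) gives an integral equal to the integration limit of 75 min, so the result has units of time and is bounded by that limit.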
On D2, the LWP is observed to be more episodic than on D1, which is related to the fact that there are more convective cells present. This is reflected in a decrease in R_i from 34 min for D1 to 15 min for D2. For the non-aggregated time series, the autocorrelation drops to a value of e⁻¹ at a time scale of 1 h for D1 and a time scale of 10 min for D2. Consistent with D1, a clear difference in R_i can be identified between LM and Méso-NH on the one hand and MM5 and the climate models on the other hand. Méso-NH is the only model underestimating R_i. MM5 and the climate models largely overestimate R_i. This is clearly reflected in the LWP time series. For example, in MM5 (Fig. 7c) a cloud is developing in the morning and unrealistically large values of LWP are present for the entire afternoon.
The phasing of the cloud on D1, with maximum LWP at 6 UTC, is not represented by any of the models. Daily mean LWP is considerably overestimated by all models except LM. The latter is due to a compensation of shortcomings in the LM model: the occurrence of clouds is underestimated, but when a cloud is present, the LWP is overestimated. One should realize that the radiometer only measures LWP during dry conditions. During these dry conditions, LWP is smaller than during rainfall. Therefore the radiometer is expected to provide too low a value of LWP during days with some rain. However, even when time periods with precipitation are excluded in the model, LWP is still found to be overestimated by most models for this day.
On D2, LWP was observed to be largest around 08 UTC, when two cloud layers were present, and around 15 UTC, when only the upper layer remained. The phasing of high values of LWP is not represented by any of the models. The daily mean LWP in Méso-NH corresponds closely to the measurements, the value of LWP in LM is too low, and MM5 and the climate models overestimate LWP by a factor of 2 to 3 (Table 4). The correspondence for the climate models is better on D2 than on D1. On D2 the LWP was about 30% larger than on D1. This result indicates that the climate models are better able to represent clouds when more LWC is present or when the clouds are geometrically thicker; representing thin clouds is more problematic.
Biases in modelled LWP have previously been related to wrong estimates of the threshold for autoconversion (q_crit) (Xu et al., 2005). In most models, a large overestimation of LWP, especially on D1, was found. Therefore, a sensitivity integration with RCA has been performed to study the effect of a change in q_crit. The control integration (CTL) is compared with an integration (AUT) in which the value of the critical effective radius of droplets (r_eff,crit) is changed from 11 μm to 5 μm, the latter being the standard value in the original parameterisation by Rasch and Kristjánsson (1998). Previously, this value had been increased in RCA to remove excessive drizzle. Since the threshold for autoconversion (q_crit) is parameterised as the third moment of r_eff,crit (Chen and Cotton, 1987), q_crit is reduced by a factor of about 10 in AUT compared to CTL. Consequently, the onset of precipitation release occurs faster in AUT, resulting in a more intermittent precipitation time series at Cabauw. The daily average precipitation at Cabauw is affected, but the sign of the change is different for D1 than for D2. The average precipitation over the 2 days increases by 50% in AUT. The changes in LWP are more consistent between the 2 days, with a decrease of about 50% in AUT compared to CTL. This result shows that changes in the threshold for autoconversion can largely improve LWP, but at the expense of excessive drizzle. Given these results and other sensitivity studies, the standard settings in RCA will be reconsidered to avoid an overestimation of LWP, especially for low cloud conditions.
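The factor of about 10 follows from the cubic dependence of the threshold on the critical radius. A minimal check, assuming that q_crit scales as the third power of r_eff,crit with all other factors (such as droplet number concentration) held fixed, is given below; the function name is hypothetical.

```python
def qcrit_ratio(r_new_um, r_old_um):
    """Ratio of autoconversion thresholds, assuming q_crit is proportional to r_eff,crit**3."""
    return (r_new_um / r_old_um) ** 3

# AUT (5 um) relative to CTL (11 um): (5/11)**3 = 125/1331
ratio = qcrit_ratio(5.0, 11.0)
reduction_factor = 1.0 / ratio  # between 10 and 11, i.e. roughly a factor of 10
```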

Precipitation
From the rain shutter, mounted on the radiometer, it was found that D1 was a dry day at Cabauw except for the time interval from 00:00 UTC to 01:00 UTC (Fig. 5a). The rain shutter responds very accurately to the occurrence of precipitation at the surface. A total accumulated value of only 0.2 mm/24 h was measured at Cabauw, using a rain gauge with an accuracy of 0.2 mm. The synop observations in the Netherlands also show that the region around the Cabauw site was almost dry on D1, with less than 1 mm precipitation accumulated over the entire day. In the eastern part of the Netherlands, precipitation was largest with values up to 5 mm/24 h, mostly occurring during the first 6 h of the day.
For D1, all models represent the precipitation at the beginning of the day. Note that in the climate models, the precipitation does not exceed the threshold of 0.1 mm/15 min, which was set for Figs. 5 and 6. During the time periods that were observed to be dry, light rainfall at the surface was simulated by most of the models. An exception is MM5, which is the only model that, apart from the first 2 h, simulates a dry day. Partly due to the overestimation of drizzle, the models overestimate the accumulated rain on D1 (Table 4). The overestimation of drizzle might be related to q_crit, the threshold above which cloud water is converted into rain (Section 4.4).
Convective activity over the North Sea was detected by the radar, with very few small isolated precipitating areas over land (Fig. 9a). From 20:30 UTC onwards, more precipitating cells enter the Netherlands from the northeast. In LM and Méso-NH, there are more precipitating cells over land than observed by the radar during the entire day until 20:30 UTC. In contrast, the number of cells is underestimated in MM5. In the climate models, convective cells are less easily identified. The maximum precipitation for a grid box mostly does not exceed 1 mm h⁻¹, whereas radar observations show cells with precipitation larger than 3 mm h⁻¹. This effect can be explained by the coarser horizontal resolution of the climate models, which causes a spreading out of the precipitation over a larger area. The large-scale precipitation patterns in RACMO2 and RCA are very similar, with precipitation exceeding 1 mm h⁻¹ over Southern France and the Alpine regions. Synop observations confirm that these numbers are realistic for D1.
On D2, somewhat more precipitation occurred than on D1, but also on this day accumulated values were low. A value of 3.8 mm/24 h was measured with the rain gauge at Cabauw (Table 4), which was representative for the west of the Netherlands. Most of this precipitation occurred during the first 4 h of the day. The front at the end of the day, as indicated in Fig. 6a, brought very little precipitation.
LM simulates the phasing of the precipitation adequately. In MM5, no rain is simulated for this day. In Méso-NH, RACMO2 and RCA some precipitation occurred at Cabauw before 11 UTC. For this day, all models slightly underestimate the total precipitation (Table 4).
Convective precipitating cells were detected by the radar in the morning until 12 UTC (Fig. 9b). During the time interval from 12 to 14 UTC, the cells were mainly present in the eastern part of the Netherlands, and from 14 to 19 UTC the Netherlands was largely free of convective precipitating cells. LM and Méso-NH adequately represent the precipitation cells over land, with the shift of cells towards the east of the Netherlands. MM5 largely underestimates the occurrence of convective cells. From 19 UTC onwards, the radar indicates that a front entered the Netherlands from the southwest, bringing drizzle in the evening. LM is the only model in which the frontal precipitation in the Netherlands is simulated. No frontal precipitation was found in the high-resolution domain of MM5 and Méso-NH. In the climate models, frontal precipitation was simulated at the end of the day over eastern England and northern France, indicating that the speed at which the frontal system was moving was underestimated.

Integrated water vapour
Vertically integrated water vapour (IWV) is well represented by all models both for D1 and D2, with the model predicted daily average only slightly larger than what was observed (Table 4). The slight overestimation in the models might be explained by the fact that IWV is only measured by the radiometer during dry conditions. Although the vertical integral of water vapour is well represented by the models, the vertical profile is not: from a comparison between model output and the radiosonde ascents it is found that the specific humidity is overestimated near the surface but underestimated at higher levels (Section 4.7).
During the first half of D1, IWV decreases, which is caused by dry-air advection due to turning of the wind towards the east. The models all underestimate this decrease in IWV, which is reflected in a lower value of the standard deviation (σ_N) compared to the measurements (Table 4). At the end of D2, just before the onset of precipitation, IWV increases. This increase in IWV is underestimated by most models except by LM, which shows an increase in IWV that started around 15 UTC. This result is consistent with the fact that LM is the only model where precipitation is calculated for the evening. The modelled σ_N corresponds closely to the observed σ_N for this day.
On D2, the atmosphere is colder and contains less water vapour than on D1, but it contains, on average, more liquid water. On D1, the clouds disappear in the afternoon with only some shallow cumulus clouds remaining, with very little LWC. The models are able to represent the contrast in IWV between D1 and D2, but not the contrast in LWP. The modelled daily mean LWP is larger on D1 than on D2 in all models.

Temperature and humidity profiles
To gain insight into the relation between the vertical profile of the clouds and the temperature and humidity profiles in the boundary layer, radiosonde ascents were compared with model output. Both the measured and modelled maximum (minimum) in the derivative of the potential temperature (θ) and specific humidity (q) with respect to pressure were calculated. In the models, this method cannot be used to identify the inversion strength since the gradients in θ and q are a function of the vertical resolution: between the level at the top of the mixed layer (ML) and the overlying air there must always be at least one level that is a mixture of ML and overlying air (e.g. Grenier and Bretherton, 2001). For a comparison between modelled and measured inversion strength, we therefore consider the difference in θ and q between the ML and the overlying air, rather than taking the absolute value of the derivative of θ and q with respect to pressure.
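The difference-based measure of inversion strength can be sketched as follows. The sketch assumes that the ML top pressure is already known, takes the ML value as the mean over all levels at or below the ML top, and samples the overlying air at the lowest level above a transition zone of prescribed depth, so that the mixed level is skipped. The 30 hPa transition depth and the function name are illustrative assumptions, not values from the text.

```python
import numpy as np

def inversion_jump(pressure_hpa, values, p_ml_top_hpa, transition_hpa=30.0):
    """Difference between the overlying air and the mixed-layer mean.

    pressure_hpa   : pressure of each level (hPa)
    values         : theta (K) or q (g/kg) on the same levels
    p_ml_top_hpa   : pressure at the top of the mixed layer (hPa)
    transition_hpa : depth of the zone skipped above the ML top (assumed value)
    """
    p = np.asarray(pressure_hpa, dtype=float)
    v = np.asarray(values, dtype=float)
    in_ml = p >= p_ml_top_hpa
    above = p <= p_ml_top_hpa - transition_hpa
    if not in_ml.any() or not above.any():
        raise ValueError("profile must contain levels in the ML and above the transition zone")
    # Lowest level (highest pressure) that lies above the transition zone.
    idx_above = int(np.argmax(np.where(above, p, -np.inf)))
    return float(v[idx_above] - v[in_ml].mean())
```

Because the jump is taken across the transition zone rather than between adjacent levels, the result does not shrink simply because a coarse grid spreads the inversion over one intermediate level.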
During the night of D1 (3:46 UTC), the boundary-layer profile was stably stratified, with a constant lapse rate for both θ and q. At 8:58 UTC, an ML had developed below 930 hPa, capped by a temperature inversion (Fig. 10). The values of θ and q in the ML were approximately constant (284.9 K and 6.7 g kg⁻¹, respectively). Above the ML, the air is drier and θ is higher, with a secondary small θ and q inversion at 730 hPa. During the course of the day, the ML grew, with the pressure of the inversion (P_inv) decreasing from 930 hPa (8:58 UTC) to 870 hPa (14:43 UTC). In addition, the ML temperature increased by 2.5 K.
In the LM, the inversion at 8:58 UTC is too weak and too close to the surface (P_inv is 970 hPa). This deficiency might be responsible for the overestimation of the strength of the resolved-scale updrafts and consequently the overestimation of the cloud top height in this model. The ML growth, with P_inv decreasing to a value of 890 hPa at 14:43 UTC, corresponds closely to the measured growth, but the strength of the modelled inversion is still slightly underestimated. The LM is the only model that represents the warming of the ML by 2.5 K, but the tendency in q between 700 hPa and the surface is found to be positive in the model (2.8 g kg⁻¹ day⁻¹), whereas it is observed to be slightly negative (−0.8 g kg⁻¹ day⁻¹). In Méso-NH, the depth of the ML is steady throughout the day, which is unrealistic. The inversion is located at about 880 hPa, which is correct for 14:43 UTC, but too high for the early morning. This is consistent with the relatively steady evolution of cloud top and cloud base. The modelled ML warms by about 1 K (instead of 2.5 K) and becomes 0.7 g kg⁻¹ wetter over the 6-h period. In MM5, the growth of the ML is about double what was observed: P_inv decreases from 920 hPa at 8:58 UTC to 820 hPa at 14:43 UTC. This model underestimates the ML warming, but it is the only model that represents the observed ML drying. The humidity tendency corresponds closely to the observed tendency, although the whole profile is too wet by about 1.6 g kg⁻¹.
In RACMO2, the inversion in both θ and q is too weak and too close to the surface (Fig. 11). During the day, the ML grows and warms, but a clear inversion cannot be distinguished anymore at 14:43 UTC. The RCA model overestimates the static stability during the night (3:46 UTC), a common problem in atmospheric models (e.g. Cuxart et al., 2006). This deficiency in RCA might affect the diurnal variation of the boundary layer structure, like the overestimation of P_inv throughout the day. RCA represents the strength of the inversion at 8:58 UTC well, but it underestimates the growth of the ML during the day. It is likely that the tendency of the climate models to underestimate the strength and the height of the inversion is partly responsible for the overestimation of the cloud top height, which is described in Section 4.3. This result is consistent with Zhu et al. (2005), who found that SCMs are not able to simulate the observed sharp inversion, and as a result underpredict LWC but overpredict cloud thickness.
The situation on D2 is more complex than on D1; the ML evolution throughout the day is less pronounced than on D1. On D2, convective cells were present with cloud tops up to about 700 hPa. In addition, the cloud cover, derived from MODIS (Fig. 4a), is observed to be less homogeneous on D2 than on D1. A two-layer structure can be identified in the θ and q profiles, with the top of the upper layer at about 760 hPa and a lower ML extending from the surface to 940 hPa (Fig. 12). Observations of LWC indicate that there was a two-layer cloud structure in the morning of D2.
LM exhibits the two-layer structure in θ, q, and LWC, but again the strength of the lower inversion is underestimated. In Méso-NH, the strength of the inversion is in good agreement with the measurements, but P_inv is located in between the two observed levels, namely at 850 hPa. As for D1, P_inv and the level of cloud top and cloud base are steady throughout the day. In MM5, the strength of the inversion is largely underestimated, and the atmosphere below 800 hPa cools during the day, which is not in correspondence with the measurements (not shown). This cooling is probably related to the thick homogeneous clouds that exist in MM5 during the entire afternoon, showing the need for a correct cloud representation in order to get the correct temperature forecast. The regional climate models (RACMO2 and RCA) fail to represent the upper temperature inversion at 11:21 UTC. However, at 17:57 UTC, the upper temperature inversion is present at about 800-850 hPa, which is only slightly closer to the surface than the measurements indicate. At this time, the inversion strength is well represented in RACMO2 and RCA, keeping in mind that one level must exist with a mixture of ML and overlying air. As for D1, the vertical gradient in q is too large in all models, with too high values near the surface and too low values higher in the atmosphere. This is most pronounced for Méso-NH, which is capable of representing the strength of the upper temperature inversion, but overestimates the gradient in q.
Boundary layer characteristics are affected by the availability of soil moisture. Since the soil initialisation varies between the models, differences in the model representation of the boundary-layer structure and the clouds can be related to differences in the soil moisture initialisations. For example, RCA initialises the soil moisture from climatological fields, whereas in RACMO2 it is initialised from ECMWF analyses. This is possible since RACMO2 utilizes the same surface scheme as the ECMWF model. The differences in the soil moisture between the models affect the distribution of the net radiative fluxes into sensible (SH) and latent (LA) heat flux. This is reflected in the Bowen ratio, defined as the ratio of SH to LA. For D1, MM5 has the largest value of the Bowen ratio (0.96), indicating that the initialisation of the soil is probably dry compared to the other models (Table 4). This is consistent with the fact that it is the only model in which the ML dries throughout D1. In addition, the inversion strength in MM5 is weak compared to the other non-hydrostatic models. On both D1 and D2, MM5 and LM have higher values of the Bowen ratio than Méso-NH and RCA. It is clear that the soil moisture cannot be solely responsible for the difference between the models. For example, Méso-NH and RCA have similar values of the Bowen ratio on D1, but a very different representation of the clouds. Differences in the Bowen ratio are smaller for D2 than for D1.
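As a small illustration of the definition used here, the Bowen ratio of a model run can be computed from its surface flux time series. The sketch below forms the ratio of the time-mean fluxes, which is one possible convention and an assumption on our part; the averaging period and the function name are hypothetical.

```python
import numpy as np

def bowen_ratio(sensible_w_m2, latent_w_m2):
    """Ratio of mean sensible (SH) to mean latent (LA) heat flux over the period given."""
    sh = float(np.mean(np.asarray(sensible_w_m2, dtype=float)))
    la = float(np.mean(np.asarray(latent_w_m2, dtype=float)))
    if la == 0.0:
        raise ZeroDivisionError("mean latent heat flux is zero; Bowen ratio undefined")
    return sh / la
```

Taking the ratio of the means, rather than the mean of instantaneous ratios, avoids large spurious values at times when the latent heat flux is close to zero.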

Shallow convection
Clearly, a grid spacing of 3 km is too coarse to resolve the shallow convective clouds. Indeed, our results show that deficiencies exist in representing shallow clouds in the non-hydrostatic models. The models might overestimate the updrafts by aliasing sub-grid scale convective motions to the resolved scale and consequently overestimate in-cloud LWC. For this reason, we have tested how the implementation of a shallow convection scheme in LM, Méso-NH and MM5 affects the representation of low level clouds. The runs excluding (including) parameterised shallow convection will be referred to as CTL (SHC).
We have implemented the shallow convection schemes in the same way as at the institutes where the models are operated. In LM and Méso-NH, the tendencies are only updated when the convection is shallower than a threshold (P_SHC), whereas in MM5 the criterion for the definition of shallow convection is based on temperature change and cloud base mass flux. In all models, shallow cumuli are not allowed to precipitate. In LM, the shallow convection scheme is identical to the deep convection scheme of Tiedtke (1989), but the tendencies are only updated when the convection is shallower than P_SHC, which is set to 250 hPa. When the convection is deeper than P_SHC, the vertical velocity related to the convective system is assumed to be resolved and convective tendencies from the scheme are ignored. Note that DWD is developing a more sophisticated scheme allowing determination of entrainment and detrainment as diagnostic quantities in terms of Convective Available Potential Energy (CAPE).
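The depth criterion described for LM can be illustrated with a minimal sketch. It assumes that the depth of the convection is measured as the pressure difference between cloud base and cloud top of the parameterised cloud, which is our reading of the text; the function name and argument names are hypothetical and do not correspond to the LM source code.

```python
def gated_shallow_tendency(cloud_base_hpa, cloud_top_hpa, tendency, p_shc_hpa=250.0):
    """Return the convective tendency only if the cloud is shallower than P_SHC.

    cloud_base_hpa, cloud_top_hpa : pressure at base and top of the convective cloud
    tendency                      : tendency computed by the convection scheme
    p_shc_hpa                     : depth threshold (250 hPa as quoted for LM)
    """
    depth_hpa = cloud_base_hpa - cloud_top_hpa
    if depth_hpa < 0.0:
        raise ValueError("cloud top pressure must not exceed cloud base pressure")
    # Deeper convection is assumed to be resolved, so its tendency is discarded.
    return tendency if depth_hpa < p_shc_hpa else 0.0
```

Under this reading, a cloud between 950 hPa and 800 hPa (150 hPa deep) would contribute its tendency, while one between 950 hPa and 600 hPa (350 hPa deep) would not.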
The convection scheme in Méso-NH (Bechtold et al., 2001) is based on a mass-flux formulation using a bulk cloud ensemble model with a trigger procedure that determines the occurrence and the type of convection. The cloud criterion for assuming shallow convection is a cloud thickness between 500 m and 3000 m, which corresponds closely to the values used in LM. After the cloud is identified as shallow convection, the updraught and downdraught properties and mass fluxes are computed, and finally the convective tendencies are adjusted following a closure assumption based on CAPE to control the intensity of convection.
In the MM5 integrations discussed here, the shallow convection scheme is based on the cloud work function concept, introduced by Arakawa and Schubert (1974) and modified by Grell et al. (1993). The cloud work function describes the generation of kinetic energy inside the cloud as determined by the updraft and downdraft moist static energy. In the shallow convection case, entrainment is assumed to be equally strong as detrainment; therefore no convective scale downdrafts exist and the cloud work function describes the buoyancy which is available for the particular cloud. If the calculated temperature change and cloud base mass flux are within a predefined window, the shallow convection contributes to the temperature and moisture tendencies within the grid cell. Details can be found in Grell et al. (1995) and Haagenson et al. (1994).
Parameterised convection introduces a mechanism to transport heat and moisture to higher atmospheric levels without the need for resolved-scale vertical motion. Therefore, the inversion is expected to be at a higher elevation, and θ and q are expected to be more homogeneous in SHC than in CTL. This effect is seen best in MM5 (Fig. 13a). On D1 at 8:58 UTC, the pressure of the θ-inversion (P_θi) decreases from 920 hPa in CTL to 820 hPa in SHC. During the time interval from 8:58 UTC to 14:43 UTC, P_θi decreases from 920 hPa to 820 hPa in CTL. In SHC, P_θi is almost stationary (840 hPa at 14:43 UTC), which does not correspond to the measurements.
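The text does not state how P_θi is diagnosed from a profile. As an illustration only, the sketch below assumes that the inversion is located in the layer with the strongest increase of θ per unit pressure decrease; the function name `inversion_pressure` and the sample profile are hypothetical.

```python
import numpy as np


def inversion_pressure(p_hpa, theta_k):
    """Pressure (hPa) at the centre of the most stable layer.

    Assumptions: p_hpa is ordered from the surface upward (decreasing
    pressure) and the inversion is taken as the layer in which theta
    increases most strongly per hPa of pressure decrease.
    """
    p = np.asarray(p_hpa, dtype=float)
    theta = np.asarray(theta_k, dtype=float)
    # Stability of each layer: theta increase per hPa of pressure decrease.
    stability = np.diff(theta) / -np.diff(p)
    k = int(np.argmax(stability))
    return 0.5 * (p[k] + p[k + 1])


# Hypothetical profile with a well-mixed layer capped near 875 hPa.
p = [1000.0, 950.0, 900.0, 850.0, 800.0]
theta = [288.0, 288.2, 288.4, 292.0, 292.5]
print(inversion_pressure(p, theta))
```

With a diagnostic of this kind, a decrease in the returned pressure between two profiles corresponds to a rise of the inversion, which is how the changes in P_θi are read in the discussion above.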
Fig. 13b gives an overview of the changes in P_θi and the pressure of the q inversion (P_qi) in all three models for D1. Changes in P_θi and P_qi are larger for MM5 than for LM and Méso-NH. In Méso-NH, the changes in the θ and q profiles are remarkably small; this model exhibits the smallest sensitivity to the implementation of the SHC scheme. Consistent with MM5, the LM distribution of θ and q under the inversion is also more homogeneous when SHC is switched on. This causes a weakening of the resolved-scale updrafts and a decrease in their frequency, which in turn leads to a decrease in the number of clouds and in the LWC within the clouds (Fig. 13c). For D1, the implementation of SHC does not affect the lifetime of clouds and therefore does not solve the underestimation of the cloud lifetime in LM (Table 4).
The LWP decreases in all models when SHC is switched on (Table 4), which is related to the weakening of the resolved-scale updrafts in the SHC integrations. On D1, the daily mean LWP decreases by 50% in Méso-NH, 40% in MM5 and as much as 80% in LM. On D2, the criteria for shallow convection are not met in LM and Méso-NH, and therefore the convective tendencies are ignored. MM5 is the only model in which the SHC scheme has a relevant impact on D2, and a large decrease in LWP of 70% is simulated. This large sensitivity in all models indicates that, for LWP, the turbulent transport in the models is a crucial factor, a result consistent with the findings of Zhu et al. (2005). Other characteristics of the LWP time series, such as σ_N and R_i, are hardly affected. The differences in σ_N and R_i among the models are larger than the difference between the CTL and the SHC version of one model.
Boundary-layer profiles of θ and q are also found to be sensitive to the implementation of the SHC scheme. Apart from changes in the inversion height, the ML is warmer during daytime when SHC is switched on (Fig. 13a) due to an increase in sensible heat flux. In all models, the net radiative flux at the surface increases (on average by 13 W m−2) during daytime in SHC due to a decrease in the occurrence of clouds and an optical thinning of the clouds. This excess in net radiative heat flux is divided over the sensible heat flux, the latent heat flux and the heat flux into the soil. The gradient in q decreases in SHC and is in better correspondence with the measurements than in CTL. An increase in humidity is found in the layer above about 800 hPa in MM5 and LM, but the ML is generally drier in SHC than in CTL. In general, model output is sensitive to the implementation of SHC, but except for the q profile, no clear improvement in the behaviour of any of the three non-hydrostatic models is found when SHC is implemented for these two shallow convection cases.

Enhanced vertical resolution
In Section 4.3, it has been shown that the climate models overestimate the thickness of the clouds and that this overestimation is not caused by the horizontal grid spacing used in the models. The overestimation of the cloud thickness might be related to their relatively coarse vertical resolution. To better understand the model behaviour, sensitivity runs with the RACMO2 model were performed, in which the number of levels is multiplied by factors of 1.5 and 2. Both the 60-level integration (RACMO2-L60) and the 80-level integration (RACMO2-L80) have a better vertical resolution than Méso-NH at all heights (Fig. 1b). In the altitude range where the shallow convective clouds occur (0.5-3 km), the grid spacing in RACMO2-L80 is as fine as 60 to 170 m.
The vertical distributions of θ and q are largely affected by the implementation of additional levels (Fig. 14). The inversion strength and the inversion height increase and are in better correspondence with the radiosonde measurements. At 8:58 UTC on D1, P_θi decreases from 970 hPa in RACMO2-L40 to 940 hPa in RACMO2-L80, which is very close to the observed value (930 hPa). Also at 14:43 UTC, a clear inversion exists in both RACMO2-L60 and RACMO2-L80 at 830 hPa (observed value 870 hPa), which is absent in RACMO2-L40. The q-profile at 8:58 UTC is hardly affected, but at 14:43 UTC, the wet bias below 900 hPa decreases from 1.4 g kg−1 in RACMO2-L40 to 1.1 g kg−1 (0.9 g kg−1) in RACMO2-L60 (RACMO2-L80). On D2, the profiles are hardly affected at 11:21 UTC, but at 17:57 UTC, the inversion at 820 hPa is better resolved in the high-resolution integration. The value for P_θi does not change on D2 by implementing more levels.
The implementation of more levels clearly improves the correspondence between the modelled and observed time evolution of the vertical distribution of LWC (Figs. 5j and 6h). The correspondence between RACMO2-L80 and the IPT-derived LWC on D1 is good. The model reproduces the vertical extent of the clouds and the cloud base moving up in the afternoon. Also on D2, the vertical extent of the clouds is smaller in RACMO2-L80 than in RACMO2-L40, and a hint of the two-layer structure is present. This two-layer structure is even more clearly reflected in the θ-profiles and q-profiles of this day at 17:57 UTC. The vertical distribution of LWC in RACMO2-L60 lies between the RACMO2-L40 and RACMO2-L80 results.
The sensitivity of the time evolution of LWP to the implementation of additional levels in RACMO2 is small, except for the time period from 0 to 7 UTC on D2. On D1, LWP increases by only 1% (3%) in RACMO2-L60 (RACMO2-L80) compared to RACMO2-L40. During the time period 7-24 UTC on D2, the sensitivity is modest, with LWP increasing by 7% (18%) in RACMO2-L60 (RACMO2-L80) compared to RACMO2-L40. It can therefore be concluded that for these shallow convective cloud days, implementation of additional vertical levels improves the representation of the cloud top height, decreases the geometrical cloud thickness and increases the in-cloud LWC, keeping the vertical integral of LWC about constant. The correspondence with measurements clearly improves.
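The statement that a thinner cloud with a higher in-cloud LWC can leave the vertical integral unchanged can be made concrete with a small sketch. It assumes LWC given as a mixing ratio (g kg−1) on pressure levels and uses the hydrostatic relation LWP = (1/g) ∫ LWC dp; the function name `liquid_water_path` and both profiles are hypothetical and do not represent RACMO2 output.

```python
import numpy as np

G = 9.81  # gravitational acceleration (m s-2)


def liquid_water_path(p_hpa, lwc_g_per_kg):
    """LWP (g m-2) from LWC (g kg-1) on pressure levels (hPa).

    Uses hydrostatic balance, LWP = (1/g) * integral of LWC dp, with a
    trapezoidal rule over the model layers.
    """
    p_pa = np.asarray(p_hpa, dtype=float) * 100.0
    q = np.asarray(lwc_g_per_kg, dtype=float)
    layer_mean = 0.5 * (q[1:] + q[:-1])
    layer_dp = np.abs(np.diff(p_pa))  # abs() makes the level order irrelevant
    return float(np.sum(layer_mean * layer_dp) / G)


levels = [900.0, 850.0, 800.0, 750.0, 700.0]
thick_cloud = [0.0, 0.1, 0.1, 0.1, 0.0]  # deep cloud, low LWC
thin_cloud = [0.0, 0.3, 0.0, 0.0, 0.0]   # shallow cloud, high LWC
print(liquid_water_path(levels, thick_cloud))
print(liquid_water_path(levels, thin_cloud))
```

Both hypothetical profiles integrate to the same LWP, which is the behaviour described above for the integrations with additional levels.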

Conclusions and future work
The representation of low-level shallow clouds in five regional atmospheric models has been tested using detailed measurements from the BALTEX Bridge Campaign data set. Two days with shallow low-level cloud conditions and a substantial amount of liquid water were selected from the database, namely 23 September 2001 (D1) and 21 May 2003 (D2). The large-scale flow dynamics and the contrast in atmospheric circulation regimes between these two days are well represented by the models.
For the two cases discussed in this paper, the Méso-NH and MM5 models represent the vertical extent of the shallow clouds in correspondence with the measurements. In LM, a few convective cells reach levels that are too high on D1. The climate models generally overestimate the cloud thickness. In LM, RACMO2 and RCA, the inversion is too weak, too close to the surface or develops too late during the day. This result indicates that the presence of a well-defined inversion is relevant for restricting the vertical extent of the boundary-layer clouds. A sensitivity study with RACMO2 shows that the height and strength of the inversion increase, the cloud depth decreases and the correspondence with the measurements improves when the vertical resolution is doubled.
A two-layer cloud system was observed on D2. LM reproduces this two-layer structure in specific humidity (q), potential temperature (θ) and liquid water content (LWC). The remaining models fail to represent the complex two-layer structure in q, θ and LWC, which is evidently very difficult to represent in atmospheric models.
Méso-NH and LM underestimate the lifetime of clouds (t_clouds), which is reflected in a patchy structure in the time versus height plots of LWC and a largely intermittent Liquid Water Path (LWP) time series. The latter is reflected in an underestimation of the modelled autocorrelation of the LWP time series. The climate models and MM5 overestimate t_clouds. There can be two causes for a bias in t_clouds. Firstly, when a cloud is considered frozen as it is advected over a site, a misrepresentation of its size or its speed can cause a bias in t_clouds at the site. Secondly, when the variability in the vertical velocity and updraft activity is overestimated, clouds can be generated and dissolved on too short time scales (dynamical effect). From a comparison between model output and cloud cover derived from the Moderate Resolution Imaging Spectrometer (MODIS), it was found that for all models, the frequency of occurrence of small clouds and the patchiness are underestimated (see also Schröder et al., 2006), which is consistent with Bryan et al. (2003). For frozen clouds, an overestimation of the cloud size would lead to an overestimation of t_clouds at the site over which the cloud passes. The underestimation of t_clouds, which was found in Méso-NH and LM, is therefore not due to advection of too small cloud systems; it is rather due to an overestimation of the variability in the vertical velocity. An overestimation of the horizontal wind speed of up to 30% might also contribute to the underestimation of t_clouds.
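The advective argument above can be put into numbers with a minimal sketch. It assumes the simplest frozen-cloud picture, in which the time a cloud spends over a site equals its horizontal size divided by the wind speed; the function names and the sample values are hypothetical.

```python
def residence_time_s(cloud_size_m, wind_speed_m_s):
    """Time (s) a frozen cloud of the given size spends over a fixed site."""
    return cloud_size_m / wind_speed_m_s


def relative_bias(size_factor, speed_factor):
    """Fractional bias in t_clouds for scaled cloud size and wind speed.

    A factor of 1.3 means that the model overestimates the quantity by 30%.
    """
    return size_factor / speed_factor - 1.0


# Hypothetical cloud of 5 km advected at 10 m/s.
print(residence_time_s(5000.0, 10.0))
# A 30% overestimation of the wind speed alone shortens t_clouds by about 23%.
print(relative_bias(1.0, 1.3))
# An overestimated cloud size alone lengthens t_clouds.
print(relative_bias(1.5, 1.0))
```

In this simplified picture, an overestimated cloud size can only lengthen t_clouds, so the underestimation found in Méso-NH and LM has to come from the wind speed or from the dynamical effect.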
On D2, the atmosphere is colder and contains less water vapour than on D1, but it contains, on average, more liquid water. The models are able to represent the contrast in IWV between D1 and D2, but not the contrast in LWP. A sensitivity integration with RCA using a smaller value for the threshold for autoconversion from cloud water to rain (q_crit) shows that changes in this parameter can largely improve the mean LWP, which is overestimated in most models. This improvement takes place, however, at the expense of excessive drizzle.
Mixing of moisture to higher atmospheric levels is underestimated in all models, which is reflected in an overestimation of the vertical gradient in the specific humidity. Since this bias occurs in all models, it is likely that the turbulence schemes are not active enough in transporting the moisture upwards during these shallow cloud conditions. The introduction of a shallow convection scheme in the non-hydrostatic models enhances the mixing of heat and moisture in the boundary layer. As a result, the explicitly resolved updrafts weaken, and the condensation of water vapour and the LWP decrease. The temporal variability and lifetime of clouds are hardly affected. Apart from a slight improvement in the q-profiles, no clear improvement in the behaviour of any of the three non-hydrostatic models is found when SHC is implemented for these two shallow convection cases.
Our work confirms that difficulties remain in representing shallow low-level water clouds in atmospheric models. The evaluation has pointed to weaknesses in the representation of shallow clouds that need to be investigated in the future, for example: (i) the relation between the temperature inversion strength and the vertical extent of clouds, (ii) the lifetime of clouds, and (iii) the horizontal scale of the cloud systems.
This paper has provided an overview of the BBC cases, one of the cases of the World Meteorological Organization cloud modelling workshop 2004. The description in this paper of the synoptic situation and of the temporal evolution and spatial variability of shallow clouds, focusing on the CESAR site, is complemented by a detailed comparison between several cloud parameters, observed from space and modelled by the same regional models as discussed here (Schröder et al., 2006). We have given an overview of the data set and an example of how this data set can be used for model evaluation. This paper can therefore serve as a reference for using the BBC data set for the evaluation of atmospheric models.
For an atmospheric model it is not feasible (nor relevant) to forecast an individual cumulus cloud at the exact location and time. However, the model needs to be able to describe the statistical properties of the field. With this approach of model evaluation, in which measurements of only two specific days are compared, no systematic biases in the models can be identified. A long-term evaluation (several months) is desirable to identify possible systematic deficiencies. This study is a first step in the direction of long-term evaluation of atmospheric models using extensive remote sensing data of cloud properties and vertical structure from observational campaigns like BBC.

Fig. 2. Mean sea level pressure in the operational analyses of ECMWF at 12 UTC for (a) 23 September 2001 and (b) 21 May 2003. The cross indicates the site Cabauw.
Fig. 3. Cloud cover over the Netherlands on 23 September 2001 (a) as retrieved from the MODIS overpass at 10:45 UTC and simulated at 11:00 UTC by (b) LM, (c) Méso-NH, (d) MM5, (e) RACMO2 and (g) RCA. The location of Cabauw is indicated by the arrowhead.

Fig. 4. Cloud cover over the Netherlands on 21 May 2003 (a) as retrieved from the MODIS overpass at 10:05 UTC and simulated at 10:00 UTC by (b) LM, (c) Méso-NH, (d) MM5, (e) RACMO2 and (f) RCA. The location of Cabauw is indicated by the arrowhead.

Fig. 5. Liquid water content (g kg−1) in the lowest 5 km of the atmosphere on 23 September 2001 (a) as derived from measurements with the Integrated Profiling Technique (IPT) using an averaging time of 15 min, together with output of the non-hydrostatic models (b) LM, (c, d) Méso-NH, and (e, f) MM5. For Méso-NH and MM5, both the model values at the grid box closest to Cabauw (c, e) and the average over an area of 18 km × 18 km surrounding Cabauw (d, f) are plotted. The lower panels show (g) IPT-derived LWC using an averaging time of 30 min together with output of the climate models (h) RACMO2, (i) RCA and (j) RACMO2 employing double vertical resolution. Liquid water below a threshold of 10−2 g kg−1 is ignored. The grey bar in the IPT plots indicates the time when no information was available during more than 90% of the interval considered due to failure of one of the instruments or due to inconsistencies between the different instruments. The black bar in the upper part of the graphs indicates the occurrence of precipitation. For this, use is made of the rain shutter mounted on the radiometer. In the models, a threshold of 0.4 mm h−1 is used, below which the precipitation is not indicated.
using a time separation of Δt and $\overline{\mathrm{LWP}'^2}$ is the variance. As an example, R_a is shown as a function of Δt for D1 for one of the non-hydrostatic models (Méso-NH) and for one of the climate models (RACMO2) together with the observations from the radiometer (Fig. 8). Note that the aggregation time (15 min or 30 min), which is used to transform the observed time series to grid box mean values, has a minor effect on the autocorrelation function. The time separation (Δt) at which the signal goes to zero is an indication of the time period for which clouds typically exist and thus an indication of the lifetime of clouds. It is clear that the lifetime of the

Fig. 8. The autocorrelation function (R_a), defined as the autocovariance using a time separation of Δt divided by the variance. R_a is plotted as a function of Δt for the observed LWP time series using an aggregation time of 15 min (thick solid line), Méso-NH (thin solid line), the observed LWP time series using an aggregation time of 30 min (thick dashed line) and RACMO2 (thin dashed line). The grey area shows the integral of R_a over time separations Δt up to 1.25 h, which is introduced as a measure for the intermittency of LWP and the lifetime of clouds. This integral is referred to as R_i.
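The definitions of R_a and R_i given in this caption can be sketched as follows. The sketch assumes an evenly sampled LWP time series, a sample autocovariance averaged over the overlapping part of the series, and trapezoidal integration; these estimator details and the function names are assumptions and are not taken from the paper.

```python
import numpy as np


def autocorrelation(lwp, max_lag):
    """R_a at lags 0..max_lag: autocovariance divided by the variance."""
    x = np.asarray(lwp, dtype=float)
    anomalies = x - x.mean()
    variance = np.mean(anomalies ** 2)
    n = x.size
    r = np.empty(max_lag + 1)
    for lag in range(max_lag + 1):
        # Mean product of anomalies separated by `lag` samples.
        r[lag] = np.mean(anomalies[: n - lag] * anomalies[lag:]) / variance
    return r


def integrated_autocorrelation(lwp, dt_hours, max_sep_hours=1.25):
    """R_i: integral of R_a over time separations up to max_sep_hours."""
    max_lag = int(max_sep_hours / dt_hours)  # truncated to whole lags
    r = autocorrelation(lwp, max_lag)
    return float(np.sum(0.5 * (r[1:] + r[:-1])) * dt_hours)


# Hypothetical 15-min series: a persistent signal and an intermittent one.
t_hours = np.arange(96) * 0.25
persistent = 100.0 + 50.0 * np.sin(2.0 * np.pi * t_hours / 24.0)
intermittent = 100.0 + 50.0 * (-1.0) ** np.arange(96)
print(integrated_autocorrelation(persistent, 0.25))
print(integrated_autocorrelation(intermittent, 0.25))
```

The persistent series gives a clearly larger R_i than the intermittent one, in line with the use of R_i as a measure of the intermittency of LWP and the lifetime of clouds.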

Fig. 13. (a) Profiles of the potential temperature (θ) on 23 September 2001 at 8:58 UTC (solid lines) and 14:43 UTC (dashed lines), calculated using the control MM5 model (CTL; thick lines) and the MM5 model in which shallow convection was parameterised (SHC; thin lines). (b) The pressure of the temperature inversion at 8:58 UTC (□) and at 14:43 UTC (◊) and the pressure of the humidity inversion at 8:58 UTC (+) and at 14:43 UTC (×) for the non-hydrostatic models with and without the SHC implemented. (c) As Fig. 5b but with SHC implemented in LM.

Table 2
Comparison of radiosonde profiles of wind speed (v) and wind direction (φ) with model output for 8:58 UTC and 14:43 UTC on 23 September 2001

Table 4
Statistics of the Liquid Water Path (LWP), the accumulated precipitation over 24 h (P), the Integrated Water Vapour (IWV) and the Bowen ratio (γ). Given are the mean value, the standard deviation normalized by the mean (σ_N) and the integral of the autocorrelation function over time separations Δt up to 1.25 h (R_i). The values in bold refer to the integrations with the shallow convection scheme switched on. For γ, both the mean value over the time period during which observations were available (first value) and the daily mean value (second value) are given for 21 May 2003. See text for further explanation.