Bias correction of dynamically downscaled precipitation to compute soil water deficit for explaining year-to-year variation of tree growth over northeastern France

This paper documents the accuracy of a post-correction method applied to precipitation regionalized by the Weather Research and Forecasting (WRF) Regional Climate Model (RCM) for improving simulated rainfall and feeding impact studies. The WRF simulation covers Burgundy (northeastern France) at a 8-km resolution and over a 20-year long period (1989–2008). Previous results show a strong deficiency of the WRF model for simulating precipitation, especially when convective processes are involved. In order to reduce such biases, a Quantile Mapping (QM) method is applied to WRF-simulated precipitation using the mesoscale atmospheric analyses system SAFRAN («Système d’Analyse Fournissant des Renseignements Adaptés à la Nivologie») that provides precipitation data at an 8 km resolution. Raw and post-corrected model outputs are next used to compute the soil water balance of 30 Douglas-fir and 57 common Beech stands across Burgundy, for which radial growth data are available. Results show that the QM method succeeds at reducing the model’s wet biases in spring and summer. Significant improvements are also noted for rainfall seasonality and interannual variability, as well as its spatial distribution. Based on both raw and post-corrected rainfall time series, a Soil Water Deficit Index (SWDI) is next computed as the sum of the daily deviations between the relative extractible water and a critical value of 40% below which the low soil water content induce stomatal regulation. Post-correcting WRF precipitation does not significantly improve the simulation of the SWDI upon the raw (uncorrected) model outputs. Two characteristic years were diagnosed to explain this unexpected lack of improvement. Although the QM method allows producing realistic precipitation amounts, it does not correct the timing errors produced by the climate model, which is yet a major issue to obtain reliable estimators of local-scale bioclimatic conditions for impact studies. A realistic temporality of simulated precipitation is thus required before using any systematic post-correction method for appropriate climate impact assessment over temperate forests. © 2016 Elsevier B.V. All rights reserved.


Introduction
Numerous studies highlighted that extreme or recurrent drought events, especially when combined to a heat wave, constitute one of the main climate hazard causing dieback in temperate symptoms such as abnormal coloration and needle loss as direct responses, and delayed symptoms like abnormal tree mortality (Sergent et al., 2012). Altered coloration of needles was caused by warm surface temperatures, occurring after a long period of reduced tree transpiration due to stomatal closure, induced by a very early soil water deficit (Bréda et al., 2006). A reduction of the net primary productivity and radial growth of common Beech trees was also reported after the summer 2003 heat wave over northeastern France (Granier et al., 2007).
In the context of anthropogenic climate change, an increase of the exposure to soil water deficit is expected in the Northern Hemisphere, especially in spring and summer, that is, the main vegetative period (IPCC, 2007). Over northeastern France, climate projections using the Weather Research and Forecasting (WRF) regional climate model and the Intergovernmental Panel on Climate Change (IPCC) A2-Special Report on Emissions Scenario (SRES-A2 - Meehl et al., 2007) have shown a warming of up to 3 • C for 2030-2050 and 5 • C by the end of the century (Xu et al., 2012). Over Burgundy, the mean annual temperature for the year 2003 was about 3 • C higher than the 1970-2000 climatology (Xu et al., 2012). Thus, the 2003 episode can be seen as an extreme event under current climate, and is an example of possible forthcoming forest-impacting hazards in a near future (Bréda et al., 2006;Betsch et al., 2010). Regional adaptation to global change through mitigation and adaptation of sylvicultural strategies is a key issue for the agro-forestry community. Future drought episodes inducing recurrent diebacks may reduce forest productivity and affect species distribution (Parmesan and Yohe, 2003;Mueller et al., 2005).
As a region extensively covered by forests (over 30% of the land area), Burgundy is particularly sensitive to such environmental changes. Besides oaks, Douglas-fir and common Beech are two highly represented species. The radial growth of these species is highly driven by soil water deficit (Lebourgeois et al., 2005;Betsch et al., 2010;Sergent et al., 2012), indicative of the intensity of drought episodes. Future climate changes are therefore a potential threat for the productivity of these forest stands.
During the last decade, numerous studies attempted to document the future of French forests potential distribution, productivity and risk of dieback. But these depend on ecological parameters such as soil water availability, which show large variations at both the local and the regional scales . Impact models often need input climatic data at a resolution finer than 10 km and require thus a downscaling step of the climate information provided by the global climate models (Boé et al., 2007). Previous works proposed projections of forest potential areas using niche-based models (Pyatt et al., 2001;Piedallu et al., 2009;Badeau et al., 2010) based on statistical downscaling of climate data (Hewitson and Crane, 2006;Solomon et al., 2007;Christensen et al., 2007;Ning et al., 2012a,b). Such statistical downscaling considers empirical relationships linking the large-scale atmospheric variables to local or regional-scale variability and do not take into account auto-correlation between climatic variables (von Storch and Zwiers, 1999;Wilks, 2006).
In this study, we propose to use a daily lumped water balance model fed with climate information derived from a RCM, in order to calculate soil water deficit and its relationship to tree growth. Physiological indices, such as the Soil Water Deficit Index (SWDI) are used rather than climatic data to allow biological interpretations of the growth response to climate. These indices could be better correlated with radial growth than monthly climatic variables because soil water availability is a major limiting factor for tree growth (Michelot et al., 2012). Furthermore, using impact models as an additional source of information for evaluating climate models or post-correction methods can provide many benefits, but the use of impact indicators to test them is infrequent (Stéfanon et al., 2015;Boulard et al., 2015).
Among the process-based models, the daily water balance model Biljou © (Granier et al., 1999) is dedicated to forest stands, and allows calculating elementary water fluxes (tree transpiration, understorey evapotranspiration, rainfall interception, drainage) as well as the daily soil water content under forests. The model quantifies the intensity, the duration and the starting date of drought experienced by the stand. It has already been successfully applied to different stands (Bréda et al., 2006;Gandois et al., 2010;Van der Heijden et al., 2011;Michelot et al., 2012) and over parts of Burgundy (Sergent et al., 2012;Van der Heijden et al., 2013;Boulard et al., 2015). Requested climatic data to compute the water balance include 2-m air temperature and relative humidity, 2-m wind speed, solar radiation and precipitation.
With the aim of obtaining more reliable estimations of these variables close to the sites of interest, coarse global reanalyses data is downscaled at a sensibly higher resolution using the nonhydrostatic WRF model. Although RCMs are powerful tools for describing regional and local scale climatic conditions, they still feature systematic errors, and small-scale patterns of daily precipitation are highly dependent on the model resolution and physical parameterizations (Giorgi and Mearns, 1991;Laprise, 2008). Biased representation of precipitation intensities and associated temporal and spatial variability often prevent RCM precipitation outputs to be directly used for climate change impact assessment (Fowler et al., 2007;Maraun et al., 2010). A previous analysis documenting the capability of the WRF model to regionalize near-surface atmospheric variables over Burgundy concluded on good skills for simulating the first four aforementioned variables (Boulard et al., 2015), but a clear tendency to over-estimate precipitation amounts, especially those of convective nature (Marteau et al., 2014).
In order to obtain reliable estimators of local-scale bioclimatic conditions, precipitation biases need thus to be post-corrected. Among the empirical-statistical downscaling and error correction methods, the Quantile-Mapping (QM) method (Piani et al., 2010;Heinrich and Gobiet, 2011;Themeßl et al., 2011Themeßl et al., , 2012Gudmundsson et al., 2012;Maraun, 2013) adjusts the distribution of modelled data to observed ones and generally shows high efficiency, particularly for high quantiles (Themeßl et al., 2011).
This study aims thus at (i) documenting how accurately the QM method corrects the WRF precipitation errors, critical for soil water deficit computations; (ii) evaluating the relevance of post-corrected WRF outputs to be used as climatic input data for environmental impact assessment. This is achieved by forcing a forest water balance model that computes a SWDI over the Burgundy region. In order to evaluate the performance of post-corrected precipitation, comparisons are done between water balances of 30 Douglas-fir and 57 common Beech stands computed from both raw and post-corrected WRF outputs, as well as mesoscale SAFRAN («Système d'Analyse Fournissant des Renseignements Adaptés à la Nivologie») atmospheric analyses. SAFRAN analyses (Quintana-Seguí et al., 2008) produce surface atmospheric variables covering France on a regular grid at an 8 km resolution using observations from the automatic, synoptic and climatologic Météo-France station networks, and ECMWF reanalyses (Szczypta et al., 2011). This paper is organized as follows. Section 2 presents the data used and the experimental setup for WRF simulations. Section 3 evaluates the improvement of the post-corrected precipitation upon the raw WRF precipitation. Section 4 focuses on the computation of the water balance, including the Relative Extractable Water (REW) and the Soil Water Deficit Index (SWDI) of the Douglas-fir and common Beech stands, and their relationship to observed radial growth indexes. Results are finally discussed and summarized in Sections 5 and 6 respectively.

Study area
Burgundy is a region located in northeastern France. It is composed by the Yonne, Nièvre, Côte d'Or and Saône-et-Loire departments (Fig. 1a -D03) and covers an area of 31,528 km 2 . It is characterized by an undulated topography, surrounded by the Jura and Alps massifs on the East, and the Massif Central on the Southwest. The topography is mainly characterized by two alluvial plains in the Northwest (Paris basin) and in the Southeast (Saône tectonic trough), rolling hills (maximal elevation of 901 m in the Morvan massif - Fig. 1b) and plateaux running from North to South across the central parts of the region. In 2013, the highly-fragmented land-use was dominated by pastures and croplands (19,000 km 2 , 60%), forests (10,000 km 2 , 31%) and vineyards (3000 km 2 , 1%). The climate of Burgundy is predominantly semi-continental with relatively short, warm summers and cool winters. The annual mean temperature over the period 1989-2008, averaged over the whole region, reaches 11.1 • C. January is the coldest month, with monthly mean temperature reaching 3.8 • C. July presents the warmest monthly mean temperature (19.9 • C). Precipitation is distributed more or less uniformly over the year with two slightly wetter periods occurring in October-November (Autumn) and May (Spring), while March-April and July-August are the driest periods. In terms of processes, convective precipitation predominates between May and September, while stratiform precipitation is prevalent from October to April. Over the period 1989-2008, the average annual precipitation amount reaches 875 mm.

Plot location and ecological data collection
Douglas-fir (Pseudotsuga menziesii) and common Beech (Fagus sylvatica) are the two tree species targeted here due to the high sensitivity of their radial growth to the climatic conditions and soil water deficit events. Field observation and tree coring were performed in March and April 2009 for thirty pure planted stands of Douglas-fir (Sergent et al., 2012), aged of a minimum of 20 years and located on homogenous soils and topography (Fig. 1c). The same sampling strategy was applied for fifty-seven common Beech stands in April and May 2013 (Fig. 1b -Asse, 2013). At each site, a dendroecological plot with a radius of 15 m (700 m 2 ) was established, avoiding edge and gaps.
Diameter at breast height, dominant height, and crown condition were also recorded to characterise the dendrometrical characteristics of each stand. Dominant height was measured on the basis of three of the five trees with the greatest diameter at breast height. 15 trees per Douglas-fir stand and 10 trees per common Beech stand were cored to the pith to evaluate tree age and calculate an annual radial growth index. Basal area increment was computed from ring width and used to characterise radial growth. The mean annual basal area increment was calculated for each plot. To allow growth comparison between plots of different ages, all series were normalized. Final chronologies at the plot level were prepared by averaging the annual residuals to yield a growth index (GI) expressed as a percent of the expected growth under average conditions: see Sergent et al. (2012) for more details.

Climatic data
The present study uses the Advanced Research WRF model (ARW/WRF, WRF hereafter, Skamarock et al., 2008) version 3.1.1. WRF is a non-hydrostatic model, suitable for simulating a wide range of scales, from thousands of kilometres to a few meters, with a large number of available options in what concerns the model core and physical parameterizations, making it appropriate for numerical prediction and climate simulation. WRF was setup with 3 two-way nested domains with respectively, 120.0, 32.5 and 8.2 km horizontal grid spacing and 28 sigma levels on the vertical (Fig. 1a). Lateral forcing is provided every 6-h by European Center for Medium range Weather Forecast (ECMWF) ERA-Interim reanalyses (Berrisford et al., 2009;Dee et al., 2011), from 1000 to 10 hPa (18 vertical levels) at a 1.5 • horizontal resolution. ERA-Interim reproduces observed climate more realistically than other reanalysis products such as ERA-40 or National Centers for Environmental Prediction/National Center for Atmospheric Research (NCEP/NCAR) (Mooney et al., 2010), a result supported over France by the recent verifications by Szczypta et al. (2011). The ERA-Interim reanalyses are generated by an a posteriori integration of the Integrated Forecasting System atmospheric GCM with 6 hourly four-dimensional assimilations of satellite data, buoys and radiosondes, at a T255 spectral truncation with 60 vertical levels.
WRF integration time-step is fixed at 150 s and data are archived every 6 h over 1989-2008, after a 1-year-long spin-up. The 3rd inner domain extending from 45.91 • N to 48.52 • N and 2.58 • E to 5.62 • E, covers Burgundy and adjacent territories, at an 8.2 km horizontal resolution (58 × 67 grid points). A similar domain has been successfully used in several previous modelling studies of the climate variability of Burgundy (Castel et al., 2010;Marteau et al., 2014), and a climate change regionalization exercise (Xu et al., 2012). The physical package includes the Kain-Frisch cumulus scheme (Kain, 2004), the Morrison et al. (2009) scheme for cloud microphysics, and the Yonsei University planetary boundary layer (Hong et al., 2006). Radiative transfers are parameterized with the Rapid Radiative Transfer Model Scheme (Mlawer et al., 1997) for long-waves and the Dudhia (1989) scheme for short waves. Surface data are taken from the Moderate Resolution Imaging Spectroradiometer (MODIS, Friedl et al., 2002) databases, which comprise a 20-category land-use index. Over landmasses, WRF is coupled with the 4-layer NOAH land surface model (Chen and Dudhia, 2001). Soil moisture, which is one of the key parameters that control surface energy partition and water budget, is initialized from the ERA-Interim data. The four (10, 30, 60, and 100 cm) layers of the NOAH model are interpolated from the four ERA-Interim layers (7, 28, 100 and 255 cm depth) of soil moisture and temperature data. Sea surface temperatures are prescribed every 6 h by linear interpolation of monthly ERA-Interim sea surface temperatures. A lateral buffer zone used to smooth the relaxation between the model and the prescribed atmospheric forcing is made of five grid points (1 grid point of forcing plus 4 grid points of relaxation).
The assessment of the WRF simulation errors is performed by cell-to-cell comparison with the gridded high-resolution SAFRAN analyses after a simple nearest neighbour resampling. SAFRAN analyses are made on 615 climatically homogenous zones taking topography effects into account. Vertical profiles of temperature, humidity, wind speed and cloudiness are interpolated at the hourly time-step by combining an optimal interpolation every 6 h, and a variational interpolation over 6 h windows (Durand et al., 2009). Precipitation analysis is performed daily. The solar radiation is calculated using a radiative transfer scheme (Vautard et al., 2013). Quintana-Seguí et al. (2008) and Vidal et al. (2010) already assessed the quality of SAFRAN analyses over France and highlighted that the increasing number of ground observations improves its robustness over time, especially for 2-m air temperature and precipitation. Relative humidity, solar radiation and wind speed are the variables most affected by the scarcity of observations at the regional scale (Boulard et al., 2015). A detailed analysis of WRF climatic variables against observed or high-resolution reanalysed data (e.g., SAFRAN) was performed in Boulard et al. (2015). Variables implied in the computation of potential evapotranspiration were found to be quite well simulated by WRF, while larger biases were obtained for precipitation, which are further discussed in the present study.

Soil water deficit
Soil water deficit was computed using the daily lumped water balance model for forests Biljou © (Granier et al., 1999). Biljou © (https://appgeodb.nancy.inra.fr/biljou/) is a process-based model, which calculates water fluxes (interception, transpiration, actual evapotranspiration, drainage) and soil water content at a daily time-step. The model computes reference Evapo-Transpiration (ET 0 ) using the Penman (1948) formula, with forest albedo. For each stand, Biljou © requires soil parameters (maximum extractable water, bulk density, and water content at permanent wilting point for each soil layer) and stand parameters (leaf area index, fine root distribution). A soil pit was dug to describe the soil profile (depth, texture, coarse element fraction, fine roots distribution in each horizon) to calculate extractable soil water using the soil texture coefficient and the coarse element fraction of each layer (Wösten and van Genuchten, 1988) for Douglas-fir stands (Sergent et al., 2012). No soil pit being dug to describe the soil profile for common Beech stands, Biljou © was parameterized for broadleaves species with an average 6 m 2 .m −2 leaf area index and the soil water deficit indexes were calculated for a calcosol soil type with a soil layer of 104 mm extractable water (Bréda et al., 2010 in Brisson andLevrault, 2010) close to the average value of the 57 common Beech stands soil layer. The budburst day was fixed at the 100th Julian day and the leaf fall day was fixed at the 300th Julian day (Lebourgeois et al., 2006).
Water stress is assumed to occur when relative extractable soil water (REW) drops below a threshold of 40% under which transpiration is gradually reduced due to stomatal closure (Granier et al., 1999).
The REW is computed following the equation: Where: EW is the Extractable Water: EW = W − W m ; EW M is the Maximum Extractable Water: EW M = W F − W m ; W is the available soil water; W m is the minimum soil water (i.e. lower limit of water availability); W F is the soil water content at field capacity. The SWDI (dimensionless), which cumulates the difference between REW and the critical REW (REW c ) at which tree transpiration begins to decrease, is computed following the equation: Where: Day-by-day computation of soil water content includes temporal autocorrelation, as REW is never reinitialised. The soil water deficit index calculated by Biljou © corresponds to the sum of the daily deviations between the relative extractable water and a critical value of 40%. See Granier et al. (1999) for more details.
For each Douglas-fir and common Beech stand, the soil water balance was computed using daily WRF data (global radiation, wind speed, air temperature, relative humidity to compute ET 0 -Penman, and rainfall) and compared to that obtained from SAFRAN, used as a reference input data. Annual soil water deficits obtained from both data sets were related to mean annual growth index using a linear regression for the period 1989-2008.
For precipitation, the Empirical Cumulative Density Function (ecdf) of a control simulation is first matched with the ecdf of the observations (here SAFRAN data), generating a correction function depending on the quantiles. Since the biases in the WRF simulations over Burgundy are mostly found in spring and summer, from May to August, due to a large over-estimation of convective precipitation (Marteau et al., 2014), the QM is applied over the summer convective season, independently for each month, on a daily basis (t) and for each grid cell (i) separately. It results in a corrected time series Y cor in Eq. (4) using a correction function (CF) defined in Eq. (5) (Amengual et al., 2012).
CF represents the difference between the observed (obs) and the modelled (mod) inverse ecdf (ecdf −1 ) for the respective day of the year (doy) in the calibration period (cal) at probability P. P is obtained by relating the raw climate WRF output X raw to the corresponding ecdf in the calibration period. A difficulty arises for precipitation since RCMs tend to overestimate the number of days resulting in trace values, but also to underestimate the number of non-rainy days, thus resulting in an unrealistic probability of precipitation in the simulations (Amengual et al., 2012). To overcome this issue, and following Amengual et al. (2012), an additional constraint is imposed: the ratio of non-rainy days between predicted and control simulated raw data is maintained for the calibrated versus observed series. This is done by transposing differences between predicted and control precipitation quantile before its implementation over corrected series. However, three main limitations still remain: (i) the temporal ans spatial autocorrelations of the series are not corrected, while they are known to be biased (Boé et al., 2007); (ii) correction is only performed for precipitation, which is not independent of biases in temperature or solar radiation; and (iii) the energy and water budget in the simulations is no more conservative.
Following the same protocol, the QM method is applied through a temporal jack-knife cross-validation framework, which repeatedly divides the data period into a calibration (19 years - Fig. 2a) and independent validation period (1 year - Fig. 2b). Each year is estimated and evaluated independently with the remaining 19 years used for model calibration ("leave-one-out" method). The crossvalidation is useful to estimate the efficiency of the QM method (Fig. 3).

Post-correction of the precipitation biases
The aim of this section is to evaluate the capability of the QM method to reduce biases and produce an accurate estimation of precipitation before using any impact model. In order to assess the applicability of the QM post-correction to a climate simulation setup, comparisons are performed between WRF simulation and SAFRAN precipitation, the latter presenting small discrepancies with observations (Boulard et al., 2015) over the period 1989-2008.
Detailed analysis of the raw WRF capability to simulate precipitation over Burgundy is given in Marteau et al. (2014) and Boulard et al. (2015). Boulard et al. (2015) show an over-correction of the ERA-Interim dry bias in annual precipitation amounts (from 800 mm (ERA-Interim) to 997 mm (raw WRF) against 870 mm (SAFRAN)), but a rather realistic spatial distribution in the WRF model (Fig. 4a). Too wet conditions appear over most parts of the region, especially the southwestern side of the Morvan massif located under the direct influence of the dominant (westerly and south-westerly) winds (Fig. 4d). The overestimation of annual precipitation amounts does not exceed 14% on average, with largest differences found during the spring and summer seasons, when convective processes are predominant. The non-stationarity of these biases in space and time (larger during the vegetative season) makes it necessary to apply a post-correction on simulated precipitation, before using them to feed impact studies. Fig. 2a shows a quantile-quantile (Q-Q) plot of the raw simulated daily precipitation against the SAFRAN daily precipitation average over the domain for the 20-calibration set of 19 years. A Q-Q plot of the corrected precipitation against SAFRAN for each of the 20 years is used for validation. For each calibration set (Fig. 2a), uncorrected WRF tends to slightly overestimate SAFRAN precipitation. This overestimation is present for almost all rainy events. The QM method applied to every validation year (Fig. 2b) properly adjusts the intensities of light (from 1 to 5 mm) and medium precipitation events (from 5 to 15 mm) but tends to slightly over-correct the high precipitation events (>15 mm) for almost all years, mostly due to the high interquantile range for extreme precipitation events according to SAFRAN. However, these residual errors are generally smaller than those simulated by the model. The cumulative distribution function (Fig. 3) further shows that the QM method highly improves simulated precipitation and reduces the overall overestimation, with the cross-validated corrected precipitation accurately fitting the SAFRAN distribution.
Post-corrected WRF (Fig. 4b) shows realistic precipitation amounts and spatial distribution. As for SAFRAN (Fig. 4c), largest amounts are located over the western slopes of the Morvan massif (annual mean precipitation >1500 mm), and minimum amounts over the surrounding plains (<700 mm). The post-correction of WRF precipitation strongly reduced the differences between WRF and SAFRAN throughout Burgundy (compare Fig. 4e with Fig. 4d). Over the whole domain, the Root Mean Square Error (RMSE) between SAFRAN and raw WRF annual precipitation is reduced from 328 mm to 177 mm when working on post-corrected WRF precipitation, but some local differences remain. A wet (though reduced) bias (100-200 mm) is still noted over the windward side of the Morvan, and some dry biases (about 100 mm) are found over localized areas to the east of the Morvan and parts of the surrounding plains. These dry biases suggest an over-correction of the lowest precipitation amounts prevailing in the grid points located over the plains (Fig. 4e). Even if some overestimation remains for the wettest grid points, a general improvement is clearly noticeable for the high precipitation amounts (Fig. 4f).
Post-corrected WRF also produces satisfactory results when considering the mean annual cycle ( Fig. 5a; note that cycles were filtered using a 31-day running mean for readability). Post-corrected WRF presents a moderate but slightly increased co-variability with SAFRAN (r = 0.54 against r = 0.48 for raw simulations). The correction also strongly reduces the average daily precipitation differences (Table 1) from −0.35 mm to −0.01 mm. Raw WRF shows much more skill to reproduce precipitation amounts in winter, mostly dominated by large-scale stratiform precipitation and hence produced by the model's microphysics, than during the convective (summer) period, during which the cumulus scheme is of primary importance in the model. The largest differences found in spring and summer (May-August) are highly reduced by the QM correc-tion (Table 1). A moderate underestimation (0.5 mm day −1 ) even appears in August and September. At the interannual timescale, corrected annual precipitation amounts (Fig. 5b) also show a strong co-variability with SAFRAN (r = 0.89). Post-corrected WRF standard deviation (115.1 mm) is much closer to SAFRAN (111.5 mm) than raw WRF (132.6 mm). The systematic overestimation of the annual precipitation amounts (127.5 mm) is dramatically reduced (−7.1 mm) with the QM method. The very dry years of 1991, 2003 and to a lesser extent 2005, which are too wet in the model due to the overestimation of convective precipitation, are now close to SAFRAN. Due to the post-correction, lower precipitation amounts appear for years where raw WRF was closer to the SAFRAN precipitation amounts, but the differences do not exceed −97.5 mm. Local interannual correlations (Fig. 5c) show that post-corrected WRF performs rather well (correlations ranging from 0.57 to 0.89) with an average value of r = 0.77 over the region (against r = 0.70 when working on raw WRF precipitation).
The geography of annual precipitation amounts is accurately reproduced during all the years of the period (Fig. 6a). Spatial correlations between post-corrected WRF and SAFRAN range between 0.87 and 0.95 (in 2003 and 1990, respectively) against 0.64 and 0.86 when working on raw WRF data. These systematically higher spatial correlations indicate that the QM method also improves the spatial distribution of precipitation. When working on annual anomalies however (e.g. after removal of the climatological mean field), WRF produces less satisfactory results (Fig. 6b). Spatial correlations are lower and even negative 4 years out of 20, denoting a lower skill for simulating departures from the climatology, and thus climate interannual variability (and associated effects on local precipitation). The QM method only slightly improves the simula- tion of annual anomalies. However, the RMSE statistics calculated over Burgundy shows a clear improvement upon the raw WRF data for 19 years out of 20.
Section 3 illustrates the usefulness of the QM method to decrease WRF-simulated precipitation errors. In order to use WRF-simulated precipitation to feed impact models, an accurate seasonal cycle as well as a good interannual variability is required. The QM method (applied over the convective/vegetative period over Burgundy) presents a high improvement of both the seasonality and interannual variability. Especially, dry years like 1991, 2003 Fig. 4. a: Annual mean precipitation amounts (mm) climatology over the period 1989-2008 according to raw WRF. b: As (a) but for post-corrected WRF. c: As a but for SAFRAN. d: raw WRF differences against SAFRAN, period 1989-2008. e: As d but for post-corrected WRF. Only differences that are significant at 95% according to a t test are presented. f: scatter-plots of annual mean precipitation amounts for 1989-2008 at all grid-points of the studied area: raw WRF vs. SAFRAN (yellow) and post-corrected WRF vs. SAFRAN (black). (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.) and 2005, are much more in accordance with observations. The following section assesses the applicability of post-corrected WRF outputs to compute soil water balance.

Computation of the water balance for coniferous and broadleaved forests
To assess the reliability of the WRF post-corrected precipitation, the daily soil water balance is computed using the process-based modelBiljou © (Granier et al., 1999). The raw WRF, post-corrected WRF and SAFRAN meteorological variables nearest to the 57 common beech stands (mostly located over northern Burgundy, see Fig. 1c) and the 30 Douglas-fir stands (mostly located over central and southern Burgundy, see Fig. 1d) are alternatively used as climatic forcings. Specific soil and stand parameters are used to compute the soil water balance of each individual plot.

Computation of the relative extractable water
Figs. 7a and 8a present the mean annual cycles of averaged Relative Extractable Water (REW) computed with the uncorrected and post-corrected WRF, as well as SAFRAN data, for the 30 Douglas-fir stands (Fig. 7a) and for the 57 common Beech stands (Fig. 8a) over 1989-2008. For Douglas-fir stands (Fig. 7a), WRF overestimates the REW from May to October. Post-corrected WRF data sensibly reduce the REW overestimation from spring to early July, but a noticeable underestimation is found from late July to September. For common Beech stands (Fig. 8a), results show lower discrepancies between the mean annual cycles derived from SAFRAN and raw WRF data. The post-correction leads to a strong underestimation of the REW from June to December.
Averaged over the period 1989-2008, common Beech stands are exposed to soil water deficit during 61 days. For every year, the post-correction leads to an increase of the number of days exposing stands to soil water deficit (from 64 days to 86 days). However, the post-correction improves the median date of the beginning of the soil water deficit period (post-corrected WRF: June 26th day; raw WRF: June 30th day against June 26th day according to SAFRAN). It also reduces the absolute difference of soil water deficit start date to 13 days against 20 days with raw WRF data. Consequently, the post-correction tends to slightly improve the beginning of the soil water deficit period. Although, it expands the soil water deficit duration, mostly over summer and fall.
For Douglas-fir stands, WRF presents a shorter duration of soil water deficit than SAFRAN (respectively 51 against 61 days). The post-correction generates a too long averaged duration (75 days) of soil water deficit. From year to year, the post-correction systematically increases this duration, reducing differences for years when the duration was highly underestimated, and increasing differences for the opposite case. It also slightly degrades the soil water deficit start day (From May 24th to June 1st against the May 26th according to SAFRAN) and does not improve the absolute difference of soil water deficit start date.
Summertime REW underestimations for both species are a major issue, suggesting that the post-correction induces too many drops below the 40% threshold, critical for soil water deficit. This may be explained by the weakness of the quantile mapping to accurately remove biases and to improve the timing of the precipitation. The WRF systematic overestimations of the cumulative annual precipitation is the result of the combination of at least the three following daily situations: (i) a slight non-observed rainfall event i.e. too much wet day, (ii) a large over-estimation of observed rainfall events and (iii) an under-estimation of observed rainfall events. The two first points are the main causes of the systematic overestimation by WRF. The wet day correction attempts to equalise the fraction of days with precipitation between the observed and the modelled data. The empirical probability of nonzero observations is found and the corresponding modelled value is selected as a threshold. Hence, all modelled values below this threshold are set to zero even if precipitations were really observed during this day. Due to both the over-estimation and the robust empirical quantile-quantile procedure used, the quantile mapping leads to an asymmetric correction that reduces more the strong rainfall events than they improve the under-estimated rainfall.
Meanwhile, SAFRAN slightly underestimates ET 0 mostly during the summer season, with a bias reaching 1 mm/day, a consequence of too weak wind speeds and solar radiation (Boulard et al., 2015). The combination of ET 0 errors in SAFRAN and the low or dry biases of the post-corrected WRF over northern Burgundy, where are located common Beech stands, contributes to explain this underestimation of the REW. Other potential contributing factors explaining the poor improvement of the model performance to reproduce the annual cycle of the soil water deficit will be discussed below.

Computation of the Soil Water Deficit Indexes (SWDI)
Since SWDI corresponds to the sum of the daily deviation between the REW and the critical value of 40%, it is both the timing and frequency of the 40% REW threshold crossings that determine the capability of the WRF model to produce a consistent SWDI. Figs. 7b and 8b present the annual averaged SWDI computed with raw WRF, post-corrected WRF and SAFRAN data averaged respectively for the 30 Douglas-fir and 57 Beech stands. For Douglas-fir stands (Fig. 7b), the differences between the REW computed from WRF and SAFRAN are relatively small (Fig. 7a), but strong errors in the SWDI nonetheless appear between the three datasets. The interannual variability of SWDI is poorly reproduced by WRF (r = 0.41 with SAFRAN). Post-correcting WRF rainfall results in a quasi-negligible improvement of the SWDI interannual variability (with correlations increasing to 0.45). The most noticeable  differences between raw and post-corrected WRF are found in the SWDI intensity, which is larger when selecting post-corrected precipitation. This result also holds for Beech stands (Fig. 8b). In accordance with the REW underestimation (Fig. 8a), the postcorrected precipitation generates a stronger SWDI than SAFRAN. The post-correction marginally increases the interannual correlations (from r = 0.47 to 0.48).
It is known that annual soil water deficit strongly reduces tree ring growth (Lebourgeois et al., 2005;Sergent et al., 2012) for both tree species. For common Beech, cumulated water shortages experienced by the trees between the beginning of the vegetative period and the end of July during both the current year and the previous year can explain up to 67% of the radial growth over the 1960-2012 period (Asse, 2013). The interannual variations in the growth index Fig. 9. a: Time-series of 1990 daily precipitation (mm) simulated by WRF (yellow curve) and SAFRAN (orange curve), averaged over the whole Burgundy. b: As a but for post-corrected WRF data (black curve). c: as a for 1996. d as b for 1996. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

Table 2
Correlations between annual SWDI and radial tree growth indexes over the period 1989-2008. SWDI is computed from SAFRAN (SAF), raw WRF (WRF), post-corrected WRF (WRFpc) precipitation data. GiD represents the Douglas-fir radial growth index. GiB represents the common Beech radial growth index for the year y and for the year y−1 .

SAF WRF WRFpc
(Section 2.4) are negatively correlated with interannual variations in the SWDI (Sergent et al., 2012). Figs. 7c and 8c present the interannual variations of the Douglas-fir (GiD - Fig. 7c) and common Beech (GiB - Fig. 8c) growth indexes and SWDI computed from the three climatic datasets. Interannual variations in GiD mirror the SWDI computed from SAFRAN, the correlation reaching r = −0.80 (Table 2). It is much weaker for WRF uncorrected data (r = −0.21).
The post-correction does not lead to a significant improvement (r = −0.30). Results do not show major differences when working on the GiB and soil water deficit indexes for the same year (Fig. 8c). GiB is highly correlated with the SWDI (Table 2 -r = −0.64). This relationship is much weaker with WRF data, either raw (r = −0.22) or post-corrected (r = −0.32). In both cases the correlations do not reach the 95% statistical significance level. WRF data show more encouraging results between the growth index and the previous year (y−1) SWDI (Table 2). Raw WRF data is actually quite skilful at reproducing the one-year lag correlation between GiB and SWDI (r = −0.47 for WRF SWDI against r = −0.53 for SAFRAN). The post-correction again slightly improves the correlations (r = −0.50).
Post-correcting WRF-simulated precipitation does not significantly improve the relationship between the tree growth and the soil water deficit indexes. Although it sensibly reduces biases in precipitation amounts, the quantile-mapping bias correction is not sufficient to allow for a realistic simulation of SWDI for both Douglas-fir and common Beech stands. For Douglas-fir stands, differences between the mean SWDI computed with WRF and SAFRAN (Fig. 7d) present a heterogeneous distribution with larger differences (ranging from 10 to 35%) over the northeastern and some of northwestern locations, on both sides of the Morvan massif, while smaller differences (<10%) are located in the center and southern part of the domain. Interannual variations of SWDI for the 30 Douglas-fir stands and using WRF data and SAFRAN as climatic input present an average correlation of 0.34 over the period 1989-2008, with higher and significant correlations found for stands located on the southern and windward side of the Morvan massif. Post-correcting WRF outputs leads to an overall improvement of the correlation (r = 0.39) with SAFRAN (Fig. 7e). Yet, this slightly increases the differences between mean SWDI for numerous stands (mostly concerning the southern and western stands) that presented small errors when using raw WRF outputs as a climate forcing, while it slightly reduces the differences for the northernmost stands. When working on Beech stands SWDI (Fig. 8d  and e), the post-correction slightly improves the correlation for almost every stand (r = 0.36 to r = 0.43) but generates an overall increase of the SWDI differences (reaching 40%) in the westernmost stands.

Discussion
The poor skill of WRF rainfall, even after a statistical biascorrection, to simulate a realistic summer soil water depletion is troublesome if one is to use these climate outputs in impact studies on tree growth, because radial growth limitation depends on the date of establishment of the summer water deficit. The present section examines the possible causes of this deficiency. Two main aspects should be considered: (i) heavy precipitation events, poorly reproduced by WRF while directly affecting the dura-tion of droughts episodes (Gómez-Navarro et al., 2014); and/or (ii) the timing and phasing of high-frequency (transient) precipitation variability, which could sensibly differ in WRF compared to observations (and probably the forcing reanalyses), and could result in unrealistic wet and dry sequences at synoptic and intraseasonal scales.
In order to evaluate the skill of the simulation in reproducing the timing and severity of heavy rainfall events, we first use a contingency table comparing the daily rainfall amounts synchronously between SAFRAN and WRF (Table 3). WRF tends to produce a quite realistic distribution of the classes of daily precipitation over the 1989-2008 period as a whole. For light (from 1 to 5 mm) daily precipitation intensities, WRF shows a good skill but produces fewer dry and drizzle days (<1 mm) compared to SAFRAN. The bias correction improves the latter, reducing the difference with SAFRAN from 339 to 196 days. Raw WRF also tends to produce too many medium (from 5 to 15 mm) and heavy precipitation days (>15 mm), some of them corresponding to dry or drizzle days in the SAFRAN times-series. This bias is strongly reduced after the post-correction. However, the opposite situation, i.e. dry conditions simulated by WRF under very wet conditions (top right corner in Table 3) is unchanged after the post-correction. Although these are not very common instances, these wet events may result, when occurring Table 3 Contingency table between raw WRF (rows -red), post-corrected WRF (rows -blue) and SAFRAN (columns) daily precipitation amounts (mm) data averaged over Burgundy and sorted by classes over the period 1989-2008 (7300 days e.g. in the middle of a dry spell during the warm season (as would be the case for an isolated storm), in increased soil moisture conditions. The absence of some of these events in WRF may result in a persistent soil water deficit, hence an overestimation of SWDI. The contingency table tends to show that WRF is capable to reproduce the severity of the events, especially after the post-correction, but still presents some difficulties to simulate a realistic timing. We examine and discuss next some of the factors that could cause the low WRF ability to reproduce the temporal distribution of rainfall through two case studies chosen as two years with contrasted performance in the simulation of precipitation. 1990 and 1996 were selected because they exhibit respectively the lowest (9 mm) and the highest (257 mm) annual precipitation amount differences between SAFRAN and WRF, and also the highest overestimation and underestimation of the SWDI over the forest stands.
In 1990 (Fig. 9a), WRF precipitation amount (864 mm) averaged over Burgundy is similar to SAFRAN (855 mm). The daily precipitation time-series are also highly correlated (r = 0.67). WRF performs well to reproduce wet spells and their intensity in winter, but shows as expected more difficulties in spring and summer. During the convective period, WRF overestimates rainfall amounts during most rainy days. Yet, some rainy days in SAFRAN are not simulated or are too dry in WRF (e.g. from June 15-20 and in late August). In 1996 (Fig. 9c), WRF strongly overestimates the annual precipitation amount (1079 mm against 882 mm for SAFRAN), due to a quasi-systematic overestimation of wet spells throughout the year. The daily precipitation times-series are significantly correlated (r = 0.65). However, WRF simulates too many rainy days, corresponding to dry days or light rainy days in SAFRAN, mainly occurring in summer. As 1990, WRF generally succeeds at simulating heavy rainy days in summer and fall, in spite of biased intensities. Fig. 10 shows that the failure of WRF is mainly attributable to its inability to correctly reproduce some rainfall events and their intensity, rather than a shift in the reproduction of dry and wet sequences by the model. In 1990, two major rainy events occur in SAFRAN (in early June and late August) allowing for a partial recovery of the water storage of forest stands ( Fig. 10a and b). These events are far too dry in WRF, leading to an underestimated REW and thus to biased SWDI duration and intensity.
For 1990, the QM method (Fig. 9b) produces a dry bias (-79 mm) in WRF, with the annual amount decreasing to 776 mm (against 855 mm for SAFRAN and 864 mm for raw WRF), suggesting for this year an obvious over-correction. The correlation between the simulated and SAFRAN daily time-series is barely modified (r = 0.69 instead of 0.67 for raw WRF). In 1996, the post-correction allows reducing the annual precipitation bias to 111 mm (vs. 197 mm) and leaves the daily correlation unchanged (r = 0.67, vs. 0.65 before correction). The post-correction tends to reduce rainy events differences over the whole year against SAFRAN but also leads to reduce the rainy events already underestimated by WRF (e.g. in early June). These changes fail at fully correcting the 1996 wet bias, which explains why the simulated dry spells during which the REW is below the 40% threshold remain too short ( Fig. 10c and d). For example, WRF generates rainy events during the middle of June, not found in the SAFRAN chronology, allowing a partial recovery of the water storage for both species. Those events lead to a start of the soil water deficit period postponed to the middle of July while it starts in late June with respect to SAFRAN.
Contingency tables (Tables 4 and 5) show that WRF is capable to reproduce the timing of the classes of precipitation in both years. In 1990 (Table 4), WRF produces a rather realistic number of dry days. However, it generates too many drizzle days (≥1 and <5 mm) and tends to underestimate rainy days for almost every class. In 1996 (Table 5), WRF rainfall shows excesses in all classes, except dry days and heaviest rainfall events (>20 mm). The QM method (Table 5) mostly reduces the number and intensities of rainy days. Nevertheless, some important errors remain, including important dry biases during heavy rainfall conditions. In 1990 and 1996, 7 days received more than 20 mm in SAFRAN but 1-15 mm in post-corrected WRF. These errors strongly affect the water storage recovery and may delay the water storage draining. WRF detects all occurrences of extreme event (none of the 9 corresponding days belonging to the dry class), but generally underestimates associated amount. Thus, the QM method does not significantly improve the realism of extreme convective events in WRF simulations.  (row -red: 1990; row -blue: 1996) and SAFRAN (columns) daily precipitation amounts (mm) data averaged over Burgundy and sorted by classes. p corresponds to the p value for the X 2 test statistic. (For interpretation of the references to colour in this The phasing of transient climate variability, which is not improved by QM (Themeßl et al., 2011), could become a major issue for impact studies, especially for non-linearly derived indices such as threshold-based indexes. This issue is explored in Fig. 11. As Fig. 5, it presents the average annual cycle (Fig. 11a) and the interannual variability of precipitation simulated by SAFRAN and WRF. We constructed two additional time series, corresponding to the raw WRF and post-corrected WRF, which were reordered to fit to the SAFRAN distribution over the period 1989-2008. When working on the reordered WRF data, both annual cycles fit almost perfectly (r = 0.99). Since the post-corrected WRF data presents a residual precipitation bias of −0.01 mm/day, the post-corrected reordered WRF annual cycle exactly fit to SAFRAN. These results are also verified at the interannual timescale (Fig. 11b). This logically leads to an improvement of the REW and SWDI (not shown) since the primary differences between SAFRAN and post-corrected WRF were related to the failure of the model to reproduce an accurate annual cycle. As expected, reordered raw WRF rainfall are characterized by a systematic wet bias reducing the number of 40% REW threshold crossing. Thus, improvements in the REW and SWDI temporal variability are lesser than with the quantile mapping post-correction.
Thus, two types of model errors, occurring independently, are of primary importance for impact studies: (i) the timing of transient climate variability that controls the duration and the number of consecutive days exposed to a water stress; (ii) the probability density function of daily rainfall that controls the number of 40% REW threshold crossing and then, SWDI intensity.

Conclusion
Assessing the regional impacts of climate variability on the water balance requires reliable observational data representative of a given geographical area. Unfortunately such data is often unavailable at fine scales. Regional climate models can constitute a potential alternative to scarce observational networks and for applications involving future climate projections (Bell et al., 2011). Nonetheless, their usefulness is questioned by the realism of their simulated climate, especially precipitation. Using the stateof-the-art RCM WRF driven by ERA-Interim reanalyses, this study attempts to document the skill of the model for regionalizing all the components used to compute soil water balance, requested for many impact studies. Here, the downscaled results are compared with the SAFRAN meso-scale analyses that present low discrepancies with observations for precipitation, and compensate the lack of available observation to compute water flux and soil water availability. Like most current climate models, WRF shows strong limitations for simulating precipitation, especially of convective nature. Although the magnitude of WRF errors may be considered in first approximation as satisfactory for regional climate studies, our results suggest that, for the simulation of soil water availability (and its consequences on forest health and productivity, the example used in this study), such biases cannot be neglected, and require a post-correction in order to obtain reliable estimators of local-scale bioclimatic conditions. The so-called quantile mapping post-correction uses the empirical cumulative distributions of the observed and modelled precipitation, reduces most systematic biases in the statistical distribution of the variable to be corrected (daily precipitation in the present study), and also improves the adjustments of extremes. The method modifies each RCM precipitation event by first calculating the probability of the event. The downscaled precipitation amount is then modelled as the observed precipitation with the same probability (Wetterhall et al., 2012). In this study, the QM method is applied over the convective period, which corresponds to the vegetative season for both common Beech and Douglas-fir over Burgundy. It allows for major improvements of precipitation in terms of seasonal and interannual variability, and also leads to more realistic annual amounts and spatial distribution. In spite of the usefulness of the QM method to produce realistic precipitation distributions, there still remains a deficiency in the simulated precipitation that prevents to use it for impact studies. This is demonstrated by the fact that key indicators of the water balance, such as the extractable water and soil water deficit index, remain poorly reproduced even after applying the post-correction to the precipitation data. Hence, the simulated precipitation, after the post-correction, is inappropriate for computing radial growth, the interannual variability of which is strongly dependent upon actual SWDI.
Our results suggest that, for impact studies, biases in the timing/temporality of precipitation (especially in the intra-seasonal distribution of the rains), not corrected by the quantile mapping procedure, is yet of primary importance. Although the QM method allows producing more realistic precipitation as a whole, it still features deficiencies in the day-to-day timing of precipitation, a statement particularly true as far as some severe droughts or high intensity events (corresponding to consecutive dry or heavy precipitation days) are concerned. This denotes either stochastic and unpredictable rainfall events in the model (as would be the case for instance for isolated storms in the summer period), or possible timing errors in the transient variability simulated by the model.
A first way to improve the simulation of precipitation would be to use more sophisticated model output statistics techniques, as a distribution scaling using a fitted gamma distribution conditioned on objectively classified Lamb weather types (Wetterhall et al., 2012) or a nested multisite daily rainfall stochastic generation model in order to improve the precipitation temporal characteristics (Srikanthan and Pegram, 2009). A second way would be an improvement of the simulated precipitation physically through finer cloud-resolving downscaling exercises. Finally, a third way would be to use relaxation terms in the model prognostic equations (nudging: Heikkilä et al., 2010;Glisan et al., 2013;Pohl and Crétat, 2014) to prescribe transient variability with a more realistic phasing compared to the observed time series.