Improving the physics of a coupled physical–biogeochemical model of the North Atlantic through data assimilation: Impact on the ecosystem

Several studies on coupled physical–biogeochemical models have shown that major deficiencies in the biogeochemical fields arise from the deficiencies in the physical flow fields. This paper examines the improvement of the physics through data assimilation, and the subsequent impact on the ecosystem response in a coupled model of the North Atlantic. Sea surface temperature and sea surface height data are assimilated with a sequential method based on the SEEK filter adapted to the coupling needs. The model domain covers the Atlantic from 20°S to 70°N at eddy-permitting resolution. The biogeochemical model is a NPZD-DOM model based on the P3ZD formulation. The results of an annual assimilated simulation are compared with an annual free simulation. With assimilation, the representation of the mixed layer depth is significantly improved in mid latitudes, even though the mixed layer depth is generally overestimated compared to the observations. The representation of the mean and variance of the currents is also significantly improved. The nutrient input in the euphotic zone is used to assess the data assimilation impact on the ecosystem. Data assimilation results in a 50% reduction of the input due to vertical mixing in mid-latitudes, and in a fourto six-fold increase of the advective fluxes in mid-latitudes and subtropics. Averaged zonally, the net impact is a threefold increase for the subtropical gyre, and a moderate (20– 30%) decrease at mid and high latitudes. Surface chlorophyll concentration increases along the subtropical gyre borders, but little changes are detected at mid and high latitudes. An increase of the primary production appears along the Gulf Stream path, but it represents only 12% on average for mid and high latitudes. In the subtropical gyre centre, primary production is augmented but stays underestimated (20% of observations). These experiments show the benefits of physical data assimilation in coupled physical–biogeochemical applications. © 2006 Elsevier B.V. All rights reserved.


Introduction
Marine biogeochemical cycles are at the heart of environmental issues such as coastal eutrophication, the role of the ocean in the global carbon cycle and in the climate change. The need to understand, monitor and forecast these cycles has motivated the development of coupled biogeochemistry-circulation models, which simulate the complex interactions between ocean physics, chemistry and biology. These models have proven to be very useful to interpret observations, to test new hypotheses about the ecosystems functioning, and to quantify the different processes. Moreover, simulations are able to provide past and future scenarios of the interactions between marine biogeochemistry and climate.
However, such models have deficiencies. As they are coupled systems, deficiencies can arise from each part of the system, i.e. physics, chemistry or biology. But circulation alone can explain much of the observed pattern of the biogeochemical fields such as primary production, oxygen and nutrients. Therefore, many deficiencies in the representation of the biogeochemical fields are due to deficiencies in the physical flow fields (Doney, 1999;Oschlies and Garçon, 1999;Doney et al., 2004).
A crucial point for the representation of the primary production and air-sea fluxes is the simulation of the surface boundary layer, and its interactions with the ocean interior (Oschlies and Garçon, 1999;Doney et al., 2004). Mixed layer dynamics, convection, vertical velocities, mesoscale and submesoscale currents as well as diapycnal mixing are fundamental processes for the coupling between physics and biochemistry. They act directly on the nutrient supply, the availability of light, and also on the dissolved organic carbon export (Oschlies and Garçon, 1999;Doney et al., 2004).
In the last decade, various three-dimensional studies on the North Atlantic basin addressed the sensitivity of the biogeochemical solution to the representation of the physics. Oschlies and Garçon (1999) examined the impact of vertical mixing parameterisation and advection numerics on the primary production in the North Atlantic. They showed that the use of a high implicit diffusivity advection scheme can double the primary production in the equatorial region and the Gulf Stream region, leading to unrealistically high production. Also, the diapycnal diffusion parameterisation appears to have a strong impact on the subtropical gyre production. They also identified systematic overestimation in the winter mixed layer depth, and a noticeable impact of the mixed layer model on the primary production.
Furthermore, the interactions between mesoscale and biogeochemistry were recurrently studied (Garçon et al., 2001). From in-situ observations, McGillicuddy et al. (1998) proposed a mechanism of "eddy pumping" of nutrient in the euphotic zone, which enhances biological production. In the meantime, Oschlies and Garçon (1998) obtained an enhanced production by assimilating altimetry in an eddy-permitting model, thus making the mesoscale field more realistic. However, a better representation of the mesoscale, can lead to an opposite effect: Oschlies (2002a,b) showed that increasing the resolution from a 1/3°to 1/9°leads to shallower mixed layer at mid and high latitudes, and thus reduces nutrient inputs.
Idealised studies resolving submesoscale (5-20 km) Spall and Richards, 2000;Lévy et al., 2001) also showed an enhancement of primary production associated with eddies and filaments. With respect to a non-eddy simulation, the primary production enhancement ranges from 10% (Spall and Richards, 2000), 30% (Oschlies and Garçon, 1998) to 200% for Lévy et al. (1998). The origin of nutrient inputs is also discussed by these authors. Recently, two basin-scale eddy-resolving assessments of the mesoscale impact (Oschlies, 2002a;McGillicuddy et al., 2003) led to different conclusions concerning the spatial distribution, and the magnitude of the eddy-induced inputs.
Two ideas emerge from these works. First, physical parameterisations can have strong impacts on the simulated biogeochemical fields, which need to be carefully assessed. Second, errors due to missing physical processes (e.g. mesoscale) and other error sources (initial conditions, boundary conditions and coefficients) also have an important impact on biogeochemical fields, that need further investigation.
To date, few attempts have been made to correct the physics errors through data assimilation of physical data only in coupled models (Oschlies and Garçon, 1998;Anderson et al., 2000).
Here, data assimilation is used to improve the physics in a coupled physical-biogeochemical model of the North Atlantic. Sea-surface temperature (SST) and SSH data are assimilated. SST data provide direct information about the mixed layer temperature. SSH data, through geostrophic balance, provide information about the surface geostrophic currents including mesoscale features, and about the density structure of the water column. Correcting the model with real data has two major consequences: the mean and variance of the physical state are closer to the observations, and the instantaneous state follows the near-synoptic satellite observations over the whole basin. Having more realistic physics potentially allows a more extensive comparison of the modelled biogeochemical fields to the observations. As far as the authors know, it is the first time that both SST and SSH data are assimilated into a three dimensional coupled model at the basin scale.
The objectives of this paper are twofold: to examine the impact of data assimilation on the representation of upper layer circulation compared to the free run, and to assess the sensitivity of the ecosystem to the modified circulation. In Section 2, the physical and biogeochemical models are described, and in Section 3 the assimilation scheme is briefly discussed. Sections 4 and 5 present the experiment conducted and the impact of the data assimilation. Discussion and conclusions are given in Section 6.

The circulation-ecosystem model
The circulation model is coupled on-line to the biogeochemical model. Here coupling means one-way forcing of the ecosystem model by the circulation model, since no feedback of the ecosystem model is taken into account.

Circulation model
In this study, the circulation is simulated by the OPA8.1 code . OPA8.1 is a primitive equation model using the hydrostatic and rigid-lid approximation. Vertical mixing of momentum and tracers is modelled by the TKE turbulence closure scheme (Blanke and Delecluse, 1993), and convection is parameterised with enhanced diffusivity and viscosity. Following the configuration of Tréguier et al. (1999), the model domain covers the North Atlantic basin from 20°S to 70°N and from 98.3°W to 20.6°E. The Southern boundary at 20°S, the Northern boundary at 70°N, and the Gibraltar Strait are closed but the model is relaxed toward climatology in buffer zones along the closed boundaries. Model variables are distributed on an Arakawa-C-type grid (Arakawa and Lamb, 1977). The horizontal resolution is 1/3⁎1/3 cos (latitude) (i.e. 28 km at 40°N), so it is eddy-permitting. Vertical discretization is done on 43 geopotential levels, with a grid spacing increasing from 12 m at the surface to 200 m below 1500 m. The atmospheric forcing fields (momentum, heat and freshwater fluxes) are derived from the 6-h ECMWF analysis. Sea-surface temperature is relaxed to the Reynolds weekly analysis, while seasurface salinity (SSS) is relaxed to the Reynaud seasonal climatology (Reynaud et al., 1998).

Biogeochemical model
Since the early work of Fasham et al. (1990Fasham et al. ( , 1993, and Sarmiento et al. (1993), biogeochemical models have significantly evolved. Recent models designed for basin-scale and global scale studies (Aumont et al., 2002;Moore et al., 2002;Lima and Doney, 2004) include functional groups with size classes, as well as multiple nutrient limitations including nitrogen fixation and varying elemental ratio. However, for the purpose of this North Atlantic study such a biological complexity is not essential. Indeed at first order, the processes affected by the physics are mainly the phytoplankton growth, and potentially the export of dissolved organic carbon. Thus a simple pelagic model is used, based on the P3ZD formulation developed by Aumont (1998).
The P3ZD model was designed to study the carbon cycle at global scale (Aumont, 1998;Aumont et al., 2002;Le Quéré et al., 2003), hence all tracers are expressed in carbon unit. In this study, we only use the ecosystem part of the model, i.e. production, export, remineralisation. The evolution of each tracer concentration C follows an advective-diffusive Eq. (1), with a source-minus-sink (sms) term representing the interactions with the other biogeochemical tracers.
K h is the horizontal diffusivity coefficient equal to 2.5 × 10 19 cm 4 s − 1 , K z is the vertical diffusivity coefficient computed from the TKE turbulent closure model, with no convection parameterisation. Biogeochemical tracers advection uses the low-implicit-diffusivity and positive-definite smolarkiewicz scheme with two corrections (Smolarkiewicz and Clark, 1986). The model compartments and interactions are presented in Fig. 1. See Aumont (1998) for a detailed description of the sms terms. The five compartments are the nutrient phosphate (PO 4 ), the phytoplankton (P), the zooplankton (Z), and the dissolved (DOC) and particulate (POC) organic carbon. Only the semi-labile DOC is taken into account. Therefore this is a NPZDDOM-type model. The depth of the euphotic zone is computed according to the light absorption by water and chlorophyll (Chl), with a maximum value of 143 m. At the surface, photosynthetically available radiation is derived from the daily ECMWF shortwave radiation. The Chl/Carbon ratio is a function of the light and nutrient availability. The phytoplankton growth is limited by light, nutrient, temperature and vertical mixing, and the growth rate ì is computed as the product of each limitation term (Eqs. (2) and (3)).
l ¼ L m l 0 1 À e ÀaI=l 0 e ÀbI=l 0 PO 4 PO 4 þ K PO 4 ð2Þ I being the photosynthetically available radiation, α ((0.05 W m − 2 ) − 1 day − 1 ) the initial slope of the P-I curve, β ((0.02 W m − 2 ) − 1 day − 1 ) the photolimitation coefficient and K PO 4 the half saturation constant (ìmol PO 4 l − 1 ). a, b, (0.6 day − 1 and 1.066) being growth parameters, and T the temperature (°C). L m being the vertical mixing limitation term, such as Z m and Z e being respectively the mixed layer depth and the euphotic depth. Zooplankton grazing follows Fasham et al. (1993) formulation. Below the maximum depth of the euphotic zone, POC is instantaneously exported according to the vertical profile of Suess (1980). Michaelis-Menten law is used for phytoplankton growth, zooplankton grazing and DOC remineralisation.

The sequential data assimilation method
Ocean circulation models are built with approximations. Time and space are discretized, and important knowledge is missing about model parameters, boundary conditions and initial conditions. This leads to biases and errors in the simulation of the real ocean circulation. Sequential data assimilation is a way to tackle these problems, by iteratively correcting the model trajectory with the available observations, taking into account model errors and observations errors.
For OPA8.1, the model state can be described by a vector containing the horizontal velocities U and V, the temperature T, the salinity S and the barotropic stream function BSF. For the purpose of SSH data assimilation, the diagnostic variable SSH is added to this state vector to form the estimation vector. Corrections of each of these variables are estimated from the observations through the assimilation process.

Data sets
The data assimilated are the sea-surface temperature, the sea-surface height and the sea-surface salinity. SST data consist of composite AVHRR measurements gathered and processed within the NASA Pathfinder project, regridded at a 1/4°resolution available every 10 days. SSH data consist of a combination of Topex-Poseidon and ERS altimeter along-track data regridded at a 1/4°resolution, and SSS data are derived from the Levitus monthly climatology . These surface fields are used to correct the whole water column. Six days is the length of the assimilation cycle, and is also the length of the time window used for the collection of the assimilated observations. This duration has proven to be adequate for the filter to be efficient. Testut (2000) and Testut et al. (2003) (hereinafter CET00 and CET03) have implemented and developed the original assimilation method in the North Atlantic configuration.

The original SEEK filter
The method used here is based on this original assimilation method. First we describe briefly the CET00 assimilation method. More details about this implementation can be found in CET03. Now we explain the changes made to render the method suitable with biogeochemical coupling.
The original assimilation method implemented in CET03 is a reduced-rank Kalman filter, derived from the SEEK filter (SEEK for Sequential evolutive extended Kalman filter, Pham et al., 1998). This filter has been successfully used in various oceanographic contexts (Brasseur et al., 1999;Brankart et al., 2003;Parent et al., 2003;Testut et al., 2003). The Kalman filter theory assumes the knowledge of the error covariance matrices of the forecast, the observations and the model, usually named P, R and Q respectively. A well-known limitation in the application of the Kalman filter is the size of P, and the lack of information to construct this matrix. The SEEK filter tackles this problem, by decomposing the covariance matrix P on a reduced-order basis. The background error covariance matrix P is estimated by the dominant threedimensional and multivariate EOF of the free model variability. In addition, the EOF are calculated locally for a small 3D domain around each water column, so that an observation can only impact a user-defined influence zone around the observation point (CET00, CET03). Therefore, the information contained in the surface observations is propagated to the unobserved part of the water column by a linear combination of local EOF.

The vertical extrapolation scheme
Preliminary assimilation experiments with the CET00 scheme showed a strong increase of the vertical mixing of biogeochemical tracers in the surface layers. A detailed inspection of the simulation revealed hydrostatic instabilities in the analysed state. The main origin of this problem lies in the very nature of the method: the filter cannot explicitly take into account non-linear physical constraints on the model state, e.g. the positiveness of the vertical density gradient. The reason for this is the Gaussian error distribution assumption needed to parameterise the probability density function with covariance matrices only. As a consequence, the analysed state can be anywhere in the linear reduced space of the SEEK filter. However, this fact is not restricted to our method: others studies with sequential method  and variational assimilation method (Fujii et al., 2005) have reported the same problem. This observation has motivated the development of a new extrapolation method, hereinafter NEW, which avoids hydrostatic instabilities and allows a better control of the mixed layer. Moreover, corrections are switched off South of 10°N, where assimilation performance is poor for deep layers.
In this method, we only use the surface corrections of T/S fields and SSH estimated by the SEEK filter, and the velocity field correction. The extrapolation is divided in two stages, the mixed layer first, then the ocean interior. Throughout the mixed layer, the same temperature and salinity correction computed at the surface is applied. This means that (T, S) error correlations with the surface values are assumed to be equal to 1 within the mixed layer, and equal to 0 below the mixed layer. Below the mixed layer, following a method similar to Cooper and Haines (1996) a vertical displacement of the isopycnals is applied to account for the correction of sea-surface height. The Cooper and Haines (1996) altimetric data assimilation method is based on the hydrostatic equilibrium, associated with the hypothesis of no bottom pressure change, and with the conservation of potential vorticity on isopycnals. In practice, the correction consists of a rearrangement of the preexisting water masses (i.e. T/S properties) by a vertical displacement of the isopycnals, which compensates the sea surface height correction. This method has been successfully applied in realistic studies (Drakopoulos et al., 1997;Bell et al., 2000). For the sake of simplicity, as the velocity field will geostrophically adjust to the density field during integration, the velocities are those estimated by the SEEK.

Experimental design
To investigate the coupled model behaviour with and without assimilation, annual simulations were performed. A schematic representation of the experiments is displayed in Fig. 2. First a seven-year physical spin-up with climatological atmospheric forcing was performed. Then realistic forcing were applied from 1993 and integration continued until January 1st, 1999. On January 1st, 1997, the biology was inserted, and the coupled model was integrated for 2 years until January 1st, 1999. Initial conditions for phosphate are taken from the Conkright et al. (1998) climatology. Phytoplankton, zooplankton POC and DOC compartments were set constant in the first 143 m, and null below. Initial values are 0.25 for P, 0.0625 for Z, and 0.04 × 10 − 6 molC l − 1 for POC and DOC.
After 1 year on January 1st, 1998, biology is already spun-up, and the year 1998 can be exploited. This experiment is our reference run, named FREE.
The assimilated experiment was conducted in two stages. First, assimilation into the physical model was switched on in August, 1996 and the assimilated run continued until January 1st, 1997. During this assimilation spin-up period, the CET00 scheme was used, because it is better suited to correct the systematic biases observed in the free run (CET03) than the NEW scheme which only rearranges the water masses. In the second period, the NEW assimilation scheme was used. Spunup initial conditions for biology from the FREE experiment were inserted on January 1st, 1998. The coupled model with assimilation was integrated for 1 year until January, 1st 1999. This assimilated experiment is named CORR.
The performance of the new assimilation method is then assessed. In term of RMS errors with respect to the assimilated variables, the results are comparable to those of the CET00 method, except in summer where SST RMS errors are significantly larger (up to 50%, not shown).
We then have two simulations for the year 1998, FREE and CORR, with identical atmospheric forcing. Physical initial conditions are different, but initial biogeochemical conditions are the same. For the FREE simulation, the physical trajectory is continuous and free. For the CORR simulation, the trajectory is iteratively corrected using SSH, SST and SSS, hence the physical environment is expected to be improved. In the following, as no correction is applied South of 10°N, we will focus on latitudes North of 10°N.

Impact of the data assimilation
The impact of the data assimilation will be analysed focusing on the physics of the upper layers first, then on the nutrient input, and finally on the ecosystem response.

Mixed layer depth
Theory and observations (Oschlies and Garçon, 1999) agree that a realistic seasonal cycle of the mixed layer depth (MLD) is essential for the simulation of biological production. Indeed, the MLD maximum, usually reached in late winter determines the annual nutrient reloading of the euphotic zone. Here, (Fig. 3) the mixed layer depth maximum was computed as the actual maximum of the 61 model dumps, to account for any mixing event during the year. The criterion used to define the mixed layer depth is a 0.2°C deviation from the surface temperature.
Overall, not surprisingly, the spatial variability is increased with assimilation, reflecting the high variability of the satellite data. For latitudes North of 50°N, the MLD picture is similar for the FREE and CORR simulations. Nevertheless very deep MLDs resulting from convection are reduced in the Irminger Sea and extended in the Labrador Sea. In the FREE simulation the Gulf Stream position is too North, so the northwestward extension of the 200 m mixed layer depth contour line is due to the warm waters brought by the current along Newfoundland. This point is corrected by assimilation. For mid-latitudes (30-50°N), the abrupt east-northeastward front present in FREE (corresponding to the 200 m depth contour line drawn in Fig. 3) which crosses the whole basin is weakened and displaced northward. North of this front, CORR MLDs tend to be shallower in the eastern part of the subtropical gyre. On the contrary, in the western part of the basin, MLDs are not changed, or even increased. The deepened zone centred on 30°N, 40°W is identified as an artefact of the assimilation method, and can not be physically interpreted.
Mixed layer depth observations from the same period computed with the same criterion are available from De Boyer Montégut et al. (2004). Comparison with these observations, and other data such as climatology shows that mixed layer depths are generally overestimated in the two simulations by 100 m in mid-latitudes to 500 m and more at high latitudes. However, East of 45°W assimilation improves the results by reducing the depths from 400 m to 200 m. Near the American coast, South of 35°N, the overestimation of the mixed layer depths is a common feature of many simulations (Oschlies and Garçon, 1999;Lima and Doney, 2004). It may be due to the drift of the T/S properties associated with the poor representation of the Gulf Stream (observed in the FREE simulation), or to the atmospheric forcing. Our assimilation of surface-only data does not appear to be appropriate to correct this systematic error.

Surface eddy kinetic energy
In the open ocean, the kinetic energy is dominated by the mesoscale variability (Oschlies, 2002b). A measure of the mesoscale variability is the eddy kinetic energy (EKE), usually computed as the variance of the horizontal velocity components. Assimilated SSH is computed as the sum of a mean SSH (MSSH) and a sea level anomaly (SLA), available from space. Therefore SSH observations contain information about the mean surface currents and about their variance, hence SSH data assimilation is able to correct the mean circulation and also the EKE.
The location and magnitude of mean surface currents (not shown) are well represented in the assimilated solution (see CET03). Fig. 4 shows the EKE calculated from simulations and observations. Distribution and magnitude of the EKE in the Gulf Stream, along the North Atlantic Drift and the Azores current are clearly improved by assimilation, but still underestimated. For instance along the Gulf Stream path, simulated maximum values reach only 2000 cm 2 s − 2 while observations reach 5000 cm 2 s − 2 . This underestimation is partly explained by the model eddy-permitting resolution, unable to resolve the whole spatial variability spectrum contained in the observation. It appears that the assimilation impact on the MLD and EKE is satisfactory. But this is not sufficient to extrapolate the impact on the biogeochemical fields. Therefore, an analysis of the nutrient fluxes is required.

Nutrient input in the euphotic zone
Phytoplankton growth is mainly limited by light and nutrients. In the euphotic layer, the growth consumes "regenerated" nutrients (made available from remineralisation of already formed organic matter) and "new" nutrients, brought by the circulation from inorganic sources, e.g. the ocean interior. Therefore the physical nutrient supply in the euphotic zone is a measure of the impact of the physical mechanisms on the primary production. These mechanisms are the vertical mixing and the tri-dimensional advection. Assimilation acts on the two processes, by correcting the stratification and the currents.
The maximum depth of the euphotic zone (Z e = 143 m) is chosen to compute the nutrient input over the whole basin. The inputs are cumulated over the year 1998 and expressed in molC m − 2 year − 1 . The total input is divided into three components: the vertical diffusive input Q zdif , the vertical advective input Q zadv , and the horizontal advective input Q hadv . Following Oschlies (2002c), no distinction is made between horizontal advection and diffusion, as horizontal mixing is partially resolved, and partially parameterised at this resolution. The inputs are computed as follows: where C is the nutrient concentration, T is equal to 1 year and Z e equal to 143 m. In the present model, in the equatorial region the vertical advective input and hence primary production are greatly overestimated (the total input is four times larger than the input reported by Oschlies (2002c)), so the biogeochemical solution cannot be interpreted South of 15°N. Other studies (Oschlies, 2000) showed the high sensitivity of the input to the physics parameterisation and errors in this region. This flaw may be due to the drift of the T/S properties, the atmospheric forcing, or to the closed southern boundary. Except this particular problem, the overall distribution of input in the FREE simulation is similar to the results reported by Oschlies (2002c).

Input from vertical mixing
Not surprisingly, the pattern of Q zdif (Fig. 5) resembles the pattern of the mixed layer depth maximum for the two simulations. As observed by Oschlies (2002c), two different regimes coexist in the North Atlantic in two different areas. The "convective" regime takes place in the Westerlies region and in the high latitudes area, except a region at the North-East of Newfoundland. In these regions, diffusive inputs are superior to 1 molC m − 2 year − 1 and reach 8 molC m − 2 year − 1 (equivalent to 1 molN m − 2 year − 1 ) near the American coast. Everywhere else, the "diapycnal diffusion" regime exhibits inputs inferior to 1 molC m − 2 year − 1 .
Assimilation does not change this pattern. In the CORR simulation, Q zdif is significantly reduced along the American coast, and in the Eastern part of the domain. Conversely, Q zdif increases in the Labrador and Irminger seas. In these regions, a comparison with the T/S climatology indicates that the stratification is underestimated, so this increase is probably overestimated. Moreover, the spatial extent of the convective regime is limited. The input spots located in the subtropical gyre centre are due to unrealistic mixing events associated with unbalanced corrections, but are negligible with respect to the total input.
In order to quantify the changes, the domain was divided into three regions with different nutrient input behaviours: the anticyclonic subtropical gyre, the cyclonic subpolar gyre, and between the two, the turbulent Gulf Stream region. Fig. 5 shows the limit of these regions, named POL (polar), MID (mid-latitudes) and TRO (subtropical gyre). The time evolution of the cumulated diffusive input (Fig. 6) sheds light on the processes involved. All regions exhibit two regimes during the year: a convective regime during winter followed by a diapycnal diffusion regime. The phase and duration of each regime depend on the region, but in each region more than 80% of the yearly input takes place during the convective period. In all regions, assimilation reduces the input compared to the FREE simulation, particularly in the MID region where the reduction attains more than 50%.

Advective input
It should be noted that in order to improve the clarity, all advective input fields have been smoothed with a 3 × 3 moving average. Again for the vertical advective input Q zadv (Fig. 7), the distribution obtained with the FREE simulation is similar to the one obtained by    Oschlies (2002c). The subtropical gyre centre is a weak downwelling area due to Ekman pumping, and the subpolar gyre centre is an upwelling area. In the high-variability Gulf Stream region, the sign of the input has a large spatial variability. It is also true along the southern border of the subtropical gyre. The input becomes positive in the West African upwelling region.
In the CORR simulation, distribution and magnitude of Q zadv are greatly modified. In the centre of the subtropical gyre, inputs are now weakly positive, and the latitudinal extension of the low-input centre region is reduced to 15-30°N. Along the southern boundary of the subtropical gyre, input magnitude has increased, and the West African upwelling shows a negative input along the coast. In this upwelling region, SST is overestimated in the CORR simulation. It explains why the upwelling is weakened. In the high-variability Gulf Stream region West of 50°W, positive inputs dominate. Along the 32°N path of the Azores current, positive advection spots have appeared. The subpolar gyre upwelling has nearly disappeared.
As Oschlies (2002c) noted, the horizontal advective inputs Q hadv (Fig. 8) are almost everywhere of opposite sign to the vertical inputs, because of the continuity equation. Hence, with assimilation the lateral inputs increase at the northern flank of the subtropical gyre, and progress toward the subtropical gyre centre.
The sum of the two advective components (Q zadv + Q hadv ) is displayed in Fig. 9. The overall impact of assimilation is a large increase of the advective inputs, particularly at the northern and southern flank of the subtropical gyre, and in the East of the North Atlantic Drift.
To summarise, assimilation enhances the upwelling of nutrient near the Gulf Stream path, and also enhances the horizontal transport toward the South and the East.
In addition to the nutrient input, an analysis of the upper 143 m horizontal nutrient transport across the limits of the three regions (not shown) was carried out. For the 50°N section (between MID and POL) transport shows a strong seasonality, approximately southward in winter and northward in summer, whereas for the 35°N section between MID and TRO, there is a permanent, large southward transport particularly in winter, to the profit of the subtropical gyre region. With assimilation, the transport toward POL (50°N) stays nearly identical, but the southward transport toward TRO (35°N) shows a large increase (reaching a threefold augmentation) during all the year.
The time evolution of the total advective input (Q zadv + Q hadv ) is displayed in Fig. 10. For 1 year, in the MID region CORR simulated input is multiplied by 4 compared to the FREE run, whereas in the TRO region, input is multiplied by 6.
The net impact (Fig. 11) is a threefold increase for the subtropical gyre region, and a moderate decrease (20% and 30% respectively) for MID and POL regions, but with large differences between the eastern and western part of the basin, and an increased contribution of the advective flux in the total (advective route provides 50% of the total in MID, and 75% in TRO). Such high increases of the advective inputs are unexpected and may be overestimated. To complete this view of the nutrient input, a comparison with observational estimates is needed.

Comparison with observations and other modelling studies
To assess the realism of the simulated input, a comparison with observation is needed. The main difficulties are that the nutrient inputs estimates from observations are sparse and depend on the estimation method, and that the simulated inputs vary by more than one order of magnitude depending on the place. Carbon inputs are converted to nitrogen inputs using a N/C ratio of 16:122.
Near year − 1 . The FREE simulation underestimate (0.24 molN m − 2 year − 1 ) this input whereas the CORR simulated input is high (0.70 molN m − 2 year − 1 ), but remains in the confidence interval. This increase with respect to the FREE simulation is mainly due to a larger vertical (+ 44 molN m − 2 year − 1 ) and horizontal (+ 6 molN m − 2 year − 1 ) transport, while input by vertical mixing is weakly reduced (−8 molN m − 2 year − 1 ).
In the eastern subtropical gyre, at 28,5°N, 23°W, Lewis et al. (1986) estimated a vertical diffusive input of 0.05 molN m − 2 year − 1 with a confidence interval of [0.001;0.32]. Both simulations discussed here are in the interval as Q zdif = 0.025 for FREE and 0.018 molN m − 2 year − 1 for CORR. Off Spain, data assimilation seems to worsen the results. In the POMME area (38-45°N, 22-16°W), Fernández et al. (2005) recently estimated a new production (equivalent to the total input, assuming all nutrients are converted to production) equal to 5,7 molC m − 2 year − 1 for year 2001. Averaged for this region, the simulated total nutrient input is only 0.871 molC m − 2 year − 1 for FREE and 0.027 molC m − 2 year − 1 for CORR. This underestimation is consistent with the too low chlorophyll concentration and primary production simulated in this region.
Along the subtropical gyre border, where inputs are greatly modified, few estimates are available. At EUMELI (21°N, 31°W) review). Here, the CORR total input is 0.18 molN m − 2 year − 1 , inside the observational bounds but is superior to the model estimates. However, the EUMELI site is not situated along the northern flank of the subtropical gyre where the maximum increase takes place.
Therefore, this comparison does not allow to conclude on the change brought by the assimilation, because of the large differences between the regions for the assimilation impact, and the large uncertainties in the observations.
Comparisons can also be made with input estimates from eddy-resolving modelling studies (Oschlies, 2002a;McGillicuddy et al., 2003), able to represent a realistic mesoscale activity. Along the northern flank of the subtropical gyre, the total advective input simulated here is very large compared to the Oschlies (2002a) estimates (from 0.3 to 0.9 molN m − 2 year − 1 for this study, versus 0.01 to 0.05 molN m − 2 year − 1 for Oschlies (2002a)). In the same region McGillicuddy et al. (2003) yield larger estimates (from 0.04 to 1 molN m − 2 year − 1 ). However, the McGillicuddy et al. (2003) study differs significantly from our study by the treatment of the remineralisation below the euphotic zone, possibly responsible for the overestimation of the input compared to Oschlies (2002a). Therefore, this suggests that along the northern flank of the subtropical gyre, the CORR simulation may overestimate the advective input, as long as the differences in the mean flow and in the atmospheric forcing can be neglected.
This increase of the advective input can be attributed to the eddy-pumping process (McGillicuddy et al., 1998), but may also be related with the misalignment of the biological and physical fields (Anderson et al., 2000), or to the physical adjustment after the correction.

The ecosystem response
As the simulated ecosystem is very sensitive to the representation of the physical environment, examining the ecosystem response informs on the physics. On the one hand, the good point is that more observations are available to be compared with, namely ocean colour data and station data: this is very important to assess the potential defects and qualities of the modified circulation. On the other hand, the main disadvantage of this approach is that the ecosystem model has its own defects which make the interpretation of the results more difficult.
To study the ecosystem response, three simple characteristics were examined. The first is the surface chlorophyll a (Chl) concentration during the spring bloom. The bloom is triggered when the mixed layer depth becomes inferior to the critical depth (Sverdrup, 1953). Observations and modelling studies including the present one show that the bloom has a northward progression, following the timing of the mixed layer shoaling. The magnitude of the bloom (i.e. the maximum Chl concentration) is largely determined by the surface nutrient concentration at the end of the winter period and hence by the mixed layer depth maximum.
Spring mean Chl concentrations are presented in Fig.  12. As the total input (Q zdif + Q zadv + Q hadv ) at the end of winter is reduced with assimilation (30% decrease in the MID region), surface nutrient concentrations (not shown) are reduced, and result in lower Chl concentrations during spring (0.4 mg m − 3 instead of 0.6 mg m − 3 in the MID region). However, the 0.05 mg m − 3 contour at the northern flank of the subtropical gyre is displaced southward, as a result of the higher nutrient inputs. Conversely, the 0.6 mg m − 3 contour is displaced northward, as the mixed layer increases abruptly. In the Gulf Stream region, the CORR Chl distribution shows meandering and eddies fingerprints, absent in the FREE simulation.
Despite the enhancement of the circulation temporal variability, the Chl concentration variability (not shown) is reduced in the MID region, and shows only a little increase in the subtropical gyre, much weaker than the observed variability. It means that in the model, the Chl temporal variability is dominated by the spring bloom, primarily determined by the winter diffusive input.
To compare the simulation with the Chl estimates from the SeaWiFS sensor, the main known flaws of our biogeochemical model have to be taken into account. To identify them, our model results for the FREE simulation are compared with those of the more complex model of Lima and Doney (2004), hereinafter LD04. The LD04 model includes two size classes for phytoplankton and detritus, growth limitation by silica, nitrate and ammonium, and multiple elemental pools.
First, in the coastal regions (e.g. off Newfoundland, in the European shelf seas) crucial processes are missing in both models, systematically leading to underestimated concentrations. Moreover, in our model, summer Chl concentrations are generally too low, because of a too strong coupling between grazing and phytoplankton production especially at high latitudes. Finally, both models underestimate the Chl concentration in the subtropical gyre centre, a recurrent phenomenon known as the "subtropical gyre desert" (Oschlies, 2002c). However, the underestimation of Chl and primary production in the subtropical gyre is limited for LD04. Chl stays larger than 0.019 mg m − 3 (0.001 mg m − 3 in the present study) and primary production is 50% of the VGPM estimate (10% here). Two main reasons can explain these improvements. First, for LD04 the remineralisation loop suspended detritus->ammonium is much faster than for P3ZD (0.2 day − 1 versus 0.04 day − 1 ). This loop allows sustaining a high level of primary production during the summer. Second, the silica-limited diatom compartment controls the spatial and temporal shift between small phytoplancton->suspended detritus->ammonium, and large phytoplancton->sinking detritus regimes. Such a model gives increased primary production compared to a single phytoplancton compartment model (Lima and Doney, 2004).
With assimilation, the simulated Chl concentrations stay overestimated (larger than 0.4 mg m − 3 ) in the region South of the mean Gulf Stream path. It is not surprising, as assimilation does not correct the mixed layer overestimation in this region. However, the position of the zonal separation between high Chl regions (North) and low Chl regions (South) is displaced northward by assimilation, and becomes closer to the observations. The second diagnostic is the time evolution of the simulated and SeaWiFS surface Chl concentration for the MID region, shown in Fig. 13. Simulated concentrations show a largest departure from the observations in spring. The bloom is too early in the FREE simulation (mid March instead of mid April), and the maximum Chl is overestimated by 45%, as a result of the overestimated winter nutrient input. Assimilation reduces the maximum concentration to a 30% overestimation, but does not significantly modify its date. The fact that assimilation does not modify the bloom date indicates that the major control of the surface stratification is the atmospheric forcing. During the remainder of the year, mean concentrations are close to the observations. In summer and fall, five concentration maxima occur, roughly at monthly frequency. It is not clear if they correspond to wind events, or to internal oscillations of the phytoplankton and zooplankton compartments.
The last diagnostic is the annual primary production (Fig. 14). This quantity is the sum of the new production and the regenerated production, therefore it is governed by both physical processes and biogeochemical processes (i.e. remineralisation). Moreover, it is vertically integrated and hence accounts for deep chlorophyll maximum, invisible in the ocean colour data. Here again, some intrinsic problems of the P3ZD model have been identified: in the subtropical gyre centre, primary production is too low by one order of magnitude. Another defect is that the primary production is too low North of 50°N, because of a too strong coupling between grazing and phytoplankton growth limited by temperature. However, at high latitudes the VGPM estimate is twice larger than another satellite-derived estimate (Antoine et al., 1996) which stays inferior to 150 gC m − 2 year − 1 . VGPM takes into account satellite-retrieved chlorophyll, surface temperature and photoperiod (Behrenfeld and Falkowski, 1997).
Assimilation increases moderately (12%) the primary production averaged for the region North of 30°N. This increase occurs mainly in the Gulf Stream region. Primary production also increases along the southern flank of the subtropical gyre.
In the Gulf Stream region, the front of the current separates on the southern side, the stratified and oligotrophic Sargasso Sea waters, and on the northern side the weakly stratified eutrophic coastal waters from the Labrador current. This front is visible on SST imagery and also on ocean colour data. The CORR simulated primary production is overestimated in the region South of the Gulf Stream path. However, there is a good agreement between the 150 gC m − 2 year − 1 contour and the mean Gulf Stream path, also present in the observational estimates from the VGPM (Berhenfeld and Falkowski, 1997).
This highly productive zone corresponds to a highly energetic zone, but is not visible on the advective nutrient inputs . Hence this production is the result of favourable conditions of light and nutrient availability, integrated by the coupled model during the year. This picture is a first step toward a realistic representation of the primary production in this energetic area.
The underestimation of the primary production in the subtropical gyre centre is reduced, but is still present in the assimilation experiment CORR. On average for the region TRO, the threefold increase of the nutrient input results in a twofold increase of the primary production, mainly located along the border of the gyre. However, in the subtropical gyre centre the primary production is still only 20% (10 versus 50 gC m − 2 year − 1 ) of the observational estimates of Berhenfeld and Falkowski (1997). Thus, it appears that such an enhanced input cannot bridge the gap between the observed and modelled rates of production in the subtropical gyre, as concluded Oschlies (2002c). This conclusion is reinforced by the fact that, contrary to Oschlies (2002c), the present ecosystem model has a DOM compartment, which can travel long distances and fuel the production in nutrient-depleted regions.
Even though the physical environment is still not perfect, improvements of the ecosystem response are visible. Nevertheless, drawbacks of the ecosystem model, and uncertainties in the observations do not allow to go further in the interpretation of the results.

Conclusion
An annual coupled experiment with an eddy permitting model of the North Atlantic has been conducted, assimilating SSH and SST data. The assimilation method is based on the CET03 configuration of the SEEK filter (Pham et al., 1998), modified in order to respect the hydrostatic stability, essential for the coupling. The results have been compared to a free simulation.
One purpose of this study was to control the physics of the upper layers by assimilation of surface only data. The present assimilation method is able to improve significantly the representation of the mixed layer depth in subtropics and mid-latitudes, and the mean and variance of surface currents.
The sensitivity of the biogeochemical model has been assessed, through the estimation of the nutrient input in the euphotic zone. This calculation quantifies the coupling between the physics and the ecosystem model. It is shown that the nutrient input is strongly affected by the data assimilation: diffusive input is halved in the MID region, while advective input is Fig. 14. Annual primary production for the FREE simulation (top), the CORR simulation (middle) and climatology estimated from the Vertically Generalised Production Model (Behrenfeld and Falkowski, 1997) interpolated on the model grid. The contour line of the mean position of the Gulf Stream estimated from simulation CORR (SSH = − 20 cm) is plotted on each plate. Units are gC m − 2 year − 1 . enhanced in the MID and TRO regions. The resulting total input is weakly decreased in the MID and POL regions, and increased (threefold) in the subtropical gyre region. This large increase can be partly attributed to the so-called eddy-pumping process (McGillicuddy et al., 1998). Another origin could be the misalignment between physical and biological fields (Anderson et al., 2000), or the adjustment processes of the corrected physical fields. Work is under way to identify the causes of this advective flux.
While nutrient input changed, the overall impact of the assimilation on the ecosystem is rather weak. Indeed, the relationship between primary production and nutrient supply is complex, as light and vertical mixing also limit the phytoplankton growth. Maximum Chl concentrations are reduced because of the reduced winter input. Primary production is enhanced in mid-latitudes and along the circumference of the subtropical gyre. The highest increase occurs in the Gulf Stream region, and leads to a better agreement with the observations. These results suggest some directions for future work. To better constrain the physical state, the observation system will be improved by the inclusion of subsurface information (e.g. T/S profiles). To improve the control of the stratification, correcting the surface forcing through assimilation seems to be a good idea, as forcing errors are known to be important. This way is presently explored with a continuous assimilation method. To improve the biological component, the biogeochemical model could be adapted (e.g. adding a temperature dependency for zooplankton growth, distinct size classes), and the model parameters could be optimised to fit the North Atlantic observations. Finally, as for the physical component of the system, the development of a biogeochemical state estimation method could be envisaged.