An introduction to mixing in a stably stratified fluid

. We provide an account of mixing in stably stratified fluids: we present the basic concepts of mixing, recall how mixing is quantified and analyze mixing in the stably stratified shear layer along these lines. We also discuss the importance of mixing in the ocean and describe how it is modeled in this context, focusing upon the commonest approaches.


Introduction
General circulation models of the ocean, which provide a numerical description of the mean currents at oceanic basin scales, use resolution of a few km along the horizontal and of tens-to-hundreds of meters along the vertical. It is therefore necessary to model the motions below the grid scale, the so-called subgrid-scale motions, at which kinetic energy is dissipated and mixing occurs. This modeling is crucial because the dissipating processes contribute fundamentally to the large scale dynamics of the ocean. Let us illustrate this important point through two arguments, related to the thermohaline circulation. The thermohaline circulation is a large scale convective motion formed by the sinking at high latitude of cold water masses; these cold water masses spread throughout the bottom of ocean basins and slowly upwell. This upwelling is due to mixing processes: without deep mixing, the ocean would have turned, within a few thousand years, into a stagnant pool of cold salty water extending to within one meter from the surface. The thermocline, which separate the bottom waters from the surface waters mixed by the wind, would have reduced to a very thin layer. This yields the fundamental idea that the thermohaline circulation is not driven by buoyancy effects but by the mechanical energy that mix the water masses, namely wind and tidal energy (see [15] and [24] and references therein).
A global estimate of the turbulent diffusivity associated with mixing in the ocean was proposed by Munk [23], by assuming a balance between the upwelling water masses and the downgradient mixing of heat. Using the density profile measured in the ocean below the thermocline and an estimate of the vertical velocity of upwelling from the distribution of chemical tracers, Munk [23] ob tained a value for the turbulent diffusivity equal to Kt � 10-4m 2 /s (that is, one thousand times the molecular diffusivity). The method used by Munk was subjected to many discussions (and some controversy) but Kt is now considered as the canonical value for the global diffusivity in the ocean.
As shown by Bryan [3] already in 1987, a constant diffusivity is certainly too crude an approximation for the parameterization of mixing in general cir culation models. A refined parameterization is needed, namely a simple relation between a turbulent diffusivity and a dynamical parameter characterizing the large resolved scales of the model (the same need arises for the turbulent viscos ity). This is still an open problem: for instance, Beckmann [2] provides results of three numerical simulations of the thermohaline circulation using general circu lation models differing by the representation of the bottom topography and the modeling of the subgrid scales. Currents for depths greater than 500 m are plot ted in Fig. 1: strong discrepancies between these large scale currents are visible from one model to the other, which illustrates their strong sensitivity to small scale processes.
It is necessary at this stage to provide a definition of mixing. Mixing is the destruction by molecular diffusion of density gradients generated by turbulence. Some authors (such as Toole [37]) also include in this definition of mixing the destruction of velocity gradients by viscosity. We shall rather keep the usual terminology of kinetic energy dissipation for the latter process. In the ocean, mixing takes place at scales of 1 mm to 1 m.
Mixing may also be referred to as vertical transport: because of the existence of the restoring force of buoyancy along the vertical direction, no net vertical transport can occur in a stably stratified fluid without mixing. By contrast, no restoring force exists along the horizontal direction (ignoring the Coriolis force), so that horizontal motions are not constrained. Horizontal transport thus leads to characteristic length scales that are much larger than those associated with vertical transport, because the former process is controlled by advective effects while the latter results from dissipating processes. Finally, we point out that mixing should be made distinct from stirring. Stirring results from the straining by large scale motions, which increases density gradients without mixing. An example of stirring is provided in Fig. 2.
This introduction mainly focused upon oceanic processes so that the active scalar field, which is mixed, is the density. The concept of mixing also applies in the atmosphere, by considering the potential temperature field in place of the density field. However, all concepts presented in the present paper rely upon the Boussinesq approximation so that they must be modified when compressible effects become important (see [1]).
The outline of the paper is the following. In the next section we present the basic concepts of mixing along with their modeling in the ocean. We provide an example of mixing analysis in section 3, by considering three-dimensional numerical simulations of the stably stratified shear layer (most results presented in this section stem from [32] and [33]. We also briefly address the question of parameterization of mixing resulting from shear instability, both in the numerical simulations and in large scale oceanic models. Conclusions are drawn in the final section.

Background and available potential energy
Energy is required to mix a stably stratified fluid. Even if we consider a fluid at rest, made of two layers of different density, the denser one being at the bottom, the interface mixes through molecular effects : the density gradient smooths, the source of energy being the internal energy of the fluid. The diff usion of The large scale vortex that develops from the Kelvin-Helmholtz instability induces a strain field which results in a thin layer of strong density gradient outside the vortex. The wrapping of dense and light fluid as the vortex forms creates intertwined thin layers of different density, thus yielding strong density gradients within the vortex. The large scale motion associated with the vortex mainly stirs the fluid and hardly mixes it (from (33]).
density at the interface occurs through a diapycnal fiux of mass ( i.e. across the constant density surfaces, or isopycnals). This is expressed by the standard diffusion equation:   is the local diapycnal mass fiux. This fiux results in an increase of the center of mass of the fiuid so that the potential energy of the fiuid Eb (usually defined per unit volume) increases as well. The expression for Eb is where V is the volume of the fiuid, assumed in the following to be contained in a fixed parallelepipedic rigid box, bounded by a surfaceS. The evolution equation for Eb is easily inferred from Eq. (1): K is the molecular diffusivity for density changes, g is the acceleration of gravity, H is the height of the fluid volume and ..::l p = Pb(O) -Pb(H) is the (positive) density difference between the bottom and the top of the container. �lam is the rate of production of the potential energy of the fluid at rest. Equivalently, �lam is the conversion of internal energy into potential energy. We shall refer to this rate as the laminar mixing rate for simplicity. In a moving fluid, mixing still occurs through a local diapycnal diffusive flux of mass, which we refer to as cf>d· The advective flux of mass gpu only distorts the isopycnals without changing their value. In order to access the diffusive mass flux c/>d, let us do a thought experiment: we assume that the fluid relaxes instanta neously toward a rest state through an adiabatic redistribution of density. Each fluid element is thus brought back to its equilibrium position without changing its density. This transformation amounts to stretching out each isopycnal hor izontally; it yields a stable density profile (denoted Pb again), whose potential energy (Eb) is the minimum potential energy of the fluid at that time. These concepts have been introduced by Lorenz [22]. Hence, the above situation of a fluid at rest is recovered instantaneously through the adiabatic transformation so that the same equations should hold formally: the changes in Pb are only due to the diapycnal mass flux c/>d apb a -+-cf> d=O.
at az (6) This result has been demonstrated rigorously by Winters & D'Asaro [39]. The expression for cf>d is: where <>I denotes an average along an isopycnal and p refers to the total density field. The equation for the background potential energy is analogous to Eq. (4) up to diffusive (Fb:i_!r�) and advective (F b �� � 1) mass fluxes across the bounding surface S (exact expressions of the latter fluxes are provided in [38]). As before, �d is the volume averaged value of cf>d times g: (9) and is the rate of production of the background potential energy. Hereafter, we simply refer to �d as the mixing rate. When the fluid is at rest, IV P I = -dpb/dz and �d reduces to �larn· In general �d � �lam (� 0), expressing the well-known fact that turbulent motions (more precisely: density fluctuations) increase mixing. Note that density fluctuations refer here to deviations about the background density field Pb· Lorenz' concepts were first applied by Thorpe [35] for the case of one-dimen sional density profiles measured in a lake: the idea was to sort out the elementary fluid volumes contributing to the fluid column by order of increasing density, so as to get an estimate of the vertical length of turbulence. Indeed, the sorting method only operates when the fluid has overturned, fluid volumes being left at their position when the fluid is locally stable. Thorpe recorded the distance each particle has to be displaced to reach its position in the stable profile and defined a length scale from the root mean square value of the distances. This length scale, now referred to as the Thorpe scale, is thus a measure of the vertical extent of overturning events in the original density profile. It also provides an estimate of the size of the turbulent eddies.
In a three-dimensional fluid volume, the use of an adiabatic transformation to analyse mixing has been performed only recently, starting with the work [38]; references are provided in the next section. The practical method to compute the stable density profile Pb is described in this section as well; we also show that Pb ( z) is distinct in general from a mean (horizontally averaged) density profile.
The adiabatic transformation that leads to Pb releases the energy associated with the density fluctuations. This energy is the available potential energy, in troduced by Lorenz [22] also: where Ep is the potential energy per unit volume of the fluid before the trans formation is applied. (Closed expressions for E a are provided by Holliday & Mcintyre [14]). The minimum potential energy state can also be characterized by Ea = 0.
The evolution equation for Ea is simply derived from that for Eb and Ep. We recall the evolution equation for the total potential energy per unit volume (see f.i. [38]) the first term on the right-hand-side is the volume averaged buoyancy flux, the second term is the advective mass flux through the boundaries, � l am is the laminar mixing rate and the last term is the mass flux across the boundaries due to molecular effects. Using (8), (10) and (11), one gets the evolution equation for Ea: (12) This equation shows that the dissipation rate of Ea within the fluid volume, �d -�lam, is the source for turbulent fluid mixing. The study of mixing thus amounts to decomposing the total potential energy Ep into a reversible part Ea, available for mixing, and an irreversible part Eb, which increases through mixing.

Quantification of mixing: diapycnal diffusivity and mixing efficiency
The rate of change of the background potential energy Eb per se does not provide any usable information on mixing. dE&/dt needs to be related to another quantity to provide some quantitative information. Equation (12) for Ea shows that the mixing rate due to density fluctuations, 4?d-4?lam, should rather be used in place of 4?d when mixing resulting from the flow dynamics is to be studied. 4?d-4'?t am is referred to as the turbulent mixing rate hereafter.
One common measure of mixing is provided by a flux Richardson number, R f &, which relates the rate of production of potential energy due to turbulent mixing within the fluid volume 4?d-4'?tam to the rate of energy input to the fluid M: When the fluid is not forced, the total sink of energy within the fluid volume is used in place of M. The energy is lost through mixing and through kinetic energy dissipation e so that, in this case, the flux Richardson number within the fluid volume is expressed as: The changes of energy from the beginning of the flow evolution may also be used in place of the instantaneous rates of change, leading to the definition of a global flux Richardson number : The interest in the latter definition is that it leads to a smoother evolution of the flux Richardson number and accounts for the boundary mass fluxes. Ek is the volume averaged kinetic energy. Note that either definition of the flux Richardson yields a value comprised between 0 and 1. A very small value (Ll Eb « LlEk) implies that very weak mixing occurs while a value close to 1 {Ll Ek « LlEb) implies that mixing is very efficient. In the latter case indeed, much more energy is lost into mixing than into heat. In the literature, the important concept of mixing efficiency is rather defined by the quantity " Mixing is also most commonly characterized by a turbulent diffusivity Kp When the scale of the background density gradient is much larger that the overturning scale (as measured by the Thorpe scale for instance), the turbulent flux is assumed to depend linearly on the gradient

{16)
the coefficient defining the turbulent diffusivity. The definition ensures that K&(z) is positive at any time and for any value of z. When the background density gradient is linear, a volume averaged turbulent diffusivity can be defined One important issue is the dependency of the turbulent diffusivity Kb upon the molecular diffusivity K. When. fluid motions are strictly horizontal, K, = K.
When fluid motions displace the isopycnals but remain laminar, as for internal gravity waves of low amplitude, K, is proportional to K. When fluid motions are turbulent, a prediction may be conjectured when K -+ 0, the Prandtl number being set to a constant value, from analogy with ordinary turbulence. In the latter case, when v -+ 0, the dissipation rate of kinetic energy remains finite and v-independant. The analogy suggests that, asK-+ 0, the mixing rate �d remains finite and K-independent, and so is the volume averaged value of K,, from Eq. (17). When the flow is not laminar and K is not small, no theory exists, to our knowledge, and one should rely upon (numerical or laboratory ) experiments to compute K,.

Characterization of mixing in geophysical ftuids: traditional approach
Mixing has been investigated for several decades in the ocean. As we discuss it in the present subsection, the major difficulty in such studies is the computa tion of the turbulent mixing rate (�d -�lam) so that models have always been used to estimate it. Hence, the exact computation of the mixing rate from the background density profile described above is a novel approach of mixing. In oceanic models, the turbulent mixing rate is usually estimated from the average value of the advective flux of density: the average being a temporal, a spatial or an ensemble one; it is denoted by an overbar in the following (thus p1 = pp, w 1 = ww, etc). The idea of taking the averaged advective density flux to estimate the turbulent mixing rate is that the average filters out the oscillations of the advective flux and yields the residual contribution of diffusive processes. Unlike in the atmosphere, the direct measurement of p 1 w 1 has been attempted only rarely in oceanography. The main reason is that, in addition to the smallness of this residual contribution, most instruments profile vertically in the ocean and therefore cannot measure vertical velocities (see, e.g., !11] and !37] for a review). Therefore the oceanic turbulent mixing rate is usually inferred from microstructure measurements (at scales of about 1 m) through statistical models. In this approach, the turbulent diffusivity is defined by where N2 = -(g/Po)d'"fi/dz is the Brunt-Viii sala frequency.
A now classical statistical model was proposed by Osborn & Cox [26]. It relates the turbulent mixing rate to the dissipation rate of the variance of the density fluctuations. Assuming that the mean density gradient is about constant over the integral scale of the turbulence and that a statistically stationary regime has been reached, the equation for the variance of the density fluctuations reduces to: ap This equation states that the production of fluctuating density variance by the action of the turbulent vertical flux of density p'w' on the mean gradient apfdz is balanced by the dissipation rate of this variance K IV' p' l2 . Using definition (19) for the turbulent diffusivity, Eq. (20) yields the expression: This relation was originally established by Osborn & Cox [26] for the variance of the temperature fluctuations. It can be expressed in terms of the density field, as done here, when the Boussinesq approximation is valid.
The same approach was used by Osborn [25] when the turbulence is sustained by a mean shear flow U(z) . Under the same assumption of a statistically steady state, the equation for the velocity variance is (written per unit mass): As above, this equation states that the production of turbulent kinetic energy per unit mass by the action of the Reynolds stress U!ii1 onto the mean velocity gradient dU f dz is balanced in the mean by the sinks of turbulent kinetic energy; these are the dissipation rate of kinetic energy € and and the turbulent mixing rate per unit mass gf p0p'w' (which increases the potential energy resulting from mixing). Introducing a flux Richardson number: Eq. (22) can be rewritten in a simplified way: Using the definition of the mixing efficiency 'Y = 1!!/i1, the turbulent mixing rate (per unit mass) becomes related to the dissipation rate of kinetic energy (instead of the dissipation rate of fluctuating density variance): g -p'w' = "'f€.
P o (25) The turbulent diffusivity within this model is thus defined by, using (19) and (25)   (26) From a practical point of view, € is measured from a single component, assuming isotropy (the validity of this assumption has been addressed in [40] and [36]). A constant value of 0.2 for 'Y is generally adopted. This value has been inferred from a conjecture by Osborn [25] (stemming from observations) that Rf � 0.15. Because the maximum value for Rf is not firmly established, an empirical value for 'Y equal to 0.2 is used. As pointed out by Caldwell & Mourn [4], 'Y is not con stant: values comprised between 0.05 and 0.48 have indeed been computed from microstructure measurements in the ocean (assuming K : = K �), which seem to depend upon the instability that creates the turbulence. In the atmosphere, 'Y displays the same variability about the 0.2 value. 3 Mixing in a stably stratified shear layer In section 2, we introduced basic concepts of mixing along with two estimators of mixing (the flux Richardson number and the turbulent diapycnal diffusivity). We now illustrate these theoretical points and consider for this purpose the stably stratified shear layer. Most of the results presented in this section stem from [32] and [33].

The stably stratified shear layer: a brief introduction
The shear layer is the simplest case of a inhomogeneous shear flow. It is formed by two layers of fluids having different densities and horizontal velocities. The interface between the layers, which thickens by molecular diffusion, enlarges most efficiently if the interface is unstable. A necessary condition for instability is the Richardson number It is therefore a two-dimensional instability. As the amplitude of the perturba tion grows, nonlinear effects gain importance and makes the instability saturate while the flow vorticity organizes into quasi-horizontal vortices (with axis par allel to the mean shear). The vortices are most unstable to a (two-dimensional again) subharmonic instability which makes them pair (see e.g. [8] and, for a re view, [13]). Secondary three-dimensional instabilities may also occur, at smaller scales than the two-dimensional instabilities. Further detail about the three dimensional instabilities in a stably stratified shear layer may be found in, e.g., [28), [16) or [5).

Mathematical model and numerical method
To investigate the shear layer dynamics and resulting mixing properties, we solve the three-dimensional Boussinesq equations: u = (u,v, w ) denotes the velocity field with u, v and w being respectively the streamwise, spanwise and vertical components; P and p are the dynamical pres sure deviation and the density deviation about a hydrostatic balanced state of reference density p0; vis the kinematic viscosity.
The initial velocity and density fields, U i (z) and Pi(z) respectively, are of the error function type, the thickness of the density profile being ffr larger than that of the velocity profile (Fig. 3). A two-dimensional perturbation is superim posed upon U i to destabilize the flow (the Kelvin-Helmholtz instability is thus able to grow). Three-dimensional perturbations are also added to create three dimensional motions. The boundary conditions are periodic along the streamwise (x) and spanwise (y) directions and of free slip type along the vertical (z).
The Boussinesq equations (28)(29)(30) are written in non dimensional form. We take half the thickness of the initial velocity profile as a length scale, which we denote as 8 i , half the velocity difference across the flow as a velocity scale (U) and half the density difference across the flow times /Pr, as a density scale 11 (Llpffr/2). The time unit is 8 i /U. This scaling sets three parameters J, Re and Pr, which control the fiow dynamics. They are defined by: A pseudo-spectral method is used to solve the Boussinesq equations: the equa tions are solved in Fourier space except for the nonlinear terms, which are com puted in physical space; aliasing errors are removed by a truncation scheme; symmetry conditions are imposed along the z-direction so that the fields can be expanded in cosine or sine series along this direction. A third-order Adams Bashforth scheme is used to advance the equations in time.
We have performed several runs in which J ranges from 0.04 to 0.16 and Pr has a value of 0.7 or 1.4. The resolution is ( Na:, Ny, N.,) = (128, 128, 257) and the domain size is (A, A, 2A), where A is the most amplified wavelength predicted by linear theory (A is nearly independent upon J, as long as J < 1/4 [12]). Hence, only one Kelvin-Helmholtz vortex can develop in the numerical domain. The Reynolds number Re :::: :: : 300. The approximate value for Re stems from the fact that a uniform but time varying viscosity is used. Indeed, in order not to have the flow dominated by both stratification and viscous effects at the end of the runs, the viscosity is allowed to vary in time according to the constraint that the Kolmogoroff scale 11 = (v 3 fe)114 remains equal to the grid size at any time.
Computing e (the dissipation rate of kinetic energy ) at each time step provides the value of v at that time. In practice, 11 smoothly decreases by a factor 3 from its initial value (equal to 1/300) up to the time e has reached a maximum value (the fiow has become strongly turbulent) and eventually reaches a value lower than 11(0) at the end of the run (equal to 1/500).
An example of the fiow development is displayed in Fig. 4. At t = 34 (Fig.  4a), the Kelvin-Helmholtz instability has grown and saturated: it is manifested as one Kelvin-Helmholtz vortex, already distorted by the growth of the three dimensional perturbation. The vortex is located in between sheets, which are the usual braids reinforced by the baroclinic torque (see [7], [30]). The saturation of the Kelvin-Helmholtz instability is followed by the growth of a three-dimensional secondary instability. Fig. 4b displays the total vorticity field at the time the dissipation rate of kinetic energy e is maximum: the three-dimensional instability has led to small scale structures, whose dominant vorticity component lies along the streamwise direction. This instability is thus of the Rayleigh-Taylor type, having occurred when heavy fluid lies over light fiuid ( [16] ) . The end of the run is visualized in Fig. 4c: the central part of the shear layer is relaminarizing and the turbulent activity, while decaying, is expulsed further outward of the central region.
In the following, the Kelvin-Helmholtz vortex will be referred to as a large scale structure, as opposed to the small scale three-dimensional structures. These adjectives may be quantified by the Reynolds number associated with each struc ture: Re:::: :: : 800 in the former case andRe:::: :: : 25 in the latter.

Analysis of mixing
The mixing properties of a stably stratified shear layer have been studied by a few authors. To our knowledge, the only laboratory experiments are those of Thorpe [34] and Koop & Browand [17]. All the other works are numerical ( [29], [9], [31], [32], [61). The method derived by Winters et al. [38] is used in the two latter papers only (we still note that background density profiles are computed in [29]).

• Computational method of the background density profile
The method to compute the background density profile Pb from the three dimensional density field is as follows. The fluid domain is partitioned into el ementary volumes of side equal to the grid size. Each elementary volume rep resents one fluid particle. These fluid particles are next sorted according to the value of their density. Two equivalent methods can be used to. construct the stable density profile. In the first method ( [38], [32]), the profile is constructed by filling the successive horizontal planes with the ordered particles, the bottom plane being filled with the heaviest particles. The value of the density slightly varies from particle to particle in a. given plane (by a few percents at most in our computations) so that the value of the background density at each plane is computed by a horizontal average. In the second method ( [6]), each particle is stretched horizontally so as to fill successively each horizontal plane; no horizon tal average is therefore required. In numerical computations however, the density field needs to be interpolated if combined with other quantities, to match the vertical resolution.

• Horizontally averaged and background density profiles
The horizontally averaged and background density profiles, p(z) and Pb(z) respectively, are plotted in Fig. 5 at the same key times as in Fig. 4, but for a different run (see caption). Figure (Fig. 5c), when the flow starts to relaminarize in its central part and turbulence decays in its outer part, do both profiles coincide. They have nearly become linear and strongly depart from the initial profile.
• Turbulent mixing rate The temporal evolution of the turbulent mixing rate �d-�lam offers a syn thetic view of the respective contribution to mixing by the Kelvin-Helmholtz vortex and the three-dimensional structures. Fig. 6 thus shows that �d-�lam maximum which is nearly three times higher than the relative maximum (note that � d/�tam � 20 when these maxima are reached).

• Mixing efficiency
The previous results demonstrate that mixing is promoted by small scale three-dimensional structures, and not by a large scale Kelvin-Helmholtz vortex. But what about the efficiency of mixing? To address this important point, and in order to interpret it properly, we first present in Fig. 7 the dissipation rate of kinetic energy € for the same run as before. The behavior of € is not surprising: it remains close to its initial value when the Kelvin-Helmholtz instability grows and saturates, because this instability, and the associated Kelvin-Helmholtz vortex, are two-dimensional. As is well-known (e.g. [20]), the material conservation of vorticity in a two-dimensional flow prevents energy from being transferred toward small scales, where dissipation is acting. By contrast, Fig. 7 shows that the growth of a three-dimensional instability, and its nonlinear manifestation as small scale structures, makes € strongly increase, up to a value equal to nearly 20 times its initial value. The behavior of the mixing efficiency or, equivalently, of the flux Richardson number R/b can be immediately inferred from Figs. 6 and 7. Figure 8, where R/b is displayed versus time, thus shows that the efficiency of mixing is high (close to 0. 7) during the initial two-dimensional stage of the flow and moderate (close to 0.3) when three-dimensional secondary structures have developed. In other words, the large scale Kelvin-Helmholtz vortex contributes weakly to mixing, but in an efficient manner (because nearly no energy is dissipated) while the three dimensional structures are responsible for the mixing of the fluid, but much less efficiently (because they occur at scales close to dissipative scales). Hence, the Kelvin-Helmholtz vortex mainly stirs the fluid while the three-dimensional structures effectively mix it. Pr (equal to 0.7 or 1.4) (from [33]).
• A parameterization of mixing: turbulent diffusivity versus Richardson number As said in the Introduction, the parameterization of mixing consists of a func tional dependency of the diapycnal diffusivity Kb (defined by Eq. (16)) upon a dynamical parameter characterizing the largest scales; in a general circulation model, the latter scales belong to the resolved scales. Such a functional depen dency may be seeked for in the numerical simulations of a stably stratified shear layer we presented because the largest scale is a shear flow (that is, with no horizontal scale) and is well separated from the scale of the three-dimensional motions that mix the fluid. One may thus assume the former motion to be as sociated with a current that would lie among the resolved scales in an oceanic circulation model while the latter motions are at subgrid scale.
The presence of a mean shear implies that a Richardson number should be taken for the dynamical parameter. We make use of the monotonicity of the background density profile Pb to introduce a Richardson number that remains one-signed at any time: The normalized diapycnal diffusivity Kb(z)/K is plotted as a function of Rib(z) in Fig. 9a, when the turbulence has started to decay (for t ?: 196; no clear dependency could be found before this time). Two parts may be distinguished in Fig. 9a, where a log-log scale is used: for Rib < 0.2, Kb(z)/K decays quasi linearly with Rib(z). This decaying behavior is expected: when Rib decreases, stratification effects weaken relative to nonlinear effects so that the flow becomes more turbulent and mixing increases. This part of the curve is contributed by the outer region of the shear layer, where turbulence remains. Figure 9a shows that values of the turbulent diffusivity Kb up to 200 times the molecular diffusivity are found in this region. For Rib(z) > 0.2 by contrast, that is, in the central relaminarizing part of the shear layer, no clear dependency is observed. The quasi-linear dependency of Kb(z)/K with Rib(z) displays a -2 slope. This is confirmed in Fig. 9b, where Kb(z)/K. is plotted versus 1/Ri�(z): the lin ear dependency is recovered but the slope varies in time. In the ocean, below the upper mixed layer, several parameterizations are used depending upon the pro cess that creates the turbulence: shear instability, internal gravity wave breaking and double diffusion (due to the simultaneous diffusion of heat and salt). For the former process, a parameterization of the form K/ Kmax = [1-( Ri/ Ri0)2]3 is commonly used in large scale circulation models for 0 < Ri < Rio, with Kmax = 5 10-3 m 2 /sand Rio = 0.7. The high value of Rio stems from the fact that the measurements of Ri rarely fall as low as the theoretical value of 1/4. According to Large et al. [18] , this parameterization is supported by lim ited observations ( [27]). Our numerical results suggest that this is a too sharp dependency.

Conclusion
As discussed in this article, basic concepts of mixing have been known for a long time (dating from Lorenz [22]) and are now properly formalized. The parame terization of mixing still remains a difficult problem. Parameterizing implies a fundamental understanding between mixing and flow dynamics. The existence of a link between mixing, which occurs at dissipative scales and the flow dynamics, whose energy reservoir is at large scale, is not surprising: mixing is controlled by these large scales, as is kinetic energy dissipation in (three-dimensional) or dinary turbulence. In the latter case, the problem is simplified by the fact that, at high enough Reynolds number, the kinetic energy dissipation rate depends only upon the rate of input of energy at large scale, and not upon the details of forcing (assumed to be isotropic). This is not true when a stable stratification exists: the memory of forcing is all the more important the higher the strat ification is. The details of forcing is most often associated with an instability that triggers the (kinetic and potential) energy transfer toward small scale. To derive the parameterization, the source of the instability (an unstable shear flow for instance) should rather be considered. Computing a dynamical parameter from the characteristics of this source may indeed lead to a more general behav ior than considering the instability itself, as is commonly done. This approach would also provide a usable parameterization of mixing, because the source of turbulence belongs to the resolved scales in large scale circulation models, while the instability that leads to fluid mixing does not in general.