An Ensemble-Based Eddy and Spectral Analysis, With Application to the Gulf Stream

The “eddying” ocean, recognized for several decades, has been the focus of much observational and theoretical research. We here describe a generalization for the analysis of eddy energy, based on the use of ensembles, that addresses two key related issues: the definition of an “eddy” and the general computation of energy spectra. An ensemble identifies eddies as the unpredictable component of the flow, and permits the scale decomposition of their energy in inhomogeneous and non-stationary settings. We present two distinct, but equally valid, spectral estimates: one is similar to classical Fourier spectra, the other reminiscent of classical empirical orthogonal function analysis. Both satisfy Parseval's equality and thus can be interpreted as length-scale dependent energy decompositions. The issue of “tapering” or “windowing” of the data, used in traditional approaches, is also discussed. We apply the analyses to a mesoscale “resolving” (1/12°) ensemble of the separated North Atlantic Gulf Stream. Our results reveal highly anisotropic spectra in the Gulf Stream and zones of both agreement and disagreement with theoretically expected spectral shapes. In general, we find spectral slopes that fall off faster than the steepest slope expected from quasi-geostrophic theory. normally reconstruction illustrates the role played by the EOFs in correcting the ensemble mean toward the realized jet. This reconstruction heavily involves the coherent patterns of the zonal and meridional velocities.

and provides fundamental measures that should guide eddy parameterizations. At a deeper level, the shape of the spectrum can lead to important clues about dynamical processes controlling the eddy field, with wellknown examples being those for quasi-geostrophy (QG; Charney, 1971) and surface quasi-geostrophy (SQG; Held et al., 1995;Lapeyre & Klein, 2006). Indeed with the advent of global surface observations from satellites and eddy-resolving models, there has been an emphasis in the physical oceanographic community on quantifying the wavenumber spectral slopes of mesoscale eddies (Callies & Ferrari, 2013;Capet et al., 2008a;Khatri et al., 2018). The general understanding has been that the energetic western boundary current and Antarctic Circumpolar Current regions, are a mixture of QG, mixed-layer instabilities (MLIs) and internal waves while the quiescent regions are governed by SQG and frontogenesis (Cao et al., 2019;Dong et al., 2020;Khatri et al., 2021;Vergara et al., 2019;Xu & Fu, 2011, 2012.
The derivations of the QG and SQG spectral shapes rest on a number of assumptions that render the problem tractable. Amongst the most essential, in addition to QG, are those of statistical stationarity and spatial homogeneity. Implicit also are the assumptions that mean fields are irrelevant and forcing is constant.
These are constraints which clearly do not apply to most oceanographic settings. For example, the Gulf Stream possesses a very strong and structured mean flow to which the eddy field is sensitive. Any Gulf Stream mean flow also almost certainly represents a response to a temporally variable atmospheric forcing. Characterizations of this sort are not restricted to the Gulf Stream, but can arise in virtually any part of the ocean. Immediate connections to theoretical predictions are thus somewhat obscured, but one hopes that the predictions represent demonstrations of general statements which have been derived in special settings.
To test this idea requires calculations of spectra that are valid in the inhomogeneous and non-stationary regions of geophysical interest. Traditional spectra, in contrast, employ assumptions of stationarity and homogeneity and process the data prior to analysis in ways that distort the underlying structure and thus interfere with their ultimate interpretation. The results are therefore somewhat suspect in their validity. We argue analyzing ensembles of models permits calculation of spectra in non-stationary and inhomogeneous settings. Admittedly, modeling is special in that an ensemble can be built, as opposed to observations, where all that exists is the single, observed realization. A lofty, long-term goal is the development of ways to view the single, observed realization as a member of an ensemble, thus embedding its interpretation within an ensemble framework. This paper has two objectives. The first objective is to address the above issues within the context of numerical modeling by exploiting the relatively recent methodology of ocean ensemble generation. Two methods for spectral calculation are proposed that provide complementary views of the eddy energy field. Both satisfy Parseval's equality, and therefore can be interpreted as wavenumber dependent energy spectra. The first is a relatively straightforward generalization of classical spectral analysis, while the second is related to empirical orthogonal function (EOF) theory already widely used in oceanography. The latter technique has also been employed by the turbulence community (Berkooz et al., 1993;Lumley, 1970;Moser, 1994). These two techniques provide complementary views of the eddy energy field. The second objective is to apply the techniques to the separated Gulf Stream region as a proof of concept and to comment on the local eddy spectra. Accordingly, we find our results are somewhat at odds with previous spectral estimates in the Gulf Stream and we find departures from theoretical power law behaviors. The results also comment on the assumptions of isotropy that are often built into classical spectral analysis.
The next section introduces ensemble based mean and eddy definitions and reviews the basics of Fourier spectral analysis. We then describe our procedures for energy spectral computation and argue their roles as generalizations of the Fourier spectral approach to inhomogeneous and non-stationary settings. The application of these procedures to the separated Gulf Stream appears in Section 3 and we end with a brief summary and discussion of further applications and developments.
10.1029/2021MS002692 3 of 23 0.25° resolution. More recently, Jamet et al. (2019Jamet et al. ( , 2020) analyze a regional North Atlantic ensemble consisting of 60 members in various configurations at an eddying 1/12°, and Aoki et al. (2020) discuss an 80-member 1/36° eddy-resolving regional Kuroshio model. It is clear that oceanography is still in the early stages of exploiting ensemble methodologies, particularly at resolutions adequate to reliably host mesoscale eddy dynamics.

Ensemble Means Versus Classical Means
As emphasized above, the idea of an ensemble is rooted in numerical simulation, where collections of possible solutions of the governing equations can be analyzed. Observations, in contrast, are unique; no other observed ensemble members can be obtained. But, if numerical simulation is to be believed, observations are composed of both mean and eddy components, and represent a single realization of the dynamical state of the ocean. It is here that analyzing a numerical ensemble can assist in rationally decomposing observations into a "mean" flow with superposed "eddies." Consider a collection of spatially and temporally variable data, f i (x, t), where i denotes the ith member of a set of size N. Numerically, this collection will have been generated by solving the equations of motion subject to specified forcing, initial and boundary conditions. A common procedure for ensemble generation, used in this study, involves holding forcing and boundary conditions fixed and varying initial conditions. Given the chaotic nature of the fluid equations, different initial conditions develop into different flows that, due to their adherence to the equations, are nonetheless dynamically consistent states.
An ensemble mean can be formed via: representing that part of the original data present in all the members. We refer to this quantity as the "mean" flow. Since the collection possesses common forcing and boundary conditions, we will interpret the mean as reflecting the presence of those features throughout the domain. In this sense, the "mean" is the predictable, or reproducible, component of the flow.
The fluctuations about the mean for each member, ′ ( , ) = ( , ) − ⟨ ( , )⟩ , are dynamical contributions to the data not common across the members, and arise due to the differing initial conditions. The nonlinearity in the equations essentially assures us that small initial differences lead to large differences in dynamical state over relatively short times. Defining the "eddies" as ′ identifies them as the effectively unpredictable components of the flow. Of course, both the mean and the eddies have some dependence on the ensemble size, N, but presumably converge to unique statements as N grows.
In contrast, the means that can be formed from a single realization, such as, for a temporal mean, depend explicitly on the parameter T. The "eddies" associated with Equation 2 are obtained as the residuals about the mean and also reflect T. Neither the mean nor the eddies converge to unique statements as T grows. Since there is no basis for the choice of this parameter, both the mean and eddies are somewhat ambiguous. It is also common practice when analyzing observations to invoke an assumption of stationarity, or equivalently that the value of the absolute time t doesn't matter. This is a questionable assumption for most parts of the ocean. It is possible to partially remedy any such concerns by performing conditional averages, such as by averaging over the so-called DJF winter months for several consecutive years. But even then, the assumption is made that the value of the year is unimportant, which is again questionable given interannual variability.
These confounding issues do not plague ensembles, leading us to our first proposition that Equation 1 yields an unambiguous statement of what is meant by the mean flow and eddies. In this definition, it is recognized that any inhomogeneity or non-stationarity of the eddy field is captured.

Overview of Fourier Spectra
We briefly review classical spectral analysis focusing on spatial spectra, although similar techniques apply to frequency spectra.
In any practical situation, we have spatially finite data, u(x), over a domain A. The data can be forced into a periodic form by tapering it such that the data goes to zero at the domain edges, and then repeating the data indefinitely in space. A Fourier series representation for a purely periodic function always exists, and, calling uw the tapered data, is where x = (x, y) and L x , L y are the spatial periodicities of uw, such that, The quantities p, q are integers. For the remainder of the study, we only consider the eddy field (uw, u) obtained by removing the ensemble mean. The Fourier coefficients are given by: where A = L x L y is domain area.
We can form the product ; =̂;̂ * ; where the * denotes a complex conjugate. A fundamental idea behind spectral analysis is that the data u w is random, and hence that e w;n,m is also random, if non-negative. Averaging is required to arrive at an estimate of the underlying spectrum. The theory behind spectral analysis assumes that an ensemble average is performed. In practice, there are several techniques that are used for averaging. Standard techniques include averaging neighboring spectral estimates, or the original data is broken into several pieces and the spectral estimates at each wavenumber from the various areas are averaged. Note that in this process windowing necessarily impacts the domain scale spectral estimates. If the underlying fields are homogenous, the result is an estimate of an ensemble average. Denoting the average by brackets, 〈⋅〉 where ψ = (ψ, η) denote location. At this point, the assumption of spatial homogeneity is explicitly introduced by saying that the statistics of the field depend only on separation in each of the spatial dimensions, The Fourier transform of the two point correlation function ρ uu can now be computed and is usually written in the form, where the Fourier transform of ρ uu is written ; , and interpreted as (twice) the spectral energy density of the spatial series uw(x). This interpretation comes from Parseval's theorem, that is, as can be shown from Equation 6, arguing that the ensemble mean energy in domain A can be broken into contributions of energy ; in the waveband n, m. An additional assumption of isotropy is sometimes invoked, which further reduces the correlation function from ρ uu (|x − ψ|, |y − η|) to ρ uu (|x − ψ|). Well-known issues with this classical approach are that (a) in important parts of the ocean, like the separated Gulf Stream jet, the assumptions of homogeneity and isotropy are not justifiable, (b) the additional assumption of stationarity in the view of seasonality and intrinsic variability is suspect, and (c) the need to taper the data also complicates the understanding of the results.

The Classical Approach
Perhaps the simplest and most straightforward way to use ensembles to avoid the above mentioned issues is to replace the averaging methods in Equation 6 by an ensemble average, denoted by angle brackets (〈⋅〉). A somewhat subtler point is that this can be done on the Fourier transform of the original data, rather than the Fourier transform of the windowed data, where longer length scales have fewer degrees of freedom due to the windowing. (A broader discussion of the issues associated with windowing is provided in Appendix A.) Referring to the latter as , the spectral energy estimate becomes It is straightforward to show Equation 10 satisfies Parseval's equality based on the original data: rather than the windowed data. This permits the interpretation of ; as the wavenumber dependent decomposition of (twice) the ensemble mean kinetic energy of the eddies. In addition, the spectra belongs to the region A; that is, no assumptions of homogeneity are involved and the wavenumber decomposition is that of the domain.

An EOF Based Approach
A different, but equally valid EOF based decomposition was proposed by Moser (1994), who was interested in three-dimensional, inhomogeneous turbulence. Consider the integral equation where r ij = 〈u i (x)u j (x′)〉 is the two point, (x, x′), covariance matrix of a velocity field. The subscripts i, j track the velocity components, the brackets (〈⋅〉) again denote an ensemble average and repeated indices i, j imply summation. Equation 12 defines an eigenfunction/eigenvalue (ϕ i (x), λ) problem which, as pointed out by Berkooz et al. (1993), is a classic problem in the calculus of variations. Specifically, that problem is to find the (eigen) functions ϕ i , from the class of all functions, which are "most similar" to the velocities, , of all the ensemble members. The resulting decomposition into the set of functions ϕ i (x) arises in several branches of physics. In the turbulence community, this is known as the Proper Orthogonal, or Karhunen-Loeve, decomposition (Berkooz et al., 1993;Lumley, 1970); in oceanography, the eigenmodes are equivalent to EOFs (Preisendorfer & Mobley, 1988).
First, note if r ij is homogeneous, which for a finite sized observation set implies periodicity, then the choice ϕ j = e 2πik⋅x satisfies Equation 12. Note this is true for both components of the of the velocity field, hence the insensitivity of the eigenmode to the index j. Equivalently, Fourier modes act as the needed eigenfunctions in that case. Further, if the eigenfunctions are Fourier modes, r ij must be homogeneous (Berkooz et al., 1993), so Equation 12 defines Fourier modes as the proper expansion basis for homogeneous flows. The novelty of the analysis we present here comes when r ij is not homogeneous, in which case Equation 12 yields a more general spatial decomposition.
For simplicity of discussion and familiarity within oceanography, the integral equation in Equation 12 will be converted to a discrete form, where it becomes a matrix equation. Restricting the discussion from here on to two horizontal dimensions of ensemble and space, Equation 12 can be written where and To fit the form of Equation 13, the data array u ij composed of velocity values at spatial location (x i , y j ) multiplied by an area-like element (to account for the spherical coordinate in a consistent manner with MITgcm's finite volume discretization; Appendix C), is converted to a vector ↦ +( −1) of length n x n y where n x and n y are the number of observations in the zonal and meridional directions, respectively. The matrix [R] is of dimension (2n x n y ) × (2n x n y ) and [ϕ] is a vector of length 2n x n y , there being two velocity components. The matrix [R] is also symmetric, that is, R ij = R ji where i, j are row and column matrix locations, and therefore basic linear algebra assures us that [R] [ϕ], and λ possess a number of properties. Principle among these is that [R] can be diagonalized, that is, where [Φ] is a matrix whose ith column is the eigenfunction ϕ i and [Λ] is a diagonal matrix whose (i, i) component consists of the eigenvalue λ i associated with ϕ i . The notation [Φ] T denotes the transpose of the matrix [Φ]. Further the eigenfunctions are orthogonal and can be normalized, where I is the identity matrix and A is the domain area.
Another useful result of linear algebra is that the diagonalization in Equation 14 implies the traces of the matrices [R] and [Λ] must be identical. Noting that is twice the ensemble-mean eddy kinetic energy, the sum of the eigenvalues measures the mean energy of the samples. In more familiar EOF language, the eigenvalue represents the fraction of the observational variance captured by its associated mode. We remind the reader that while it is common to define the eddy velocity (u i ) as temporal anomalies in EOF analyses (e.g., Hannachi et al., 2007), here, we define it as the fluctuations about the ensemble mean.
The full set of eigenfunctions, ϕ k , is complete, so it is possible to represent each of the original ensemble members u n using the ϕ k as a basis, where the superscript n denotes the ensemble index, and k is the index of EOF mode. In principle, M = 2n x n y , as the matrix R is of size 2n x n y × 2n x n y . However, by appealing to singular value decomposition theory, the number of significant eigenvalues is set by the ensemble size. Using the orthogonality condition equivalently Equation 15 in its discretized form, with the spatial integrations equivalent to summing over discretized spatial points and A = ∫ x dx. Note, unlike the standard approach where each velocity component is transformed independently, the EOF procedure operates directly on the full velocity vector. A single set of expansion coefficients, a n;k , along with the vector-valued basis elements, = ( 1 , 2 ) exactly reconstructs both components of the velocity vector in any realization; the spatial distribution of u n comes from the first half of ϕ k (i.e., 1 ) and that of v n from the second half ( 2 ).
Inserting the decomposition given in Equation 17 into the definition of the ensemble-mean eddy kinetic energy and using the orthogonality of the EOF basis implies, In other words, the sum of the expected variance of the coefficients a n;k is also twice the ensemble-mean eddy kinetic energy (Moser, 1994). In as much as a n:k in the present case plays the same role as in a standard Fourier analysis, this completes the connection between the {ϕ} and Fourier bases. The expected value of the coefficient magnitudes can be related to the sample energy as in Equation 20, which is effectively a Parseval's equality. The difference here is the expansion basis is the vector valued set {ϕ}, rather than the independent complex exponentials employed by Fourier analysis. Note also that it has not been necessary to taper the data.
At this point, what remains is to assign a length scale to each member of the set {ϕ}. This is important for the application of the decomposition in tests of theory, as the equations of motion relate length scales, energy levels and parameters to expectations for spectral shape, as in Kolmogorov (1941). For Fourier modes, the length scale is obvious as the inverse of the wavenumber. Note, if ϕ is a Fourier complex exponential (viz. ϕ = e −ik⋅x ) Equation 21 can also be applied to the eigenfunction to generate a lengthscale. In this paper, we define zonal (k) and meridional (l) wavenumbers independently according to: and (recall that 1 ∫ 2 = 1 ). The subscripts x and y denote partial derivatives. While we acknowledge that there may be other ways to assign a length scale to the EOF modes, we argue that our definition is consistent with how Fourier length scales are defined and provides for a generalization of Fourier spectra (also see Appendix D).

Connections Between the Classical and EOF Based Spectra
The two techniques described above both decompose the same domain integrated kinetic energy, and so are related. To see this, note that the velocity realizations can be expressed in terms of Fourier transforms and EOFs as, Each ( ) can be itself Fourier analyzed leading to the result̂; By design, the EOFs produce the most compact representation possible of the kinetic energy field. This is inherent in their derivation as a solution to a maximization problem. EOFs are also aware of the statistical connections between the velocity components u and v as they are both in the covariance matrix [R]. Because each of the EOFs can be reconstructed from Fourier modes, they are composed of the contributions of the variance from the various wavenumbers which are needed to meet this maximally compact constraint. In this sense, the EOF decomposition provides a view of the energy field that is complementary to the classical Fourier spectrum.
To summarize, generalizations of spectral theory to non-homogeneous settings have been described. Given an ensemble of velocity realizations, averaging can be performed directly to arrive at classical Fourier based energy spectra, or EOFs of two-point correlation function [R], can be estimated and used to compute an associated energy decomposition. Neither procedure requires windowing, or tapering, the data (for further details, see Appendix A). Both procedures yield energy spectra satisfying Parseval's equality, so the energy structure as a function of length scale can be compared with theoretically expected slopes.

Application to the Separated Gulf Stream
An essential element of this analysis is the use of ensembles to define both eddies and perform needed averaging. We here use 36 realizations from a North Atlantic ensemble extending from 1963 to 1967. The model consists of a 1/12° deployment of the MITgcm (Marshall et al., 1997). The strategy used to produce the 36-member ensemble is described in Appendix B. We will pay particular attention to an area in the separated Gulf Stream located away from any topography. . The domain was chosen to focus on the separated Gulf Stream. Also, as we do not window the data prior to taking the Fourier transform (see Appendix A), we can examine the full domain size length scales. We will concentrate on the depths of 94 and 628 m, that is, near surface and middepth zones where spectral expectations differ. Figure 2 contains plots of ensemble mean zonal and meridional velocities at these depths. The near surface zonal flow is quite strong and is accompanied by a meridional flow weaker by a factor of four. Gulf Stream baroclinicity is evident, with a reduction of mean flow to roughly 0.3 m s −1 in the zonal direction at 628 m, and of order 0.05 m s −1 in the meridional direction. These fields define the inhomogeneous environment on which the eddy field develops. The mean flows appearing in Figure 2 are removed from all the 36 ensemble members, leaving us with effectively 35 degrees of freedom for the description of the eddy field.

Fourier Wavenumber Spectra
The result of computing the two dimensional, zonal, meridional energy spectrum according to Equation 10, and to a detrended version of the data, appear in Figures 3 and 4. (The method of detrending is detailed in Appendix A.) Figure 3 comes from the near surface, at a depth of 94 m and Figure 4 from 628 m, which is well within Figure 1. North Atlantic regional model domain. The contours are the ensemble-mean sea surface speed for 12 a.m., 1 January 1967. The white box encloses the region studied in this paper.
the main thermocline and away from surface influence. The left hand column shows the two-dimensional spectra and the right hand column one-dimensional projections of the two-dimensional spectra, along the zonal (blue) and meridional (green) directions. The upper row in both figures shows the spectrum obtained from the full data, and the lower row from a detrended version of the data. The solid black line is the result of azimuthally averaging the 2-D spectrum to produce an isotropic spectral estimate. The magenta line follows the k = l trajectory in the wavenumber plane.
The left-hand column illustrates that the spectra are strongly non-isotropic, suggesting that Gulf Stream structure is imprinting on the eddy field. Indeed, the primary spectral amplitudes occur along the zonal and meridional directions. The right-hand column shows interesting similarities and departures from the theoretically predicted spectral slopes of −5/3, −2, and −3. These slopes appear on the right as the dot-dashed gray lines and represent the expected slope of the upscale energy cascade (−5/3), frontal, MLI or internal waves (−2) and the enstrophy or SQG cascades (−3) (Cao et al., 2019;Charney, 1971;Dong et al., 2020;Held et al., 1995;Khatri et al., 2021;Vallis, 2017). The overall structure of the spectra compare well between the surface and the main thermocline in that the relative spectral slopes from the two depths are similar. The primary distinction lies in the spectral amplitudes, which are larger roughly by a factor of two near the surface, as expected.
The overall trend for all the one-dimensional projections is that they fall off at faster rates than the steepest −3 slope and tail off toward −5/3 at the largest wavenumbers, although the full data and detrended data differ in the details. For the full data, the slopes are steeper than −3 at wavenumbers smaller than ∼3 × 10 −2 cpkm but shoal toward −5/3 at larger wavenumbers. For the detrended data, the opposite is true: the slopes are steeper for wavenumbers larger than ∼10 −2 cpkm but roll-off toward −5/3 at smaller wavenumbers. The isotropized slope is essentially identical to the meridional projection, although it is clearly not representative of either the zonal or k = l projections. All slopes are generally different from those reported in Ajayi et al. (2021), all of which assumed isotropy and all of which were more characteristically between −2.5 and −3. Capet et al. (2008b) describe spectra with slopes of −2 in the California Current system. We note that the models examined in both of these studies were at much higher, sub-mesoscale permitting resolution, a feature which may well account for much of the difference. Our results do however question the interpretation of azimuthally averaged spectra in the Gulf Stream region of the ocean. We suspect this will be true also with sub-mesoscale resolutions in regions of strong mean flow.
The spectra appearing in the upper panels (Figures 3a, 3b, and 4b) are characterized by much larger amplitudes extending along the k = 0 and l = 0 axes relative to the lower panels (Figures 3c, 3d, and 4d). These ridges, while two to three orders of magnitude lower than the energies at low wavenumbers, are due mostly to the presence of the discontinuities at the zonal and meridional boundaries of the domain causing the slopes to shoal toward higher wavenumbers (cf. Figure A1). This is demonstrated by the lower plots, where spectra removing the discontinuities by means of detrending are shown. Note that the detrending has an unnoticeable effect on the energies at low wavenumbers, indicating that the largest spectral amplitudes are dominated by the structure interior to the domain. Anisotropy is evident in both plots. It is generally seen that the full spectra drop off somewhat more steeply than the detrended spectra, but in all cases the slopes are greater than −3 at scales O (100 km).
In summary, departures from theory appear in these spectral representations of the Gulf Stream eddy field. Perhaps clearest is the lack of spectral isotropy in the two-dimensional spectra. Beyond that, well resolved model length scales tend to exhibit steeper slopes than are predicted by theory.

An EOF-Based Spectral Examination
It is also possible to use the ensemble to estimate the correlation matrix from the domain in Figure 1 and to extract its eigenvalue/eigenfunction content. We show in Figure 5 the first two velocity eigenfunctions for 94 m depth (top row) and 628 m (bottom row) and third and fourth eigenfunctions in Figure 6. Perhaps the most prominent Note the units of spectra are all in spectral density. The 95% confidence intervals are shown in the colored shadings (see Appendix E for details). The dot-dashed gray lines on the right are slopes of −5/3, −2, and −3, which are the upscale energy, frontal and enstrophy or SQG cascade slopes expected from QG theory. The solid black line is the azimuthally averaged slope extracted from the two-dimensional spectra. The left column shows the spectra are highly anisotropic.
feature of the eigenfunctions is their similarity in structure even though separated by roughly 530 m in the vertical. Beyond that, first EOF resembles a standing eddy. Given that the mean flow exhibits a standing meander in the region, this mode represents strengthening or weakening of this feature. The second mode is much more zonally elongated, and shows reversals to the north and south of the primary flow at jet center. This mode captures the broadening or narrowing of the main Gulf Stream jet. The next two modes are more spatially complex, and difficult to interpret simply in terms of their effects on the Gulf Stream. However, both are amplified in the eastern sector of the domain, suggesting an increasing tendency for the Gulf Stream to become less coherent with increasing separation from the coast.
We next computed the modal spectra within the 4° × 4° square at the two depths of 94 and 628 m. The results appear in Figure 7 as three-dimensional scatterplots. The horizontal axes are the log of the wavenumbers and the vertical axis is the log of the spectral amplitude. Each eigenmode is assigned a single zonal and meridional wavenumber according to Equations 22 and 23, and the locations of those pairs are indicated by the blue dots. The spectral amplitude of the eigenmode for each wavenumber pair appears as the magenta crosses. The projections of the energies on the zonal and meridional axes appear as the green and red dots, respectively, and on those planes, black lines with a slope of −3 are given as reference.
As suggested by the solid black lines, the EOF spectra typically fall off more steeply than the −3 slope, which is consistent with the classical spectral results. There is no clear indication of the −5/3 slope at low wavenumbers hinted at in Figures 3 and 4. Beyond this, there are interesting comparisons to be made with the classical results, which underscore the different philosophies behind the two spectral estimates. The assignment of single zonal and meridional wavenumbers to each mode implies that the EOF spectral plot is really a single line in the three-dimensional wavenumber, energy diagram. This shows up in Figure 7 in the blue dot distribution in the k-l plane, which are the collection of the EOF "wavenumbers." The magenta dots, one per blue dot, describe a trajectory in the plot, nominally following the line described by the blue dots. Note that, although there is scatter, the spectral line is closer to an "isotropic" configuration than would be indicated by Figure 3. Best fit wavenumber lines suggest l = 1.4k near the surface and l = 1.1k at depth. Thus there is a weak indication of compression of the variability in the across-stream direction near the surface, and less so at depth. This is seemingly in keeping with the stronger down stream flow found near the surface relative to that at depth. In any case, the picture emerging from these plots is much different than the anisotropic configurations seen in Figures 3 and 4.
These very different spectral impressions can be understood by appealing to their derivations. The EOF spectra emerge from an examination of the two-point correlation function, Equation 12, which in turn contains information about statistical relationships between zonal and meridional velocity. In contrast, all such information is lost in the classical Fourier spectra, which consider the two velocity components independently. The spectra of the individual zonal and meridional velocity components from the detrended data appears in Figure 8, where the top row is from 94 m and the bottom row is from 628 m. The left column is the zonal velocity spectrum and the right column the meridional velocity spectrum. The sum of each row yields the full spectrum seen in the bottom rows of Figures 3 and 4. The zonal flow tends to contribute variance to the total near to the k = 0 axis, meaning this component of the flow exhibits long downstream lengthscales, and pronounced cross-stream structure. The opposite is true for the meridional velocity, which contributes heavily near to the l = 0 axis, indicative of the downstream structure of the cross-Gulf Stream eddy flow. When added, the spectrum that emerges is the anisotropic version observed in Figures 3 and 4. These various structures however, are not statistically independent, and when this is accounted for, by analyzing R (x, x′), the spectral representation falls along a line much closer to k = l. Recall that both energetic decompositions completely reconstruct the total kinetic energy in the domain, so both are equally valid descriptions. The EOF spectra, by combing through the wavenumber plane to combine the coherent variance at each wavenumber as determined by the statistics, provides a view of the eddies in a manner complementary to the classical spectra.
The statistical dependence of the u and v fields plays an important role in describing the regional variability. It is possible to partially reconstruct the velocity fields of the realizations by adding the contributions of the first few modes to the ensemble mean flow. Doing so demonstrates that the role played by the first few EOFs is to mimic the sinuous motion of the Gulf Stream within the jet. This is shown in Figure 9. Figure 9a is the 94 m velocity field of one of the ensemble members. Figure 9b is the ensemble mean velocity field. Clearly, the realization differs markedly from the ensemble mean, and so possesses a strong eddy field. On the other hand, it is clear the structure of the realization is dominated by the coherent Gulf Stream jet. The lower two rows (Figures 9c and 9d) show the effect of adding the first and second EOFs multiplied by the appropriate projection coefficient, respectively, to the ensemble mean. The comparison demonstrates that this partial reconstruction brings the fields much closer to the realization. Of course, this should be the result of adding the EOFs to the mean, but the comparison serves to illustrate that the regional variability is dominated by the vacillations of a highly coherent feature, and that the underlying regularity of the velocity field is captured by the EOFs. This statistical relationship between the velocity components is lost in the classical EOF energy spectrum.

Discussion and Summary
We illustrate the use of ensembles to compute ocean kinetic energy spectra, from the perspective of emphasizing how they readily permit the examination of spectra in the non-stationary, inhomogeneous and anisotropic settings of most oceanically interesting settings. This is primarily because the development of the ensemble permits the straightforward application of ensemble averaging to compute the mean fields, the "eddies" and the averages needed to obtain robust results. Amongst their advantages are an ability to define "parameter" free mean and eddy fields. It is also not necessary to invoke the assumptions normally used to generate spectra. We have also discussed variants on the normal spectral theme. In additional to the classical Fourier based decomposition which has seen wide-spread usage in oceanography, we have adopted a technique used in the turbulence literature for the study of inhomogeneous settings. In order to capture the inhomogeneous nature of oceanic flows, Sadek and Aluie (2018) recently developed a method where they spatially decompose the kinetic energy using a coarse-graining approach. Here, we have instead effectively migrated EOF analysis to energy by examining the two-point correlation matrix of the velocity field. The modal eigenvalues from the analysis play the role of the spectral amplitudes, but now describe the contribution of the EOF to the total energy, rather than that of a given complex exponential. The assignment of length scales to the energy is made via the gradients of the eigenmodes, a procedure that is itself a generalization of the Fourier technique. We also do not taper or window the data prior to analyzing it.
We apply the methods to the eddies to a section of the separated Gulf Stream, a strongly inhomogeneous and eddy rich area of the world ocean. The classical spectra argue strongly that the region is anisotropic, with energy predominantly found in the low velocity modes (Figure 8). The slopes of the spectra, when viewed from various one-dimensional perspectives, exhibit a number of novel characters. There is a suggestion at the lowest resolved wavenumbers of the −5/3 slope expected from inverse energy cascade arguments (Figure 3). Moving toward higher wavenumbers, however, the slopes fall off more steeply than the steepest (−3) slope expected from QG theory. This result differs from other analyses employing standard methods. We suggest these differences are at least partly due to the unique separation between mean and eddies that an ensemble permits.
We examine surface and mid-thermocline depths, finding the spectral structures strongly resemble each other. This appears in the similarity of the spatial EOFs as well as the distributions of the classical spectra. The energy levels are, of course, different, with the deeper level less energetic. The deeper level is, according to the EOF spectra, slightly less anisotropic than the surface, a result consistent with the weaker mean flow structure at depth.
The big distinction between the spectral views is in the apparent change between the highly anisotropic conditions suggested by the classical spectra and the much more isotropic looking spectral structure associated with the EOFs. Both techniques yield valid decompositions of the regional energy and each possesses its own strengths. Fourier analysis unambiguously assigns lengths scales to the spectral amplitudes and therefore can readily be used to investigate spectral slopes for comparison with theory. On the other hand, complex exponentials experience some degree of contamination due to the non-periodicity inherent in geophysical data. Length scales can also be assigned to EOFs via the procedure outlined in this paper, but the results tend to be less regular due to the complex spatial EOF structure. Testing for spectral slopes becomes much more subtle. The new information available to the EOFs, however, is found in the velocity cross-correlations (as opposed to in classical kinetic energy spectrum where such information is lost), which results in spatial structures reflecting the preferred modes of the data. These structures are well suited to the finite domain size and inhomogeneity of the data. In effect, the procedure illustrates how the energy is assigned across statistically coherent structures. Although this seems a valuable extension of classical Fourier analysis, in view of the broad usage of Fourier spectra in oceanography, we suggest the three views including the coarse-graining approach (Sadek & Aluie, 2018) provide complementary energy views.
Given these ensemble techniques, and with nesting technology their capacity for resolving the full time and space evolution of spectra, there are a dizzying number of avenues to pursue. We are particularly interested in comparing and contrasting the eddy fields of the ocean interior with those from the separated Gulf Stream. The latter is, as emphasized above, strongly inhomogeneous while the former region is more likely to meet conditions of isotropy. The regions are also starkly different in terms of their eddy energies. In view of these contrasts, we anticipate spectral distinctions will arise, with the interior likely to exhibit spectral structures like those anticipated from theory. We are also interested in exploring the departures of the Gulf Stream spectra from classical results. The presence of the mean flow and the departure of the region from quasi-geostrophic character are possible explanations, but which of these, if either, is dominant is not clear.

Appendix A: Windowing Considerations
It is standard practice when working with data of a finite extent, as is always the case, to window or taper the data, to avoid contaminating the spectra with edge effects. Literally, this implies multiplying the original data by a function that tapers the boundary values to zero so that the underlying data structure is consistent with the spatial periodicity of the Fourier functions, as used in Arbic et al. (2014), Uchida et al. (2017), and Ajayi et al. (2021). We note that this is not necessary for our empirical orthogonal function (EOF) spectral analysis: the two point correlation function can be computed for all points within the domain, and the eigenfunctions extracted, without further manipulating the data. We also note that in classical spectra, tapering distorts the underlying data thus effecting the outcomes of the computation. We do not taper, or window, the data in the classical way. The rational for that decision is discussed here.
Tapering is usually invoked to minimize edge effects occurring, for example, when an open ocean section of the Gulf Stream section is examined. Tapering the data to zeros at the domain edge then seamlessly permits the Fourier transforming of the data as if it were a purely periodic signal. In addition, it helps to control edge effects as an effect on averaging, a problem that we do not encounter as our averaging is a true ensemble averaging.
A well-known result of Fourier theory is that the transform of the tapered data is a convolution, that is, if the windowed data is, the transformed data iŝ( where ̂ is the Fourier transform of the window. The windowed transform, , is a weighted average of the underlying "true" transform. The artistry of window construction revolves around designing w such that its transform is reasonably narrow band, and dies off quickly with distance from the origin. It is impossible to avoid side lobes in ̂ , but a well-designed window hopefully minimizes the distortion of the underlying spectrum.
While accepting this premise, we point out two views, one mathematical and one conceptual, that suggest for many applications, windowing can usefully be avoided. First, one reason to taper is to avoid contamination from edge effects, a contamination that often falls under the title of Gibbs phenomena. This is a well-known problem with Fourier representations of discontinuous data that manifests in the unavoidable appearance of noise in the vicinity of the discontinuity. We do not dispute the reality of Gibbs phenomenon, but also point out that a discrete Fourier transform is entirely invertible. This implies that the discrete Fourier transform of a data stream of length N consists of a sequence of N/2 complex values whose inverse discrete Fourier transform returns the original data set to machine accuracy. If,̂= and no information is lost in the forward/backward transformation. An example appears in Figure A1, where the original data, appearing in the left panel, consists of a straight line, and thus possesses a large discontinuity at the edge. This input is forward and backward discrete Fourier transformed. The result appears in the lower left panel, and the difference of the two appears on the lower right. The difference is machine precision zero, which simply reflects the above argument.
It is also sometimes argued that the presence of the discontinuity will force the presence of high wavenumber content into the transform that is clearly not present. The top right panel shows the log plot of the transform of the original data, where it is seen that spectral variance is present throughout the wavenumber band, but that after the first few wavenumbers, the spectral amplitudes drop by four orders of magnitude. While this is the introduction of high wavenumber variance in what clearly is a very smooth function, the contamination by high wavenumbers is small.
A conceptual issue we raise with windowing is in its resultant filtered connection to the so-called underlying "true" spectrum. Indeed, the aim of window design is to give as pure a view of that spectrum as is possible with finite data. The issue as we see it here is that we are focusing, in a broad sense, on the ocean. The true underlying velocity transform in that case, brushing aside questions of domain irregularities and the like, would be the transform of the global ocean velocity field. This of course mixes the momentum of Pacific waters with those of the Indian and Southern oceans all together, which then presents one with the problem of interpreting what the results mean; a local signal in space is global in the wavenumber domain and visa versa.
If we move away from this rather abstract example to our present case of the North Atlantic, provided that we could somehow adequately isolate the North Atlantic and extract a Fourier transform of its velocity field, we would then be looking at the results of simultaneously considering all the differing regimes of the North Atlantic in a single transform. But the North Atlantic, by itself, still houses dramatically distinct dynamic regimes, from the chaotic and eddy rich Gulf Stream extension to the much more quiescent interior westward drift to the Loop Current dominated Gulf of Mexico (Jamet et al., 2021). In this case, the "true" underlying transform mixes all these various regions together to provide a single value for an amplitude at a given wavenumber.
If we consider the fundamental question driving our investigation in this paper, it is to study the spectral representation of the eddy field in the Gulf Stream extension, a region notable for its extraordinary behavior when compared to any other sector of the North Atlantic, and indeed to essentially all of the worlds oceans. One can rightly ask of what value is it when pursuing this question to extract from the regional data a look at the underlying "true" spectrum. In a sense this question is avoided if the transform of the raw, unwindowed data is used. In view of the invertibility of the Fourier transform, all of the wavenumber content of the regional data is perfectly contained in the raw transform, and full information about how that energy is structured in wavenumber space is available. In addition, it provides a product that is comparable and compatible to the EOF based decomposition which, by design, involves no windowing.
It is evident in Figures 3 and 4 that the energy lying along the k and l axes is one of the major anisotropic features of those spectra, even if their amplitudes are orders of magnitude smaller than the dominant spectral amplitudes at lower wavenumbers. In view of the contribution of the edge discontinuities to high wavenumbers inherent in Figure A1, it is fair to ask if the perception of anisotropy survives if the effects of the discontinuities are removed. We have therefore computed the regional spectra of so-called "detrended" data, where the detrending is carried out as described below.

A1. Detrending
First, consider the finite Fourier transform of a line, can be written̂= leading eventually tô= Given a non-periodic field f(x) inside a finite, square domain of zonal length L x and meridional length L y , we define a second field periodic in x, f 1 , according to: It is easily shown that, Next, we define a field f 2 (x) according to It is easily shown that, Eliminating f 1 from Equation A11 using Equation A9 yields eventually from which it can be shown that and 2( 0) = 2( ) (A15) so f 2 is a periodic function both zonally and meridionally. It is the spectra of the resulting f 2 field that we present in Figures 3 and 4. If one is interested in the total energy, one only needs to add the spectra of the other terms on the right-hand side of Equation A13.
where R ij = 〈u i (x)u j (x′)〉. The volume element on a sphere is where a is the Earth radius and θ, φ are latitude and longitude, respectively. The variable volume element is an issue with simply forming the covariance matrix and extracting the EOFs, as the various elements in Equation C1 are weighted differently according to the latitude. What follows is a proposed fix for this issue.
We will use the notation with the same for x′.
Breaking up the cos θ′ into the product of square roots and multiplying Equation C1 by The form of Equation C5 is the same as that of Equation C1 except that the weightings given all elements are the same. We can now form the new covariance matrix ρ ij a 2 dθdφ and extract its eigenvalues. We note that such consideration for varying latitude is often discarded in Fourier spectral analysis which assumes a local Cartesian plane.

Appendix D: Kinematic Tests of Empirical Length Scale Assignment for EOF Modes
In order to assess the our approach to assigning length scales to individual EOF modes, we apply the technique to kinematic fields with known properties. First, we consider ensembles of random super-positions of plane waves with a prescribed spectrum. Given a random selection of 20 wave number pairs (k x , k y ) with k x , k y ∈ [0, 10], we construct ensemble members from where the amplitude is given by ( ) = ( 0 ) 5∕6 with 2 = 2 + 2 and random phases drawn for each member.
Comparison of the given Fourier spectra at the prescribed wavenumbers and the spectra derived from EOF analysis is given in the first panel of Figure D1 for a 400 member ensemble of such fields. In this admittedly simple case, the ability of the EOF procedure to statistically identify the imposed length-scales is quite evident.
As a further test of the approach, we also consider fields that are decidedly not localized in Fourier space. Here we take ensembles of randomly located Gaussians where (ν p , η p ) are drawn independently for each member from uniform distributions. As usual, we consider fluctuations from the ensemble mean, u p (x, y) = v p (x, y) − 〈v p (x, y)〉.
Comparison between the observed Fourier spectrum (computed on 256 2 nodes) and the observed EOF spectrum for a 500 member ensemble of such fields is shown in the right panel of Figure D1. Here γ = 5 and, as expected, the Fourier spectrum decays considerably slower than the EOF spectrum. The EOF spectrum is nearly monotonically decreasing with the empirical length-scale. Both spectra agree on the wave-length of the most energetic mode.

Appendix E: Confidence Interval for Spectrum
In formulating the spectrum (E), we are averaging squared quantities where the quantity being squared has zero mean and are independent from one another. The zero mean is guaranteed as we have subtracted out the ensemble mean and non-linearity of the system ensures decorrelation amongst ensemble members. The distribution of the squared quantities, therefore, follows approximately a χ 2 distribution (Menke & Menke, 2016;Uchida et al., 2017). Under such distribution, the probability of our spectral estimate (E est ) falling close to the unknown "true" spectrum (E true ) is In other words, we can reject the null hypothesis that the true spectrum doesn't lie within the range of 2 ∕2 with a 1 − α significance level.

Data Availability Statement
The Fourier spectra were calculated using the xrft Python package . The MATLAB scripts and Jupyter Notebook used for analysis are available on GitHub with https://doi.org/10.5281/zenodo.5563893.