Random cascades on wavelet dyadic trees

We introduce a new class of random fractal functions using the orthogonal wavelet transform. These functions are built recursively in the space-scale half-plane of the orthogonal wavelet transform, “cascading” from an arbitrary given large scale towards small scales. To each random fractal function corresponds a random cascading process (referred to as a W-cascade) on the dyadic tree of its orthogonal wavelet coefficients. We discuss the convergence of these cascades and the regularity of the so-obtained random functions by studying the support of their singularity spectra. Then, we show that very different statistical quantities such as correlation functions on the wavelet coefficients or the wavelet-based multifractal formalism partition functions can be used to characterize very precisely the underlying cascading process. We illustrate all our results on various numerical examples.


I. INTRODUCTION
Fractal and multifractal concepts 1-3 are now widely used to characterize multiscale phenomena that occur in various situations in physics, chemistry, geology and biology. 4-13 The multifractal formalism was originally established to account for the statistical scaling properties of singular measures. 2,3,14-21 This formalism rests upon the determination of the so-called f(α) singularity spectrum, 2 which quantifies the relative contribution of each singularity of the measure: let S_α be the subset of points x where the measure μ of an ε-box B_x(ε), centered at x, scales like μ(B_x(ε)) ∼ ε^α in the limit ε → 0+; then, by definition, f(α) = dim_H(S_α) is the Hausdorff dimension of S_α. Actually, there exists a deep analogy that links the multifractal formalism with statistical thermodynamics. 22-24 This analogy provides a natural connection between the f(α) spectrum and a directly observable spectrum τ(q), defined from the power-law behavior, in the limit ε → 0+, of the partition function 2,25

Z_q(ε) = Σ_i μ(B_i(ε))^q ∼ ε^{τ(q)},

where the sum is taken over a partition of the support of the singular measure into boxes of size ε. The variables q and τ(q) play the same role as the inverse temperature and the free energy in thermodynamics, while the Legendre transform f(α) = min_q (qα − τ(q)) indicates that, instead of energy and entropy, we have α and f(α) as the thermodynamical variables conjugate to q and τ(q), respectively. 2,16-18,26 Let us recall that this thermodynamic multifractal formalism has been worked out in mathematics in the context of dynamical system theory.
22-24 However, a rigorous proof of the above connection has been given only for some restricted classes of singular measures, e.g., invariant measures of some expanding Markov maps ("cookie-cutter" Cantor sets) on an interval or a circle, 17,21 or the invariant measures associated with the dynamical systems for period-doubling and for critical circle mappings with golden rotation number. 17 The formalism has nevertheless developed into a powerful technique accessible also to experimentalists, and successful applications have been reported for multifractal measures that appear beyond the scope of dynamical systems. 6,10 Although valid for deterministic multifractals only, this description has been mainly applied to characterizing stochastic systems. But there is no reason, a priori, that all realizations of the same stochastic multifractal measure correspond to a unique f(α) curve. Each realization has its own distribution of singularities, and one crucial issue is to relate these distributions to some averaged version computed experimentally. As emphasized in Ref. 27, one can take further advantage of the analogy with the thermodynamic formalism by using methods created specifically to study disorder in spin-glass theory. 28 When carrying out replica averages of the random partition function associated with a stochastic measure, one gets multifractal spectra τ(q,n) that generally depend on the number n of members in the replica average (let us note that n = 0 and n = 1 correspond, respectively, to the commonly used quenched and annealed averaging). 27 Then, by Legendre transforming τ(q,n), some types of averaged f(α) spectra are found. 27 Some care is thus required when interpreting these average spectra in order to avoid misunderstanding the underlying physics.
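To make the Legendre construction concrete, here is a minimal numerical sketch (not taken from this paper) that recovers f(α) = min_q (qα − τ(q)) from a known τ(q). As a stand-in singular measure we use the standard two-scale binomial measure with weights p and 1 − p, for which τ(q) = −log₂(p^q + (1−p)^q) is exact; the function name is ours.

```python
import numpy as np

def legendre_spectrum(q, tau, alphas):
    """Numerical Legendre transform f(alpha) = min_q (q*alpha - tau(q))."""
    return np.array([np.min(q * a - tau) for a in alphas])

# Illustrative tau(q) for the two-scale binomial measure with weights
# p and 1 - p (a standard toy multifractal, not from this paper):
# Z_q(2^-n) = (p^q + (1-p)^q)^n, hence tau(q) = -log2(p^q + (1-p)^q).
p = 0.3
q = np.linspace(-20.0, 20.0, 4001)
tau = -np.log2(p**q + (1 - p)**q)

alphas = np.linspace(0.2, 2.2, 401)
f = legendre_spectrum(q, tau, alphas)

# The maximum of f(alpha) is -tau(0) = 1, the dimension of the support,
# and tau(1) = 0 expresses the normalization of the measure.
print(f.max())
```

The same routine applies verbatim to any sampled τ(q) curve, e.g., one estimated from a partition function.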
Multiplicative cascade models have enjoyed increasing interest in recent years as the paradigm of multifractal objects. 1-3,27,29 The notion of cascade actually refers to a self-similar process whose properties are defined multiplicatively from coarse to fine scales. In that respect, it occupies a central place in the statistical theory of turbulence. 13,29 Since Richardson's famous poem, 31 the turbulent cascade picture has often been invoked to account for the intermittency phenomenon observed in fully developed turbulent flows: 29,30 energy is transferred from large eddies down to small scales (where it is dissipated) through a cascade process in which the transfer rate at a given scale is not spatially homogeneous, as supposed in the theory developed by Kolmogorov in 1941, 32 but undergoes local intermittent fluctuations. 13 Over the past 30 years, refined models, including the log-normal model of Kolmogorov 33 and Obukhov, 34 multiplicative hierarchical cascade models like the random β-model, the α-model, the p-model (for a review see Ref. 29), the log-stable models 35-37 and, more recently, the log-infinitely divisible cascade models 38-41 with the rather popular log-Poisson model advocated by She and Lévêque, 42 have grown in the literature as reasonable models to mimic the energy cascading process in turbulent flows. On very general grounds, a self-similar cascade is defined by the way the scales are refined and by the statistics of the multiplicative factors at each step of the process. 27,29,37 One can thus distinguish discrete cascades, which involve discrete scale ratios leading to log-periodic corrections to scaling (discrete scale invariance 43), from continuous cascades without preferred scale factors (continuous scale invariance).
As far as the fragmentation process is concerned, one can specify whether some conservation laws are operating or not; 27 in particular, one can discriminate between conservative (the measure is conserved at each cascade step) and nonconservative (only some fraction of the measure is transferred at each step) cascades. More fundamentally, there are two main classes of self-similar cascade processes: deterministic cascades, which generally correspond to solvable models, and random cascades, which are likely to provide more realistic models but for which some theoretical care is required as far as their multifractal limit and some basic multifractal properties (including multifractal phase transitions) are concerned. 27 As a notable member of the latter class, the independent random cascades introduced by Mandelbrot (commonly called M-cascades) 30,44 as a general model of random curdling in fully developed turbulence have a special status, since they are the main cascade model for which deep mathematical results have been obtained. 45,46 However, in physics as well as in other applied sciences, fractals appear not only as singular measures, but also as singular functions. 1,4-13 To stay in the context of fully developed turbulence, the directly observable quantities are the velocity field or the temperature field rather than the dissipation field. 13,47 A classical way of analyzing the intermittent character of turbulent velocity signals consists in computing the moments S_p(l) = ⟨δv_l^p⟩ ∼ l^{ζ_p} of the probability density function of longitudinal velocity increments δv_l(x) = v(x+l) − v(x) over inertial separations l. 13,47,48 As originally proposed by Frisch and Parisi, 49 by Legendre transforming the scaling exponents ζ_p of the structure functions S_p, one expects to get the Hausdorff dimension D(h) = min_p (ph − ζ_p + 1) of the subset of R on which the velocity increments behave as δv_l ∼ l^h.
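As a simple illustration of the structure-function method, the sketch below estimates ζ_2 on ordinary Brownian motion, a monofractal stand-in for a turbulent velocity signal for which the exponents ζ_p = p/2 are known exactly (the signal and fit range are our choices, not from this paper).

```python
import numpy as np

# Structure functions S_p(l) = <|dv_l|^p> estimated on ordinary Brownian
# motion, for which zeta_p = p/2 exactly (illustrative stand-in for a
# turbulent velocity signal).
rng = np.random.default_rng(0)
v = np.cumsum(rng.standard_normal(2**18))

ells = 2**np.arange(1, 9)                      # separations l = 2 .. 256
S2 = [np.mean(np.abs(v[l:] - v[:-l])**2) for l in ells]

# Slope of log S_2 versus log l estimates zeta_2 = 1.
zeta2, _ = np.polyfit(np.log(ells), np.log(S2), 1)
print(zeta2)
```

For a genuinely multifractal signal, ζ_p would be a nonlinear function of p, which is precisely what the cascade constructions below produce.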
In a more general context, D(h) will be defined as the spectrum of Hölder exponents of the signal under study and will thus have a status similar to that of the f(α) singularity spectrum for singular measures. Unfortunately, as pointed out by Muzy et al., 50 there are some fundamental limitations to the structure function approach, which intrinsically fails to fully characterize the D(h) singularity spectrum. In previous work, 51-54 we have shown that there exists a natural way of performing a multifractal analysis of fractal functions, which consists of using the continuous wavelet transform. 55-57 By using wavelets instead of boxes, as in the classical multifractal formalism, 2 one can take advantage of the freedom in the choice of these "generalized oscillating boxes" to get rid of possible smooth behavior that could mask singularities or perturb the estimation of their strength h. 52,53 The other fundamental advantage of using wavelets is that the skeleton defined by the wavelet transform modulus maxima (WTMM) provides an adaptive space-scale partitioning from which one can extract the D(h) singularity spectrum via the scaling exponents τ(q) of some partition functions defined on the skeleton. The so-called WTMM method 50-54 therefore gives access to the entire D(h) spectrum via the usual Legendre transform D(h) = min_q (qh − τ(q)). We refer the reader to Refs. 52 and 58 for rigorous mathematical results. Let us mention that, for the same reasons previously raised for stochastic multifractal measures, the theoretical treatment of random multifractal functions requires special attention. Let us also note that in more recent work, 59-61 we have further generalized the WTMM multifractal formalism in order to incorporate into this statistical description (which applies to cusp-like singularities only) the possible existence of oscillating singularities.
This new "grand canonical" description allows us to get the singularity spectrum D(h,β), which accounts for the statistical contribution of singularities of Hölder exponent h and oscillation exponent β (where β characterizes the local power-law divergence of the instantaneous frequency).
Beyond the multifractal description, there is, however, the practical issue of defining in any concrete way how to build a multifractal function. Schertzer and Lovejoy 35 suggested a simple power-law filtering (fractional integration) of singular cascade measures as a means to stochastically simulate fields reminiscent of passive scalars in turbulence. In the same spirit, the bounded cascade model of Marshak et al. 62 consists in acting on the multiplicative weights during the cascade in physical space. In Ref. 63, the midpoint displacement technique for building fractional Brownian motions was generalized to generate deterministic or random multiaffine functions. The same goal was achieved in Refs. 52 and 53 by combining fractional or ordinary integration with signed measures obtained by recursive cascade-like procedures. Several other attempts to simulate "synthetic turbulence" that shares the intermittency properties of turbulent velocity data have partially succeeded. 64-66 More recently, the concept of self-similar cascades leading to multifractal measures has been generalized to the construction of scale-invariant signals using orthonormal wavelet bases. 67-70 Instead of redistributing a measure over sub-intervals with multiplicative weights, one allocates the wavelet coefficients in a multiplicative way on the dyadic grid. This method allows us to generate multifractal functions from a given deterministic or probabilistic multiplicative process. The main goal of this paper is to provide some mathematical framework for random W-cascades on wavelet dyadic trees. 67-70 The paper is organized as follows. In Secs. II and III, we explain how W-cascades are built using an orthogonal wavelet basis and we characterize the regularity properties of the corresponding random fractal functions by studying the support of their singularity spectrum.
This support is linked to the statistical spectrum obtained with the wavelet-based multifractal formalism. 50-54 The self-similarity kernel, 41,68-72 which, from a statistical point of view, characterizes the self-similarity properties of a cascade process (in a different way from the multifractal formalisms), is introduced in Sec. IV. In Sec. V, we compute explicitly the correlation function of two wavelet coefficients of a W-cascade. 73 It is proved to follow a power-law behavior when varying the spatial distance between the two coefficients. The statistical spectrum, the self-similarity kernel as well as the correlation function are shown to be numerically well estimated directly on the fractal function, using its wavelet decomposition (continuous, orthogonal or its associated extrema representation) with an arbitrary analyzing wavelet. All these results are illustrated on various computer-generated numerical signals.

A. The periodic wavelet orthogonal decomposition
As mentioned in the Introduction, a W-cascade 67-70 is built recursively on the dyadic grid of the orthogonal wavelet transform, 55-57 involving only scales that range between a given large scale L and the scale 0 (excluded). Thus the corresponding fractal function f(x) does not involve scales greater than L. We can therefore consider, for the sake of simplicity, that f(x) is a periodic function of period L. In the following we will choose L = 1. The W-cascade will then be defined using a periodic orthonormal wavelet basis 74 of L²_per([0,1]), i.e., the space of 1-periodic functions with finite energy.
Such a basis can be constructed from two functions φ(x) and ψ(x) of L²_per([0,1]) (ψ is referred to as the analyzing wavelet) by means of translations and dilations of ψ:

ψ_{j,k}(x) = 2^{j/2} ψ(2^j x − k),  j ≥ 0,  0 ≤ k ≤ 2^j − 1.  (1)

One can prove 55-57,74 that the so-obtained family of functions {φ(x), {ψ_{j,k}}_{j,k}} is an orthonormal basis of L²_per([0,1]) if φ and ψ satisfy some conditions. Among these conditions, ψ(x) should be localized around 0 and have N_ψ (≥ 1) vanishing moments:

∫ x^m ψ(x) dx = 0,  0 ≤ m < N_ψ.  (2)

The wavelet coefficients {c_φ, {c_{j,k}}_{j,k}} of a function f(x) are then defined (modulo a normalization factor) as the coefficients of f in the orthonormal wavelet basis

c_φ = ∫₀¹ f(x) φ(x) dx,  c_{j,k} = 2^{j/2} ∫₀¹ f(x) ψ_{j,k}(x) dx.  (3)
Remark: Let us note that the usual definition of the wavelet coefficients does not involve any normalization factor. 55-57 However, as we will see in Sec. III, the normalization factor 2^{j/2} has been introduced so that the Lipschitz exponent can be directly deduced from the power-law behavior of the coefficients {c_{j,k}}_{j,k}.
Since {φ(x), {ψ_{j,k}}_{j,k}} is an orthonormal basis, one gets the reconstruction formula

f(x) = c_φ φ(x) + Σ_{j≥0} Σ_{k=0}^{2^j−1} 2^{−j/2} c_{j,k} ψ_{j,k}(x).  (4)

On the one hand, let us note that, since all the ψ_{j,k} have at least one vanishing moment, c_φ essentially "captures" the mean value of f. This explains why it is often referred to as the approximation coefficient. 55-57 On the other hand, assuming that the scale 1 "corresponds" to ψ(x), one can easily prove that ψ_{j,k}(x) is localized around x = x_{j,k} and corresponds to the scale a_j, with

x_{j,k} = 2^{−j} k and a_j = 2^{−j}.  (5)

Therefore, c_{j,k} essentially captures the details of f(x) around the point x_{j,k} and at the scale a_j. These coefficients will be referred to as the detail coefficients. 55-57 As displayed in Fig. 1, they lie on a dyadic grid in the space-scale half-plane.
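The dyadic organization of the detail coefficients is easy to visualize with the Haar wavelet, the simplest periodic orthonormal basis. The sketch below is illustrative only (it uses the standard L² normalization, without the extra 2^{j/2} factor of Eq. (3), and the function names are ours); it computes a full periodic Haar decomposition of a 1-periodic signal and checks perfect reconstruction.

```python
import numpy as np

def haar_analysis(f):
    """Full periodic Haar decomposition of a signal of length 2^n.
    Returns (c_phi, details), where details[j] holds the 2^j detail
    coefficients at scale a_j = 2^-j (standard L2 normalization)."""
    n = int(np.log2(len(f)))
    approx = f.astype(float) / 2**(n / 2)    # finest-scale approximations
    details = []
    for _ in range(n):
        even, odd = approx[0::2], approx[1::2]
        details.append((even - odd) / np.sqrt(2))
        approx = (even + odd) / np.sqrt(2)
    details.reverse()                        # details[j]: level j, 2^j coeffs
    return approx[0], details

def haar_synthesis(c_phi, details):
    """Invert haar_analysis (perfect reconstruction)."""
    approx = np.array([c_phi])
    for d in details:
        even = (approx + d) / np.sqrt(2)
        odd = (approx - d) / np.sqrt(2)
        approx = np.ravel(np.column_stack((even, odd)))
    n = int(np.log2(len(approx)))
    return approx * 2**(n / 2)

x = np.linspace(0.0, 1.0, 1024, endpoint=False)
f = np.sin(2 * np.pi * x) + 0.3 * np.cos(6 * np.pi * x)
c_phi, details = haar_analysis(f)
print([len(d) for d in details][:4])         # dyadic grid: 1, 2, 4, 8, ...
g = haar_synthesis(c_phi, details)
print(np.max(np.abs(f - g)))                 # ~ 0 (perfect reconstruction)
```

The list `details` is exactly the dyadic grid of Fig. 1: level j carries 2^j coefficients located at the abscissae x_{j,k} = 2^{−j}k.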

B. Building a W-cascade
In this section, we build a random function f(x) by specifying its wavelet coefficients {c_{j,k}}_{j,k} and c_φ. The coefficient c_φ is chosen to be an arbitrary random variable and the {c_{j,k}}_{j,k} are defined recursively in the following way: 68-70

c_{0,0} = 1,
c_{j+1,2k+ε} = W_{j,k}^{(ε)} c_{j,k},  ε ∈ {0,1},  j ≥ 0,  0 ≤ k ≤ 2^j − 1,  (6)

where the W_{j,k}^{(ε)} are i.i.d. real-valued random variables.
Notation 1: Since all the random variables W_{j,k}^{(ε)} are i.i.d., we will often omit the indices j, k and (ε) and use W as the generic name for these variables.
As illustrated in Fig. 1, this recursive rule can be seen as a cascade process going from large scales (starting at scale 1) towards smaller scales. It lies on a binary tree whose nodes are the wavelet coefficients and whose branches basically correspond (apart from the sign of the coefficients) to the same action of multiplying by W.
In the following, such a recursive rule will be referred to as a W-cascade and f(x) as the function corresponding to the W-cascade. Let us note that both a W-cascade and its corresponding function are fully defined by the analyzing wavelet ψ and the laws of c_φ and W.
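A minimal numerical sketch of the recursion (6) follows. The helper names are ours; for concreteness we take |W| log-normal with the parameters used later for Fig. 2(a), and W positive for simplicity (the general construction allows signed multipliers).

```python
import numpy as np

def w_cascade_tree(depth, sample_w, rng):
    """Build the coefficient tree of Eq. (6): c_{0,0} = 1 and
    c_{j+1,2k+eps} = W * c_{j,k}, with i.i.d. multipliers W.
    Returns a list: tree[j] is the array of the 2^j coefficients c_{j,k}."""
    tree = [np.array([1.0])]
    for j in range(depth):
        parents = np.repeat(tree[j], 2)          # each node has two sons
        tree.append(parents * sample_w(2**(j + 1), rng))
    return tree

def lognormal_w(size, rng, mu=-0.33 * np.log(2), sigma2=0.02 * np.log(2)):
    # ln|W| Gaussian; parameters of the "irregular" cascade of Fig. 2(a)
    return np.exp(rng.normal(mu, np.sqrt(sigma2), size))

rng = np.random.default_rng(1)
tree = w_cascade_tree(10, lognormal_w, rng)
print([len(level) for level in tree][:4])        # 1, 2, 4, 8, ...

# Along any branch, log2|c_{j,k}| is a sum of j i.i.d. copies of log2|W|,
# so the level-j average of log2|c_{j,k}| is close to j*E(log2|W|) = -0.33 j.
print(np.mean(np.log2(np.abs(tree[10]))) / 10)
```

Feeding these coefficients into an orthonormal synthesis such as Eq. (4) then yields one realization of the random function f(x).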
Let us note that the so-obtained function f(x) [assuming that the infinite sum in Eq. (4) converges] is self-similar in the sense that the law of a wavelet coefficient |c_{j1,k}| at the scale 2^{−j1} can be linked to the law of another wavelet coefficient |c_{j2,k'}| at a scale 2^{−j2} > 2^{−j1} using a multiplicative random variable that depends only on the ratio of the two scales:

|c_{j1,k}| =_law X_{j1−j2} |c_{j2,k'}|,

where =_law stands for equality in law and X_n = |W_1 ⋯ W_n| (the W_i's being i.i.d. real-valued random variables with the same law as W). Thus, from a statistical point of view, the details of the function f at a scale a_1 are the same as the details at a scale a_2, up to a rescaling factor that depends only on a_1/a_2.
In this section, we have only given a "theoretical" description of a W-cascade. Indeed, we have not proved that the sum in Eq. (4) converges in some sense towards a random function f(x). This is the purpose of the next section. Actually, we will not only prove that, for almost all realizations of the W-cascade, the sum in Eq. (4) converges in L²_per([0,1]), but we will also be able to characterize some regularity properties of the limit function.
Remark: A W-cascade can be related to the M-cascades previously introduced in Refs. 44-46. An M-cascade is defined using the same recursive rule as a W-cascade [Eq. (6)], but the c_{j,k}'s no longer correspond to wavelet detail coefficients: at step j of the recursion, a measure μ_j on [0,1] is defined by μ_j([x_{j,k}, x_{j,k} + a_j]) = c_{j,k} (∀k, 0 ≤ k < 2^j), where x_{j,k} and a_j are defined as in Eq. (5). In Ref. 45, the authors proved that, under certain conditions (on W), μ_j converges towards a nondegenerate measure (when j → +∞). Thus the main difference between M-cascades and W-cascades is that M-cascades are fractal measure models whereas W-cascades are fractal function models. M-cascades can be used, for instance, for modeling the energy dissipation in a turbulent flow, 29,30,33-42 whereas W-cascades can be used directly for modeling the velocity signal of the same flow. 67-70 Moreover, as we will see in the next section, the underlying wavelet structure of a W-cascade makes the proofs for the convergence of the cascade and for the characterization of the corresponding fractal function much easier than for M-cascades.

FIG. 1. Sketch of the construction rule of a W-cascade. The wavelet coefficients {c_{j,k}}_{j,k} lie on a dyadic grid. At each scale a_j = 2^{−j}, the grid displays 2^j coefficients with abscissa x_{j,k} = 2^{−j}k. The value of the wavelet coefficient c_{j,2k} (resp. c_{j,2k+1}) is obtained from the value of the wavelet coefficient c_{j−1,k} by multiplying it by W_{j−1,k}^{(0)} (resp. W_{j−1,k}^{(1)}).
Remark: Let us note that Eq. (6) can be rewritten as

ln|c_{j+1,2k+ε}| = ln|c_{j,k}| + ln|W_{j,k}^{(ε)}|.

If |W| is log-normal, these equations correspond to what one could call a tree-autoregressive process. This process is of order 1 in the sense that the regression involves only one term. Actually, we are currently working on higher-order models. The notion of autoregressive models lying on a tree (including the orthonormal wavelet dyadic tree) has been introduced by Basseville and collaborators. 75 Let us emphasize that our approach is significantly different from theirs: the processes they study are autoregressive directly on the c_{j,k} and not on their logarithms; the processes we consider do not really correspond to autoregressive processes, in the sense that they are not asymptotically stationary (i.e., {ln c_{j,k}}_j for a fixed k is not stationary, even when j → +∞); and, above all, we concentrate on the analysis of the fractal function f(x) itself and not on the properties of the tree-process.

A. Convergence
In order to get the convergence of the sum in Eq. (4) for a given realization of the W-cascade, we need upper bounds on the wavelet detail coefficients {c_{j,k}}_{j,k}. Actually, we are going to study the law of the maximum of the wavelet coefficients {c_{j,k}}_{0≤k<2^j} at a given scale 2^{−j}.
This is the purpose of the following lemma which is proved in Appendix A.
We are now ready to state the convergence proposition. For this purpose, we have to make two hypotheses on the law of W. Basically, these hypotheses ensure that small values of W are much more probable than large values. This "asymmetry" of the distribution of W ensures that, for almost all realizations, the wavelet coefficients converge fast to 0 when the scale goes to 0, and thus that the sum in Eq. (4) converges.
Proposition 1: (Convergence). Let us consider a given W-cascade associated with the random variable W [Eq. (6)]. If the law of W is such that E(log₂|W|) < 0 and F(α) < 0 for some α ∈ (0, −E(log₂|W|)) [F being the spectrum defined in Eq. (7)], then, for almost all realizations of the W-cascade, the sum in Eq. (4) converges in L²_per([0,1]).

Proof: Let α ∈ (0, −E(log₂|W|)) be such that F(α) < 0, and let Q_j^α denote the event {m_j ≥ 2^{−jα}}, where m_j = sup_k |c_{j,k}|. By combining Lemma 1 and Lemma 2, one gets (for ε arbitrarily small and j large enough)

Prob{Q_j^α} ≤ 2^{j(F(α)+ε)}.

Since F(α) + ε < 0 for ε small enough, this last inequality can be rewritten as Σ_j Prob{Q_j^α} < ∞. By using the Borel-Cantelli lemma, one thus gets that, almost surely, only finitely many of the events Q_j^α occur, which is equivalent to saying that, for almost all realizations of the W-cascade, there exists J such that m_j ≤ 2^{−jα} for j ≥ J. This implies that the L² norm of the level-j term of Eq. (4), namely (Σ_k 2^{−j} c_{j,k}²)^{1/2} ≤ m_j ≤ 2^{−jα}, is summable over j. Thus the sum in Eq. (4) converges in L²_per([0,1]). □
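The almost-sure decay of m_j used in this proof can be checked numerically; the illustrative sketch below simulates a log-normal cascade with the parameters of Fig. 2(a) and fits the decay rate of log₂ m_j across scales (the depth and seed are our choices).

```python
import numpy as np

# Numerical check for Proposition 1: for the log-normal cascade of
# Fig. 2(a), m_j = max_k |c_{j,k}| decays as a power of the scale,
# so log2(m_j) decreases roughly linearly with j.
rng = np.random.default_rng(2)
mu, sigma = -0.33 * np.log(2), np.sqrt(0.02 * np.log(2))

level = np.array([1.0])
log2_m = []
for j in range(1, 15):
    w = np.exp(rng.normal(mu, sigma, 2 * len(level)))
    level = np.repeat(level, 2) * w              # recursion of Eq. (6)
    log2_m.append(np.log2(np.abs(level).max()))

slope, _ = np.polyfit(np.arange(1, 15), log2_m, 1)
print(slope)   # negative: the maximal coefficient decays with the scale
```

The fitted slope is negative, consistent with m_j ≤ 2^{−jα} for large j and some α > 0.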

B. Regularity
The global regularity of a function is easily characterized by its orthogonal wavelet coefficients. Indeed, one can prove 77 that f(x) is uniformly Lipschitz α (for 0 < α < N_ψ) if and only if there exists a constant C such that |c_{j,k}| < C 2^{−jα} for all j and k.
Let us recall that f is said to be uniformly Lipschitz α (for 0 < α < 1) if there exists a constant C such that |f(x) − f(y)| ≤ C|x − y|^α for all x and y (for α ≥ 1, the increment is replaced by the remainder of a Taylor expansion of the appropriate order). Remark: Let us note that the 2^{j/2} factor in Eq. (3) has been chosen so that the power-law behavior of |c_{j,k}| when j → +∞ directly gives the Lipschitz regularity α (instead of α + 1/2 if there were no such factor).
Thus, as for proving the convergence, in order to get the Lipschitz regularity of f, as long as N_ψ is large enough, one just needs upper bounds on the wavelet coefficients. All the work has already been done in the previous section. The following proposition is a direct application of Lemma 1 and Lemma 2.
The local Hölder exponent 1,53 h(x₀) measures the local regularity of a function f at a given point x₀. It is defined as the greatest exponent h such that there exist a constant C and a polynomial P(x) such that

|f(x) − P(x − x₀)| ≤ C|x − x₀|^h for x in a neighborhood of x₀.

One can easily prove that h(x₀) is greater than the "maximum global regularity" of f (i.e., the maximum α such that f is uniformly Lipschitz α). Thus, Proposition 2 can alternatively be seen as a "minimum local regularity" proposition:

Corollary 1: (Minimum local regularity). Under the hypotheses (H1) and (H2) of Proposition 2, for almost all realizations of the W-cascade, the local Hölder exponent of f at any point x is greater than or equal to α_min, i.e., ∀x, α_min ≤ h(x).
All the arguments used in Lemmas 1 and 2 for deriving upper bounds on the absolute values of the wavelet coefficients can easily be inverted to get lower bounds. These new lemmas lead to a "maximum local regularity" proposition. We are not going to give the full proof of this proposition, since it is very close to the proof of the minimum local regularity proposition; we will just give its main steps.
where a_j → 0 and x_{j,k} → x₀ are defined in Eq. (5), then h ≥ h(x₀). Since |c_{j,k}| ≥ m_j > 2^{−jα}, one gets that α > h(x₀). Moreover, α can be chosen arbitrarily close to α_max, thus α_max ≥ h(x₀). □

Remark: In most common cases, W is such that the right branch F_r of F is invertible. In that case, in the same way as we obtained Eq. (10), one gets the corresponding expression for α_max. Generally, a good way to characterize the singular behavior of a function is to compute its singularity spectrum D(h) 49 [see Eqs. (9) and (11)].
Remark: Let us note that the Hölder exponents of a given function are fully characterized by its wavelet coefficients {c_{j,k}}_{j,k} [through Eq. (12)] as long as these exponents are smaller than N_ψ.
Thus the D(h) singularity spectrum of the function f(x) corresponding to a W-cascade does not depend on the analyzing wavelet chosen to build the cascade, provided N_ψ is large enough. In particular, as long as ψ satisfies N_ψ > α_max [where α_max, defined in Eq. (11), depends only on the law of W], the cascade will correspond to the same singularity spectrum.

Remark: It was proved in Ref. 81 that, for any function, the left branch D_l(h) of the singularity spectrum of f is smaller than the left branch F_l(α) of the spectrum obtained with the multifractal formalism. Since the singularity spectrum obtained with the multifractal formalism leads, for almost all realizations of a given W-cascade, to the function F(α) defined by Eq. (7), one can easily prove that for any W-cascade we have

D_l(h = α) ≤ F_l(α).
Definition 1: From now on, the function F(␣) [Eq. (7)] will be referred to as the statistical spectrum of the W -cascade.
Both spectra D(h) and F(α) bring valuable information on the W-cascade. The D(h) spectrum was initially introduced for characterizing the singular behavior of deterministic fractal signals. It was proved 52,58 that, for a large class of self-similar functions, the D(h) spectrum can be obtained using the wavelet-based multifractal formalism. In the case of random W-cascades, we actually get two spectra: the spectrum D(h) of each realization (which a priori depends on the realization) and the statistical spectrum F(α), which characterizes the probability that a given singular behavior appears in a realization of the cascade. Thus, for instance, the maximum of F(α) corresponds to the most probable singular behavior in a realization of a W-cascade. On the other hand, the negative values of F(α) correspond to "rare" events that one should not expect to observe in almost all realizations. In the next section, we will show that, in the case of W-cascades, the wavelet-based multifractal formalism 50-54 actually leads to a very reliable numerical estimation of the F(α) spectrum. Along with the kernel function 41,68-72 that we will introduce in Sec. IV and the correlation functions 73 in Sec. V, these statistical quantities allow a very accurate characterization of the random process.
Before moving on, let us first illustrate these notions with some numerical simulations of W-cascades corresponding to different laws for W.

C. Numerical simulations of W-cascades
As stated in the Introduction, random cascade models have been introduced in the context of the phenomenological study of fully developed turbulence. 27,29,30,33-42 They were proposed to mimic, in some sense, the kinetic energy transfer from coarser scales to smaller ones. Log-normal statistics were first conjectured, 40 years ago, by Kolmogorov 33 and Obukhov 34 in order to account for the so-called "intermittency phenomenon," while the "log-Poisson" model 39 has been recently proposed by She and Lévêque 42 as a more accurate description of this phenomenon. Let us illustrate the results discussed above on these two models when extrapolated to multifractal functions.

Log-normal W-cascades
Let us first consider the case where W is a log-normal random variable. If μ and σ² are, respectively, the mean and the variance of ln|W|, then a straightforward computation leads to

τ(q) = −qμ/ln 2 − q²σ²/(2 ln 2) − 1,
F(α) = 1 − (α + μ/ln 2)² ln 2/(2σ²),

so that F(α) vanishes at

α_min = −μ/ln 2 − (2σ²/ln 2)^{1/2} and α_max = −μ/ln 2 + (2σ²/ln 2)^{1/2}.  (15)

In Fig. 2, we illustrate one realization of a "very irregular" (α_min = 0.13) log-normal W-cascade as well as one realization of a "more regular" one (α_min = 0.3). The τ(q) and F(α) spectra corresponding to the irregular W-cascade are displayed in the same figure.
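The end points of Eq. (15) are straightforward to evaluate; the following sketch (the helper name is ours) reproduces the values quoted in Fig. 2 for the two parameter sets.

```python
import numpy as np

def lognormal_alpha_bounds(mu, sigma2):
    """End points of the statistical spectrum of a log-normal W-cascade,
    F(alpha) = 1 - (alpha + mu/ln2)^2 * ln2/(2*sigma2) [Eq. (15)]:
    the two roots of F(alpha) = 0."""
    ln2 = np.log(2)
    center = -mu / ln2
    half_width = np.sqrt(2 * sigma2 / ln2)
    return center - half_width, center + half_width

# Parameters of the two cascades of Fig. 2:
print(lognormal_alpha_bounds(-0.33 * np.log(2), 0.02 * np.log(2)))   # ~ (0.13, 0.53)
print(lognormal_alpha_bounds(-0.8 * np.log(2), 0.125 * np.log(2)))   # ~ (0.3, 1.3)
```

Decreasing σ² at fixed μ narrows the spectrum and raises α_min, which is why the cascade of Fig. 2(b) looks smoother than that of Fig. 2(a).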

Log-Poisson W-cascades
Let λ be the mean (and thus the variance) of the Poisson variable P. We consider that the law of ln|W| is the same as that of P ln δ + γ. A straightforward computation leads to

τ(q) = −[qγ + λ(δ^q − 1)]/ln 2 − 1.

In Fig. 3, one realization of a log-Poisson W-cascade is shown together with the corresponding τ(q) and F(α) spectra.
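A short sketch (with the parameter values of Fig. 3; the function name and grids are ours) evaluates this τ(q) and recovers F(α) by a numerical Legendre transform, as in Eq. (19).

```python
import numpy as np

def tau_log_poisson(q, lam=2.0, delta=0.88, gamma=-0.11):
    """tau(q) = -[q*gamma + lam*(delta**q - 1)]/ln2 - 1 for a log-Poisson
    W-cascade: ln|W| has the law of P*ln(delta) + gamma, with P Poisson
    of mean lam. Parameter values are those of Fig. 3."""
    return -(q * gamma + lam * (delta**q - 1)) / np.log(2) - 1

q = np.linspace(-10.0, 30.0, 8001)
tau = tau_log_poisson(q)

# F(alpha) by numerical Legendre transform: inf_q (q*alpha - tau(q)).
alphas = np.linspace(0.0, 1.5, 301)
F = np.array([np.min(a * q - tau) for a in alphas])

print(tau_log_poisson(0.0))   # tau(0) = -1 by construction
print(F.max())                # max of F is -tau(0), i.e. ~ 1
```

Unlike the log-normal case, τ(q) is not parabolic here, so F(α) is an asymmetric spectrum.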

D. Computing the F(α) statistical spectrum using the multifractal formalism approach
FIG. 2. Log-normal W-cascades. (a) Realization of the random function corresponding to a log-normal W-cascade using the "Daubechies 5" compactly supported orthonormal wavelet basis. 56 The law of ln|W| is Gaussian with mean μ = −0.33 ln 2 and variance σ² = 0.02 ln 2. From Eq. (15) one gets α_min = 0.13 and α_max = 0.53. (b) Realization of the random function corresponding to a log-normal W-cascade using the "Daubechies 5" wavelet with the following parameter values: μ = −0.8 ln 2 and σ² = 0.125 ln 2. From Eq. (15) one gets α_min = 0.3 and α_max = 1.3. The fact that α_min is greater for the cascade in (b) (α_min = 0.3) than for the cascade in (a) (α_min = 0.13) explains why the graph in (a) appears much more irregular than the graph in (b). (c) The τ(q) function [Eq. (8)] for the W-cascade illustrated in (a). The symbols (●) correspond to the data computed using the WTMM method [Eq. (16)] with an order-2 spline wavelet on 1000 realizations of length 65536 of the W-cascade. These numerical data are in remarkable agreement with the theoretical prediction (solid line); this illustrates the fact that the determination of the τ(q) spectrum can be performed using any analyzing wavelet (i.e., not necessarily the one that was used for building the cascade). (d) The F(α) statistical singularity spectrum [Eq. (7)] for the W-cascade illustrated in (a). The numerical spectrum (●) was obtained by Legendre transforming the τ(q) data in (c). The theoretical spectrum (solid line) provides a remarkable fit of the data.

Let us imagine that we have a large number of numerical signals which correspond to different realizations of the same random function f(x). In order to characterize the self-similar behavior of the underlying cascade process, one could try to compute the F(α) statistical spectrum. In
In order to do so, one can use the wavelet-based multifractal formalism approach,50-54 which consists in computing a partition function Ẑ_j(q) corresponding, at each scale 2^{-j}, to the spatial average of the wavelet coefficients raised to the power q:

Ẑ_j(q) = Σ_k |c_{j,k}|^q. (16)

If the number of realizations is large enough, one can approximate realization averages by probability averages and get

Z_j(q) = 2^j E(|c_{j,k}|^q). (17)

Since the law of c_{j,k} is the same as the law of W_1 ... W_j, one finally gets

Z_j(q) = 2^j (E|W|^q)^j = 2^{-jτ(q)}, with τ(q) = -log_2 E(|W|^q) - 1. (18)

The τ(q) function [and consequently the F(α) spectrum] is obtained by analyzing the power-law scaling of Z_j(q) along the scales a_j = 2^{-j}, and

F(α) = inf_q (qα - τ(q)). (19)

As long as the number of realizations is large, this approach leads to very precise estimations of F(α). However, from a practical point of view, we have made a major assumption: we have assumed that the realizations of the wavelet coefficients {c_{j,k}}_{j,k} were known. This is clearly not the case, since the only way to recover them from the realizations of f is to compute the scalar products of these realizations with the ψ_{j,k}'s; but we do not know what the analyzing wavelet ψ is! What happens if we analyze the function f with a different analyzing wavelet ψ¹? Indeed, one can show55-57 that the new wavelet coefficients {c^{(1)}_{j,k}} can be expressed as a linear combination of the old ones:

c^{(1)}_{j,k} = Σ_{j',k'} K_{ψ,ψ¹}(j,k;j',k') c_{j',k'}, (20)

where K_{ψ,ψ¹} is a function that depends on ψ and ψ¹ only. This function basically corresponds to the scalar product of ψ_{j,k} with ψ¹_{j',k'}. Let us note that the function K is localized in both variables and, from a numerical point of view, if ψ and ψ¹ are both well localized in Fourier and direct spaces, then the sum in Eq. (20) involves only a few terms. Before showing numerical applications, let us try to understand roughly how this affects the computation of F(α).

FIG. 3. (a) Realization of the random function corresponding to a log-Poisson W-cascade built using the "Daubechies 5" wavelet.56 The mean of the Poisson variable P is λ = 2 and the law of ln|W| is the same as that of P ln δ + γ, with δ = 0.88 and γ = −0.11. (b) The τ(q) spectrum [Eq. (8)] for the W-cascade illustrated in (a). The data (●), computed using the WTMM method [Eq. (16)] with an order-2 spline analyzing wavelet on 1000 realizations of length 65535 of the W-cascade, are in perfect agreement with the theoretical prediction (solid line). (c) The F(α) statistical singularity spectrum [Eq. (7)] for the W-cascade illustrated in (a). The numerical spectrum (●) obtained by Legendre transforming the τ(q) data in (b) is compared to the theoretical spectrum (solid line).
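For a log-normal W-cascade [ln|W| Gaussian with mean m and variance σ², as in Fig. 2], Eqs. (18) and (19) admit closed forms that are easy to check numerically. The following sketch (standard-library Python; the function names are ours, and the finite q grid is a numerical convenience) computes τ(q), the Legendre transform F(α), and the support ends α_min, α_max where F vanishes:

```python
import math

LN2 = math.log(2.0)

def tau_lognormal(q, m, var):
    """tau(q) = -1 - log2 E|W|^q for ln|W| ~ N(m, var) [Eq. (18)],
    using E|W|^q = exp(q*m + q^2*var/2)."""
    return -1.0 - (q * m + 0.5 * q * q * var) / LN2

def alpha_range(m, var):
    """Ends (alpha_min, alpha_max) of the support of F(alpha), where the
    parabolic spectrum crosses zero."""
    center = -m / LN2
    half = math.sqrt(2.0 * var / LN2)
    return center - half, center + half

def legendre_F(alpha, m, var, qgrid):
    """F(alpha) = inf_q (q*alpha - tau(q)) [Eq. (19)] on a finite q grid."""
    return min(q * alpha - tau_lognormal(q, m, var) for q in qgrid)
```

With the parameters quoted in the caption of Fig. 2, alpha_range reproduces the supports (0.13, 0.53) and (0.3, 1.3), and the maximum of F(α), reached at α = −m/ln 2, equals 1 (the signal support dimension).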
Let us note that the F(α) spectrum [and this is true also for any statistical quantity that is based on how the details of f(x) change along scales] is not changed when the underlying W-cascade is modified independently of the scale. For instance, if one multiplies each wavelet coefficient c_{j,k} [defined recursively in Eq. (6)] by a random variable X_{j,k} whose law depends neither on j nor on k, then the power-law behavior of Z_j(q) along the scales does not change. Actually, this is exactly what happens if one replaces each coefficient c_{j,k} by a linear combination of itself and its "sons" coefficients c_{j+1,2k} and c_{j+1,2k+1}:

c'_{j,k} = λ_1 c_{j,k} + λ^{(l)} c_{j+1,2k} + λ^{(r)} c_{j+1,2k+1}. (21)
Indeed, it is easy to prove that the law of c'_{j,k} is the same as the law of W_1 ... W_j (λ_1 + λ^{(l)} W^{(l)} + λ^{(r)} W^{(r)}), which can be rewritten as c_{j,k} X_{j,k} with X_{j,k} = λ_1 + λ^{(l)} W^{(l)} + λ^{(r)} W^{(r)} (whose law does not depend on j or k). It is somewhat more intricate when one performs a linear combination of the coefficients along the space axis:

c'_{j,k} = λ_1 c_{j,k} + λ_2 c_{j,k+1}. (22)
One can easily prove that, among the 2^j coefficients at the scale 2^{-j}, 2^l have, with their right neighbor, their first common ancestor at the scale 2^{-l}. Thus one can express the new partition function Z'_j(q) as a sum over these ancestor depths [Eq. (23)], where the W_i^{(1)} and the W_i^{(2)} are i.i.d. random variables with the same law as W. Let T_l be the lth term in the latter sum. Let us note that the last term T_{j-1} behaves as Z_j(q), i.e., as C(q) 2^{-jτ(q)} [Eq. (24)], where C(q) does not depend on j. Thus one just needs to get an upper bound on all the other terms, with constants C_1(q), C_2(q) and C_3(q) that do not depend on j. From this last inequality and from Eq. (24), one deduces that the new partition function Z'_j(q) [Eq. (23)] behaves as expected, i.e., Z'_j(q) ∼ 2^{-jτ(q)}. Thus, as for Eq. (21), when one performs the linear combination on the wavelet coefficients corresponding to Eq. (22), the partition function displays exactly the same power-law behavior when the scale goes to 0. Actually, one can easily prove that this result still holds when one combines the two linear combinations (21) and (22).
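These invariance statements can be probed by simulation. The sketch below (standard-library Python; the function names and parameter choices are ours, with |W| taken log-normal so that all coefficients are positive) synthesizes the dyadic coefficient tree of Eq. (6) and evaluates the partition sum of Eq. (16) for both the original coefficients and the spatial combination of Eq. (22):

```python
import math
import random

def cascade_levels(J, m, sigma, rng):
    """One realization of the dyadic coefficient tree of Eq. (6):
    c_{j+1,2k} = W^(l) c_{j,k}, c_{j+1,2k+1} = W^(r) c_{j,k},
    with ln|W| ~ N(m, sigma^2) (log-normal W-cascade)."""
    levels = [[1.0]]
    for _ in range(J):
        nxt = []
        for c in levels[-1]:
            nxt.append(c * math.exp(rng.gauss(m, sigma)))
            nxt.append(c * math.exp(rng.gauss(m, sigma)))
        levels.append(nxt)
    return levels

def partition(levels, q, lam1=1.0, lam2=0.0):
    """Partition sum over |lam1*c_{j,k} + lam2*c_{j,k+1}|^q at each level j
    [Eq. (16) applied to the combined coefficients of Eq. (22)];
    the last pair wraps around, a negligible boundary effect."""
    zs = []
    for cs in levels:
        n = len(cs)
        zs.append(sum(abs(lam1 * cs[k] + lam2 * cs[(k + 1) % n]) ** q
                      for k in range(n)))
    return zs
```

Averaged over a few hundred realizations with m = −0.33 ln 2 and σ² = 0.02 ln 2, both partition functions exhibit, for q = 2, the log₂ slope −τ(2) = 0.38 predicted by Eq. (18), in line with Proposition 4.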
Proposition 4: Let us consider a given W-cascade associated to the random variable W [Eq. (6)]. Let Z_j(q) be the associated partition function defined by Eq. (17) and let λ_1, λ_2, λ^{(l)} and λ^{(r)} be in R. If we redefine the wavelet cascade coefficients c_{j,k} in the following way:

c'_{j,k} = λ_1 c_{j,k} + λ_2 c_{j,k+1} + λ^{(l)} c_{j+1,2k} + λ^{(r)} c_{j+1,2k+1},

then the newly obtained partition function Z'_j(q) behaves as the first one: Z'_j(q) ∼ Z_j(q) ∼ 2^{-jτ(q)}.

Proof: The proof is straightforward and left to the reader. □

We thus expect the multifractal formalism to lead to a good determination of the F(α) statistical spectrum independently of the considered analyzing wavelet. Actually, it is likely to provide a good estimation of the left branch of F(α) (q > 0) but not of the right branch, which corresponds to negative values of q. Indeed, the linear combination (20) might lead to null wavelet coefficients that would induce instabilities in the computation of Ẑ_j(q) [Eq. (16)] for q < 0. In order to circumvent these instabilities, one should use the Wavelet Transform Modulus Maxima (WTMM) method introduced in Refs. 50-54. It basically consists in computing the partition function only over the local modulus maxima of the wavelet transform. Let us recall that the modulus maxima {x_i(a)}_i of the continuous wavelet transform are defined, at each scale a, as the positions of the local maxima of the absolute value of the wavelet transform.80,82 These maxima lie on connected curves called maxima lines. The set of all the maxima lines existing at scale a will be denoted L(a). The WTMM method then consists in replacing the partition function Ẑ_j(q) by a new partition function, stable for all q in R,

Z(q,a) = Σ_{l ∈ L(a)} ( sup_{(b,a') ∈ l, a' ≤ a} |T_{ψ¹}(b,a')| )^q,

where T_{ψ¹}(b,a) corresponds to the continuous wavelet transform of f(x) at scale a and position b using the analyzing wavelet ψ¹.
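A minimal illustration of the maxima-restricted partition sum (standard-library Python; the sampled values of |T_{ψ¹}(b,a)| at one scale are assumed given as an array, and the function names are ours):

```python
def modulus_maxima(w):
    """Indices b where |T(b,a)| has a local maximum at this scale:
    the one-scale skeleton on which the WTMM sums are evaluated."""
    a = [abs(x) for x in w]
    return [b for b in range(1, len(a) - 1)
            if a[b] > a[b - 1] and a[b] >= a[b + 1]]

def wtmm_partition(w, q):
    """Partition sum restricted to modulus maxima; unlike the plain sum
    over all coefficients, it remains stable for q < 0 because the
    near-zero values between maxima are discarded."""
    a = [abs(x) for x in w]
    return sum(a[b] ** q for b in modulus_maxima(w))
```

In the full WTMM method one would, in addition, chain these maxima across scales into the maxima lines of L(a) and take the supremum of |T| along each line; only the one-scale skeleton is sketched here.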
As shown in Figs. 2 and 3, the WTMM method leads to a very good estimation of the F(α) statistical spectrum, whatever the law of W and the analyzing wavelet ψ¹ are.

IV. THE SELF-SIMILARITY KERNEL
The function f(x) associated to a W-cascade is self-similar in the sense that the details of f at large scales are "similar" to its details at smaller scales, up to a normalization factor. Let us look at how this property translates on the laws of the wavelet coefficients. Let P_j be the probability density function (p.d.f.) of the coefficients |c_{j,k}| (P_j does not depend on k) and let P_j^{(log)} be the p.d.f. of ln|c_{j,k}|. If j_2 > j_1, then |c_{j_2,k}| has the same law as |c_{j_1,k}| |W_1| ... |W_{j_2-j_1}|, where the |W_i| are i.i.d. copies of |W| independent of c_{j_1,k}. Taking logarithms, this relation can be rewritten as

P_{j_2}^{(log)} = G_{j_1,j_2} * P_{j_1}^{(log)}, (26)

where * denotes the convolution product and G_{j_1,j_2}(x) = G * ... * G (j_2 - j_1 times), where G(x) is the p.d.f. of ln|W|.41,68-72 In Fourier space, one gets

P̂_{j_2}^{(log)}(p) = Ĝ(p)^{s(j_1,j_2)} P̂_{j_1}^{(log)}(p). (27)

In the case of the W-cascades, s(j_1,j_2) = j_2 - j_1 represents the number of steps of the cascade from the scale 2^{-j_1} to the scale 2^{-j_2}. Of course, one cannot pick any function s. It must satisfy the "transitivity" relation

s_{j_1,j_3} = s_{j_1,j_2} + s_{j_2,j_3}, (28)

and the "reflexivity" relation

s_{j,j} = 0. (29)

Using any function s that satisfies both Eqs. (28) and (29), relation (27) can be seen as a first-order self-similarity property that links the details at a scale 2^{-j_2} to the details at a larger scale 2^{-j_1}. The link is made through the self-similarity kernel G_{j_1,j_2}(x), whose Fourier transform is of the form Ĝ^{s(j_1,j_2)}. In the physical space, the kernel relation becomes41,68-72,83-87

P_{j_2}(e^x) = ∫ G_{j_1,j_2}(u) e^{-u} P_{j_1}(e^{x-u}) du. (30)
As we have seen, in the case of W-cascades, the kernel function G_{j_1,j_2}(x) depends only on j_2 - j_1, i.e., only on the logarithm of the ratio of the two corresponding scales 2^{-j_1} and 2^{-j_2}. This can be seen as a "scale-stationarity" property of the self-similarity kernel.38 In the following, a function that satisfies Eq. (27) with s(j_1,j_2) = j_2 - j_1 will be referred to as a scale-similar function.
Let us note that, in the case of scale-similar functions, the self-similarity kernel is directly linked to the statistical spectrum obtained by the multifractal formalism [Eq. (19)]. Indeed, the partition function [Eq. (17)] can be rewritten as an integral against P_j^{(log)}. Using relation (27) in the scale-similar case [i.e., s(j_1,j_2) = j_2 - j_1] and relation (18), one gets that the self-similarity kernel and the τ(q) spectrum are linked by the relation

τ(q) = -1 - log_2 Ĝ(-iq). (31)

Conversely, relation (27) yields Ĝ(p)^{j_2-j_1} = P̂_{j_2}^{(log)}(p)/P̂_{j_1}^{(log)}(p), i.e., the self-similarity kernel is obtained by performing the deconvolution of P_{j_2}^{(log)} by P_{j_1}^{(log)}. As pointed out in the remark just below, this deconvolution requires some special care in order to avoid numerical instabilities.68,69

Remark: In order to compute Ĝ, one has to perform a deconvolution. The deconvolution is performed in Fourier space and thus consists basically in dividing P̂_{j_2}^{(log)} by P̂_{j_1}^{(log)}. This division is unstable in the neighborhood of the (high) frequencies p for which P̂_{j_1}^{(log)}(p) ≃ 0. The smaller the support of P_{j_1}^{(log)}, the slower the decay of P̂_{j_1}^{(log)} and thus the more stable the deconvolution. In order to decrease the support of P_{j_1}^{(log)}, one could compute the p.d.f. P_j^{(log),(max)} of the logarithm of the values of the modulus maxima of the continuous wavelet transform, instead of computing the p.d.f. P_j^{(log)} of the logarithm of the orthogonal wavelet coefficients. Since, in the case of deterministic self-similar signals, the self-similarity properties are captured by the modulus maxima,53,54,88 it is likely that, in the stochastic case, the self-similarity relation (26) still holds if P_j^{(log)} is replaced by P_j^{(log),(max)}. Actually, from a numerical point of view, one can check68-70 that it does hold with a very good precision for any scale a, and not only for the dyadic scales a_j = 2^{-j}.
Since the support of P_a^{(log),(max)} is much smaller than the one of P_a^{(log)}, it provides a much more stable numerical method for performing the deconvolution and thus for computing the kernel G_{a,a'}. Figures 4 and 5 report the results of the numerical computation of the self-similarity kernel of log-normal W-cascades when using different analyzing wavelets ψ.68-70 For the same reasons as the ones we previously mentioned when estimating the F(α) statistical spectrum, since Ĝ depends only on τ(q) [Eq. (31)], its estimation should not depend upon the choice of ψ. This is clearly verified in the numerical simulations reported in Figs. 4 and 5. Moreover, the data shown in Fig. 4 are in remarkable agreement with the theoretical shape of a log-normal kernel

Ĝ_{a,a'}(p) = e^{ip m(a,a') - p² σ²(a,a')/2}, (32)

where

m(a,a') = -(m/ln 2) ln(a/a'), σ²(a,a') = -(σ²/ln 2) ln(a/a'). (33)

For values of |p| ≤ 7, one does not see in the numerical data any significant departure from the Gaussian behavior of the kernel modulus |Ĝ_{a,a'}|, nor from the linear behavior of its phase Φ_{a,a'}. As far as the scale dependence of the self-similarity kernel is concerned, we have plotted in Fig. 5 m(a,a') = ∂Im(Ĝ_{a,a'})/∂p|_{p=0} and σ²(a,a') = -∂²(ln|Ĝ_{a,a'}|)/∂p²|_{p=0} as functions of ln(a/a'). One can see that, for different values of the reference scale a', all the points obtained when varying the scale a fall on a unique straight line which matches perfectly the theoretical predictions [Eq. (33)] and confirms the scale-similarity of the log-normal W-cascade under study.

FIG. 4. Numerical computation of the self-similarity kernel of a log-normal W-cascade with parameters m = −0.37 ln 2 and σ² = 0.026 ln 2. Ĝ_{a,a'}(p) = |Ĝ_{a,a'}| e^{iΦ_{a,a'}} as computed for a/a' = 5, when using the Haar wavelet (●), an order-1 spline wavelet (○) and the complex Morlet wavelet.56 (a) |Ĝ_{a,a'}| vs p; (b) Φ_{a,a'} vs p. The solid lines correspond to the theoretical predictions given by Eq. (32).

FIG. 5. Numerical computation of the scale dependence of the self-similarity kernel Ĝ_{a,a'}(p) of the log-normal W-cascade studied in Fig. 4. (a) m(a,a') = ∂Im(Ĝ_{a,a'})/∂p|_{p=0} vs ln(a/a'); (b) σ²(a,a') = −∂²(ln|Ĝ_{a,a'}|)/∂p²|_{p=0} vs ln(a/a'). The symbols correspond to the following values of the reference scale: a' = 2⁵ (●), 2⁶ (○), 2⁷, 2⁸ (□), 2⁹ (×) and 2¹⁰ (△). The solid lines correspond to the theoretical predictions given by Eq. (33).

V. CORRELATION FUNCTIONS IN W-CASCADES
The tree structure of a W-cascade induces correlations between the different details of the corresponding function f(x).73,88,89 These correlations can be characterized by computing the correlation between two wavelet coefficients at an arbitrary scale a = 2^{-j} and at a distance Δx = 2^{-j}Δk. Since the wavelet coefficients {c_{j,k}}_k at a given scale 2^{-j} are not stationary in k, we will compute an "averaged version" C(Δx,a) of the correlation function [Eq. (34)], namely the average over k of the covariance of ln|c_{j,k}| and ln|c_{j,k+Δk}|.73

Proposition: Let us fix k and set k_1 = k, k_2 = k + Δk. Let us suppose that the last common ancestor (on the binary tree of the W-cascade) of c_{j,k_1} and c_{j,k_2} is at the scale 2^{-d(j,k_1,k_2)} [in the following, d(j,k_1,k_2) will be referred to as the W-distance between the two wavelet coefficients]. Then the covariance of ln|c_{j,k_1}| and ln|c_{j,k_2}| is σ² d(j,k_1,k_2) [Eq. (35)], where σ² denotes the variance of ln|W|.
From this Proposition, one easily deduces the asymptotic behavior of the correlation function:73

Corollary 3: When Δx is small (a < Δx ≪ 1), the correlation function C(Δx,a) [Eq. (34)] of a W-cascade behaves as a logarithm function:

C(Δx,a) = -σ² log_2(Δx) + o(Δx). (36)

Thus, asymptotically, the correlation function does not depend on the scale a. From a numerical point of view, the cascade is constructed from the scale 1 (j = 0) down to a small scale 2^{-J} (corresponding to the sampling rate of the numerical signal). If, on the contrary, we consider that the sampling rate is 1, then the signal has a total size L = 2^J; increasing J amounts to building a longer signal. The last corollary then means that73

C(Δx,a) ∼ σ² log_2(L/Δx), (37)

when a < Δx ≪ L.
Using the same kind of computations, one gets that the "two-scale" correlation function C(Δx,a,a') between the coefficients at scale a and the coefficients at scale a' follows the same law as C(Δx,a) as long as Δx is greater than the supremum of a and a':73

C(Δx,a,a') ∼ σ² log_2(L/Δx), (38)

when sup(a,a') < Δx ≪ L. All these results are illustrated in Fig. 6 in the case of a log-normal W-cascade. As seen in Figs. 6(a) and 6(b), the numerical computations of both the "one-scale" C(Δx,a) and the "two-scale" C(Δx,a,a') correlation functions are in very good agreement with the theoretical predictions given by Eqs. (37) and (38).
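The key quantity behind these results is the W-distance d(j,k_1,k_2): the two log-coefficients share the d multipliers ln|W| of their common ancestors, so their covariance is proportional to d. A small standard-library Python sketch (function names ours; var stands for the variance of ln|W|):

```python
def w_distance(j, k1, k2):
    """Depth d of the last common ancestor of c_{j,k1} and c_{j,k2}
    on the dyadic tree (the W-distance of the Proposition)."""
    d = j
    while k1 != k2:
        k1 //= 2
        k2 //= 2
        d -= 1
    return d

def avg_log_correlation(j, dk, var):
    """Average over k of Cov(ln|c_{j,k}|, ln|c_{j,k+dk}|) = var * d(j,k,k+dk);
    it grows like var * log2(L/dx) up to an O(1) offset [Eqs. (36)-(37)]."""
    n = 2 ** j
    return var * sum(w_distance(j, k, k + dk) for k in range(n - dk)) / (n - dk)
```

Doubling Δk decreases the average ancestor depth by about one cascade step, which is precisely the logarithmic decay of Corollary 3.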
Remark: By the same kind of arguments as the ones used in Sec. III, one expects Eqs. (37) and (38) to hold even when the correlation functions are computed using an analyzing wavelet ψ¹ different from the wavelet that was used to build the W-cascade.

VI. CONCLUSION
To summarize, we have presented a first theoretical step towards a rigorous mathematical treatment of random cascading processes on the dyadic tree of their orthogonal wavelet coefficients. We have elaborated on the convergence of these W-cascades and discussed the regularity of the limiting random functions by studying the support of their singularity spectra. We have shown mathematically, and checked numerically on various computer-synthesized signals, that very different statistical quantities such as the statistical spectrum, the self-similarity kernel and the correlation functions can be extracted directly from the fractal function using its wavelet decomposition (orthogonal, continuous or its associated modulus maxima) with an arbitrary analyzing wavelet. This mathematical study actually provides algorithms that are readily applicable to experimental situations. Recent applications of our methodology in the context of fully developed turbulence69,70,73 have revealed the existence of a (non-scale-invariant) log-normal cascading process underlying the turbulent velocity fluctuations. More surprising are the results of a similar investigation of financial time series.90 Underlying the fluctuations of the volatility (standard deviation) of the price variations, there exists a causal information cascade from large to small time scales that can be visualized with the wavelet representation. Let us emphasize that the fact that variations of prices over a one-month scale influence the future daily price variations is likely to be extraordinarily rich in consequences, not only for the fundamental understanding of the nature of financial markets, but also (and maybe more importantly) for practical applications. Indeed, the nature of the correlations across scales that are implied by this causal cascade has profound implications on market risk, a problem of utmost concern for all financial institutions as well as individuals.
These preliminary results are very promising as far as further experimental investigations of multiplicative cascade processes are concerned. There is no doubt in our minds that similar wavelet-based statistical analyses will lead to significant progress in fields other than hydrodynamic turbulence and finance.

ACKNOWLEDGMENTS
We are very grateful to S. Manneville and S. G. Roux for interesting discussions and technical assistance.

APPENDIX A: PROOF OF LEMMA 1
We want to prove the following lemma (Sec. III A).

Lemma 1: Let us consider the wavelet coefficients {c_{j,k}}_{j,k} of a given W-cascade associated to the random variable W [Eq. (6)]. Let m_j = max_k |c_{j,k}| (m_0 = 1) and let Q_j^α be the subset of the probability space Ω

Q_j^α = {ω ∈ Ω, m_j > 2^{-jα}}.

Proof: Let q_j^α = Prob{m_j > 2^{-jα}}. By decomposing the binary tree into two binary subtrees, we obtain that m_j has the same law as

max(|W^{(l)}| m_{j-1}^{(l)}, |W^{(r)}| m_{j-1}^{(r)}),

where W^{(l)} and W^{(r)} are i.i.d. random variables with the same law as W, and where m_{j-1}^{(l)} and m_{j-1}^{(r)} are i.i.d. random variables (independent from W^{(l)} and W^{(r)}) with the same law as m_{j-1}. Since all the involved variables are independent, we get

q_j^α = 1 - Prob{|W_1| m_{j-1} ≤ 2^{-jα}}²,

where W_1 is a random variable (independent from m_{j-1}) with the same law as W. By again decomposing each subtree into two subtrees, we get

q_j^α = 1 - Prob{|W_1| m_{j-2}^{(l)} |W^{(l)}| ≤ 2^{-jα} and |W_1| m_{j-2}^{(r)} |W^{(r)}| ≤ 2^{-jα}}²,

with the same notations as before. This time, since the same variable W_1 appears on both sides of the "and," we cannot just split the two terms on each side of the "and" and keep the equality with q_j^α. Actually, one can easily prove that for any independent random variables X_0, X_1 and Y, if X_0 and X_1 have the same law as some variable X, then

Prob{X_0 Y ≤ a and X_1 Y ≤ a} ≥ Prob{XY ≤ a}².
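This last inequality follows by conditioning on Y and applying Jensen's inequality to F(a/Y)², where F is the common distribution function of X_0 and X_1. It can also be checked exactly on small discrete laws (standard-library Python; the uniform discrete setup and function name are ours):

```python
from itertools import product

def pair_probabilities(xs, ys, a):
    """Exact enumeration for X0, X1 i.i.d. uniform on xs and Y uniform on ys:
    returns (P{X0*Y <= a and X1*Y <= a}, P{X*Y <= a}^2)."""
    n, m = len(xs), len(ys)
    lhs = sum(1 for x0, x1, y in product(xs, xs, ys)
              if x0 * y <= a and x1 * y <= a) / (n * n * m)
    rhs = (sum(1 for x, y in product(xs, ys) if x * y <= a) / (n * m)) ** 2
    return lhs, rhs
```

Conditioning on Y makes the two events positively correlated, which is the direction of the inequality.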