Stochastic dynamics of adaptive trait and neutral marker driven by eco-evolutionary feedbacks

How the neutral diversity is affected by selection and adaptation is investigated in an eco-evolutionary framework. In our model, we study a finite population in continuous time, where each individual is characterized by a trait under selection and a completely linked neutral marker. Population dynamics are driven by births and deaths, mutations at birth, and competition between individuals. Trait values influence ecological processes (demographic events, competition), and competition generates selection on trait variation, thus closing the eco-evolutionary feedback loop. The demographic effects of the trait are also expected to influence the generation and maintenance of neutral variation. We consider a large population limit with rare mutation, under the assumption that the neutral marker mutates faster than the trait under selection. We prove the convergence of the stochastic individual-based process to a new measure-valued diffusive process with jumps that we call Substitution Fleming-Viot Process (SFVP). When restricted to the trait space this process is the Trait Substitution Sequence first introduced by Metz et al. (1996). During the invasion of a favorable mutation, a genetical bottleneck occurs and the marker associated with this favorable mutant is hitchhiked. By rigorously analysing the hitchhiking effect and how the neutral diversity is restored afterwards, we obtain the condition for a time-scale separation; under this condition, we show that the marker distribution is approximated by a Fleming-Viot distribution between two trait substitutions. We discuss the implications of the SFVP for our understanding of the dynamics of neutral variation under eco-evolutionary feedbacks and illustrate the main phenomena with simulations. Our results highlight the joint importance of mutations, ecological parameters, and trait values in the restoration of neutral diversity after a selective sweep.


Introduction
The science of biodiversity currently faces the challenge of understanding how ecological processes shape evolutionary change, and reciprocally how evolution affects the structure and function of ecological systems (Schoener [43]). Such eco-evolutionary feedbacks determine the dynamics of so-called adaptive traits, that is, quantitative characters that are heritable yet mutable from parent to offspring (Dieckmann and Law [15], Metz et al. [36]). Under the combined assumptions of large population and rare mutation scalings, the time evolution of an adaptive trait can be described as a sequence of mutant invasions, each being driven by positive selection in the ecological context set by the 'resident' value of the adaptive trait (Metz et al. [37]). The resulting evolutionary model is a jump process called the Trait Substitution Sequence (TSS): every new trait mutant either goes extinct or replaces the resident, causing the TSS to jump from the former resident population equilibrium to a new equilibrium (Metz et al. [36], Champagnat [4] and Champagnat et al. [5]). In population genetics, these jumps are known as selective sweeps (Barton [1], Stephan et al. [46]). Previous works [30,9,24,47,31,18] support the view that the TSS as a model of long-term phenotypic evolution is relatively insensitive to the details of the genetic determination of the trait.
Whereas eco-evolutionary feedbacks can result in variation of adaptive traits among populations (and even within populations when evolutionary branching occurs, Geritz et al. [22]), much of the molecular diversity measured by population geneticists involves DNA sequences of no known adaptive value, i.e. selectively neutral. A neutral sequence that is physically linked in the genome to the sequence that codes for the adaptive trait is called a marker of that trait. A longstanding question in evolutionary theory is how variation in such molecular markers evolves, and how patterns of neutral molecular evolution can be used to infer the history of the trait mutations that have driven past adaptation.
When adaptive mutations are rare, adaptation proceeds as a series of selective sweeps: a trait mutation occurs while the population is monomorphic for the trait, and increases rapidly in frequency toward fixation. Following on from Kojima and Schaffer [28], Maynard Smith and Haigh [33] pointed out that selective sweeps purge genetic variation at linked sites: a particular marker allele goes to fixation as a consequence of linkage with the selected allele, a phenomenon they dubbed the 'hitchhiking effect'. Maynard Smith and Haigh's deterministic model was revisited in a stochastic approach by Ohta and Kimura [39]. These seminal studies of hitchhiking focused on the short-term dynamics of an interaction between two alleles at the locus under selection and two alleles at the neutral locus. Long-term dynamics were considered first by Kaplan et al. [27] who developed a stochastic model for finite populations to describe the effect of recurrent hitchhiking. In order to describe stationary levels of nucleotide diversity at the marker locus, they used the infinite site model and a coalescent approach under the assumption of constant population size and constant selection coefficients. This has generated an abundant theoretical literature on modeling the impact of selection on neutral polymorphism (Barton [2], Etheridge et al. [20], Durrett-Schweinsberg [17] and references therein). Recent deterministic models have relaxed the assumption of constant selection either because of the presence of genetic backgrounds (e.g. assuming a quantitative trait [8]) or in the case of a parasite, because of the complexity of the demographic events involved in the life cycle [42]. All previous models assume constant population size and constant selection, or that the population size is independent of the selective value of the individuals.
In this article, our goal is to relax these key assumptions. Under general ecological scenarios, eco-evolutionary feedbacks operate: as the adaptive trait evolves, population size and selection co-vary. The eco-evolutionary process of adaptive trait and neutral marker dynamics requires a rigorous mathematical framework, the foundation of which we establish here. We start with a 'microscopic', individual-based model where individuals have two heritable characteristics: (i) an adaptive trait that influences their intrinsic demographic rates and ecological interactions, and (ii) a genetic marker that has no demographic or ecological effects and hence is selectively neutral. This work focuses entirely on asexual populations and short genomic regions that remain perfectly linked to the loci under selection, neglecting recombination. The population is described by a measure in which each individual is represented by a Dirac mass located at its pair of characteristics. This leads us to study the eco-evolutionary dynamics of the population as a measure-valued stochastic process.
The dynamics are driven by competition between individuals, asexual reproduction without or with mutation, and death. Variation in population size and selection as the trait evolves are mediated by the demographic effects of change in the trait. These effects are expected to influence the generation and maintenance of neutral variation.
The effect of mutation on the marker can be continuous or discrete. Our framework thus encompasses a variety of conventional mutation models such as the two-alleles model, the stepwise mutation model, and the continuous state mutation model. Our distinctive assumption here is that the marker mutation process is much faster than the trait mutation process but much slower than the ecological time-scale of birth and death events. This is supported by the fact that most mutations are neutral or nearly neutral (such as mutations involved in microsatellite variation). Therefore, there are three time scales in the model: the fast ecological time scale of birth and death events, the slow time scale of trait mutation, and an intermediate time scale of marker mutation. We study the joint process of trait and marker dynamics on the trait mutation time scale.
We are interested in limit theorems when the population carrying capacity goes to infinity. Then, the population size stabilizes in a neighborhood of the ecological equilibrium and jumps to another equilibrium when a successful trait mutant goes to fixation in the population. This is the TSS dynamics of the adaptive trait. It does not depend on the marker and was proved rigorously by Champagnat [4]. The novelty in the model and in the proofs comes from the difference in time scales between marker and trait mutations. The study of the marker distribution during the invasion period requires careful consideration of the individual process and of the different scales involved. In a first period, starting with the single invading mutant, we prove that the marker distribution remains close to a Dirac mass at the value of the initial mutant. Until the next jump of the TSS, the marker evolves as a stochastic distribution-valued process. In the case where the marker mutation effects are continuous and small, this is a Fleming-Viot process whose drift and covariance depend on the resident adaptive trait. In every case, for any marker mutation model, the collated dynamics define a measure-valued diffusive process with jumps that we call Substitution Fleming-Viot Process (SFVP). The convergence of the microscopic process to the SFVP is shown both in the sense of finite dimensional distributions and in the sense of occupation measure, thus improving previous results of Champagnat [4].
From a biological standpoint, we recover the conventional hitchhiking phenomenon: when a new mutant trait appears and sweeps through the population to fixation, the marker carried by the mutant individual is hitchhiked, and the marker distribution undergoes a genetical bottleneck. The mathematical construction of the SFVP process has new implications of biological relevance. Neutral diversity is restored after each adaptive jump, but as the adaptive trait evolves, population size, the mutation rate, genetic drift and demographic fluctuations change, which causes the rate of neutral polymorphism build-up and the moments of the marker distribution to change too. This suggests that the nature and structure of the whole eco-evolutionary feedback loop (i.e. how adaptive traits influence demographic rates and ecological interactions, and how ecological processes shape selection pressures on adaptive traits) may be important to explain the extreme disparities in genetic neutral diversity observed among species, even closely related ones and in the absence of differences in recombination profiles (Cutter and Payseur [11]). In fact, it is well-known that demographic differences due to external causes (demographic bottleneck or population expansion due to environmental changes) can affect neutral diversity of a population and that closely related species can show very different neutral diversity patterns. Here, we show that internal causes of demographic variation involved in adaptation can also affect species differently.
The article is organized as follows. In Section 2, we start with the model description. The stochastic individual-based process and its key assumptions are carefully described and examples are provided. A key parameter is K, an integer that gives the order of the population size and is used to rescale the mutation rates and kernels. By letting K go to infinity we study the large population limit of the stochastic process. The main theorem is stated and discussed in Section 3, where biological implications are also highlighted. Time scale separations implied by the dependence on K of the trait and marker mutations lead to homogenization phenomena and then to the SFVP. Our mathematical analysis provides a precise description of the genetical bottleneck that occurs at each trait substitution. We show that the marker of the initial mutant individual dominates in the marker distribution of the mutant population until this population reaches a neighborhood of the new ecological equilibrium. Then, we present two numerical examples based on an ecological model adapted from Dieckmann and Doebeli [14]. In the first example, marker mutation is described by a continuous state model that leads to a piecewise Fleming-Viot process (Section 3.3) for the marker. In the second example, marker mutation follows a discrete two-allele model; then the classical Wright-Fisher diffusion (3.5) is recovered. Further generalizations are discussed. The proof of the main theorem in the adaptive dynamics scaling is in Section 4. After having introduced a semi-martingale decomposition of our stochastic measure-valued process, we start by recalling and refining the result of Champagnat [4] for the convergence of trait-marginals. For this purpose, we introduce the M1-topology on the Skorokhod space where the TSS lives, using some ideas of Collet et al. [10].
This allows us to obtain the convergence to the TSS for the topology of occupation measures, hence providing additional pathwise information that complements the results of [4]. The second part of the proof focuses on the marker distribution in an invading mutant population. This gives the result on the genetical bottleneck. Then, between two trait substitutions, the dynamics of the marker converges to a diffusive measure-valued process. As a conclusion of the proof, we show the convergence to the SFVP for the topology of occupation measures.

The stochastic model
We consider an asexual population driven by births and deaths where each individual is characterized by hereditary types: a phenotypic trait under selection and a neutral marker. The trait and marker spaces $\mathcal{X}$ and $\mathcal{U}$ are assumed to be compact subsets of $\mathbb{R}$. The type of individual $i$ is thus a pair $(x_i, u_i)$, $x_i \in \mathcal{X}$ being the trait value and $u_i \in \mathcal{U}$ its neutral marker. The individual-based microscopic model from which we start is a stochastic birth and death process with density-dependence whose demographic parameters are functions of the trait under selection and are independent of the marker. We assume that the population size scales with an integer parameter $K$ tending to infinity while individuals are weighted with $1/K$. At any time $t \ge 0$, we have a finite number $N^K_t$ of individuals, each of them holding trait and marker values in $\mathcal{X} \times \mathcal{U}$. Let us denote by $((x_1, u_1), \dots, (x_{N^K_t}, u_{N^K_t}))$ the trait and marker values of these individuals. The state of the population at time $t \ge 0$, rescaled by $K$, is described by the point measure
\[ \nu^K_t = \frac{1}{K} \sum_{i=1}^{N^K_t} \delta_{(x_i, u_i)}, \]
where $\delta_{(x,u)}$ is the Dirac measure at $(x, u)$. This measure belongs to the set of finite point measures on $\mathcal{X} \times \mathcal{U}$ with atoms of mass $1/K$. This set is a subset of the set $M_F(\mathcal{X} \times \mathcal{U})$ of finite measures on $\mathcal{X} \times \mathcal{U}$, which is endowed with the topology of weak convergence. We denote by $\langle \nu, f \rangle$ the integral of the measurable function $f$ with respect to the measure $\nu$. For any $t \ge 0$, we also introduce the trait marginal of the measure $\nu^K_t$ on $\mathcal{X}$, denoted by $X^K_t$ and defined by
\[ X^K_t(dx) = \nu^K_t(dx, \mathcal{U}) = \frac{1}{K} \sum_{i=1}^{N^K_t} \delta_{x_i}(dx). \]
Therefore, the population measure $\nu^K_t$ writes
\[ \nu^K_t(dx, du) = X^K_t(dx)\, \pi^K_t(x, du), \]
where $\pi^K_t(x, du)$ is the marker distribution for a given trait value $x$ defined by
\[ \pi^K_t(x, du) = \frac{\sum_{i=1}^{N^K_t} 1_{\{x_i = x\}}\, \delta_{u_i}(du)}{\sum_{i=1}^{N^K_t} 1_{\{x_i = x\}}}. \]
Our purpose is to study the asymptotic behavior of the measure-valued process $\nu^K$ at large times, when the trait and marker are inherited but mutations occur. The main interest of our model is that these mutations happen at different time scales for trait and marker, both longer than the time scale of an individual lifetime.
The trait mutates much slower than the marker and drives the evolution time scale. Thus, the limiting behavior results from the interplay of three time scales: births and deaths, trait mutations and marker mutations.
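To make the measure notation concrete, here is a minimal Python sketch of how a finite population could be turned into the rescaled point measure, its trait marginal, and the conditional marker distribution. It assumes the usual definitions (each individual carries weight 1/K, the trait marginal sums the weights per trait value, and the marker distribution is the empirical law of markers among individuals sharing a trait). The function names are hypothetical and chosen only for this illustration.

```python
from collections import Counter


def integrate(pop, K, f):
    """<nu, f> for nu = (1/K) * sum of Dirac masses at the pairs in pop."""
    return sum(f(x, u) for x, u in pop) / K


def trait_marginal(pop, K):
    """Trait marginal X: mass carried by each trait value (weights 1/K)."""
    counts = Counter(x for x, _ in pop)
    return {x: c / K for x, c in counts.items()}


def marker_distribution(pop, x):
    """Empirical marker law pi(x, .) among individuals with trait x."""
    markers = [u for (y, u) in pop if y == x]
    counts = Counter(markers)
    return {u: c / len(markers) for u, c in counts.items()}
```

With these helpers, the decomposition of the population measure into trait marginal times marker distribution can be checked numerically: integrating a test function against the full measure gives the same value as summing, over trait values, the marginal mass times the average of the function under the corresponding marker distribution.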
We now describe the life history of individuals. The trait influences the ability of individuals to survive (including under competition with others) and to reproduce, but the marker is neutral. The demographic parameters are thus functions of the trait only and are defined on X .
Assumption 2.1
• An individual with trait $x$ and marker $u$ reproduces with birth rate $0 \le b(x) \le \bar b$, the function $b$ being continuous.
• Reproduction produces a single offspring which usually inherits the trait and marker of its ancestor except when a mutation occurs. Mutations on trait and marker occur independently with probabilities $p_K$ and $q_K$ respectively. Mutations are rare and the marker mutates much more often than the trait. We assume that
\[ q_K = p_K\, r_K, \qquad p_K = \frac{1}{K^2}, \qquad \lim_{K \to +\infty} r_K = +\infty, \qquad \lim_{K \to +\infty} q_K = 0. \tag{2.4} \]
• When a trait mutation occurs, the new trait of the descendant is $x + k \in \mathcal{X}$ with $k$ chosen according to the probability measure $m(x, k)dk$.
• When a marker mutation occurs, the new marker of the descendant is $u + h \in \mathcal{U}$ with $h$ chosen according to the probability measure $G_K(u, dh)$. For any $u \in \mathcal{U}$, $G_K(u, .)$ is approximated as follows when $K$ tends to infinity:
\[ \lim_{K \to +\infty} \sup_{u \in \mathcal{U}} \Big| \frac{r_K}{K} \int \big(\phi(u + h) - \phi(u)\big)\, G_K(u, dh) - A\phi(u) \Big| = 0, \tag{2.5} \]
where $(A, D(A))$ is the generator of a Feller semigroup and $\phi \in D(A) \subseteq C_b(\mathcal{U}, \mathbb{R})$, the set of continuous bounded real functions on $\mathcal{U}$.
• An individual with trait $x$ and marker $u$ dies with intrinsic death rate $0 \le d(x) \le \bar d$, the function $d$ being continuous. Moreover the individual experiences competition, the effect of which is an additional death rate
\[ \eta(x)\, C * \nu^K_t(x) = \frac{\eta(x)}{K} \sum_{i=1}^{N^K_t} C(x - x_i). \]
The quantity $C(x - x_i)$ describes the competition pressure exerted by an individual with trait $x_i$ on an individual with trait $x$. We assume that the functions $C$ and $\eta$ are continuous and that there exists $\underline{\eta} > 0$ such that
\[ \forall x, y \in \mathcal{X}, \quad \eta(x)\, C(x - y) \ge \underline{\eta} > 0. \tag{2.6} \]
A classical choice of competition function is $C \equiv 1$, which is called the "mean field case" or "logistic case". In that case the competition death rate is $\eta(x) N^K_t / K$.
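The life history described in Assumption 2.1 can be sketched as an event-driven (Gillespie-type) simulation. The following is a minimal illustration, not the authors' simulation code: the function name `ibm_event` and the samplers `draw_k` and `draw_h` (standing for draws from the trait and marker mutation kernels) are hypothetical, and the caller is assumed to supply kernels that keep mutants inside the trait and marker spaces.

```python
import random


def ibm_event(pop, K, b, d, eta, C, p_K, q_K, draw_k, draw_h, rng):
    """Simulate one birth or death event in place; return the waiting time.

    pop is a list of (trait, marker) pairs; each individual has weight 1/K.
    b, d, eta are functions of the trait; C is the competition kernel.
    draw_k(x, rng) and draw_h(u, rng) are hypothetical mutation samplers.
    """
    n = len(pop)
    if n == 0:
        return float("inf")  # extinct population: no further event
    birth = [b(x) for x, _ in pop]
    # Death rate: intrinsic death plus competition (eta(x)/K) * sum_i C(x - x_i).
    death = [d(x) + eta(x) * sum(C(x - y) for y, _ in pop) / K for x, _ in pop]
    total = sum(birth) + sum(death)
    dt = rng.expovariate(total)  # exponential waiting time to the next event
    # Pick the event proportionally to its rate: indices 0..n-1 are births.
    j = rng.choices(range(2 * n), weights=birth + death)[0]
    if j < n:
        x, u = pop[j]
        if rng.random() < p_K:  # trait mutation at birth
            x = x + draw_k(x, rng)
        if rng.random() < q_K:  # marker mutation at birth, independent of trait
            u = u + draw_h(u, rng)
        pop.append((x, u))
    else:
        pop.pop(j - n)  # death of individual j - n
    return dt
```

Each call costs of order n squared operations because the competition term is recomputed for every individual; this keeps the sketch short but is not meant for large K. Note that the marker enters only through inheritance and mutation, never through the rates, which is exactly the neutrality assumption.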

Remark 2.2
Let us emphasize the generality of Assumption (2.5), which allows a large set of possible dynamics.
• Choosing for example
This choice can be seen as a continuous state generalization of the stepwise mutation model [38].
• If in addition the distribution $G_K$ has a non-zero mean $\mu_K$ such that $r_K \mu_K / K \to \mu > 0$, corresponding to a mutational directional drift, then the operator $A$ will be defined by
• If we relax the compactness of $\mathcal{U}$ and assume that $\mathcal{U} = \mathbb{R}$, a third choice consists in taking for $G_K$ the law of a Pareto variable with index $\alpha \in (1, 2)$ divided by $K^{\eta/\alpha}$, for $\eta \in (0, 1]$. Then it has been proved in Jourdain et al. [26] that
is the fractional Laplacian with index $\alpha$. Thus if we take $r_K$ such that $r_K / K^{1+\eta}$ converges as $K$ tends to infinity, and choose $A = D^\alpha$ in (2.5), Assumptions (2.4)-(2.5) will be satisfied as soon as $\eta < 1$.
• Another very interesting case is the discrete case when $\mathcal{U} = \{a, A\}$ is a set of two alleles. The mutation kernel is given by
In this case, (2.5) implies that $r_K / K$ has a limit when $K \to +\infty$. Let $\bar r$ be this limit, then
We see that the ratio between the two mutation probabilities $r_K = q_K / p_K$ that allows convergence is highly dependent on the mutation distribution.
Note that since the demographic rates do not depend on the marker, the dynamics of the population distribution of the trait is independent of the marker distribution. But the dynamics of the marker distribution cannot be separated from the trait distribution as we shall see.
The process (ν K t , t ≥ 0) is a càdlàg M F (X × U)-valued Markov process. Existence and uniqueness in law of the process can be adapted from [21,5] under the assumption that

Moreover, Assumption (2.6) allows us to prove, as in Champagnat [4, Lemma 1], that if for
which will be useful to study the tightness and convergence of the sequence.

Convergence to the Substitution Fleming-Viot Process
The adaptive trait mutation time scale is the slowest, equal to $1/(K p_K) = K$ by Assumption (2.4). It sets the evolutionary time scale, so we shall consider the limiting behavior of $(\nu^K_{Kt}, t \ge 0)$. We will see in Section 4.2.1 that $p_K$ of order $1/K^2$ is the only choice which leads to non-trivial, non-degenerate marker dynamics.
Before stating our main result, we introduce several important ingredients which are used to describe the limit of (ν K Kt , t ≥ 0) when K → +∞. We conclude the section with extensions and simulations.

Invasion fitness function
The large population behavior of the process $(\nu^K_t, t \ge 0)$ as $K$ tends to infinity can be studied by classical arguments and is given in the appendix. At the ecological time scale (of order 1), no mutation occurs in the limit $K \to +\infty$. If the initial population has a single adaptive trait $x$, then, in the limit $K \to +\infty$, the trait distribution remains $\delta_x$ since $p_K$ and $q_K$ vanish in the limit. The rescaled population size process $(N^K_t / K, t \ge 0)$ converges to the solution $(n_t, t \ge 0)$ of the ordinary differential equation
\[ \frac{dn_t}{dt} = n_t \big( b(x) - d(x) - \eta(x)\, C(0)\, n_t \big), \]
which converges when $t$ tends to infinity to the equilibrium
\[ \hat n_x = \frac{b(x) - d(x)}{\eta(x)\, C(0)}. \]
In contrast, at the adaptive trait-mutation time scale $Kt$, new mutant traits can invade. If they replace the previous traits, then the corresponding event is called "fixation". The probability of fixation of a mutant trait $y$ in a resident population with trait $x$ at equilibrium depends on the invasion fitness function $f(y; x)$:
\[ f(y; x) = b(y) - d(y) - \eta(y)\, C(y - x)\, \hat n_x. \]
This fitness function describes the initial growth rate of the mutant population. It does not depend on the neutral marker.
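The quantities of this subsection are easy to evaluate numerically. The sketch below assumes the standard forms for a monomorphic resident: equilibrium size (b(x) - d(x)) / (eta(x) C(0)), invasion fitness b(y) - d(y) - eta(y) C(y - x) times that equilibrium, and success probability of a single mutant equal to the positive part of the fitness divided by b(y), in line with the mutant birth and death rates used later in the text. The Gaussian birth rate and competition kernel are hypothetical choices made only for illustration.

```python
import math


def n_hat(x, b, d, eta, C):
    """Equilibrium size of a monomorphic resident population with trait x."""
    return (b(x) - d(x)) / (eta(x) * C(0.0))


def invasion_fitness(y, x, b, d, eta, C):
    """Initial growth rate f(y; x) of a rare mutant y in a resident x."""
    return b(y) - d(y) - eta(y) * C(y - x) * n_hat(x, b, d, eta, C)


def fixation_probability(y, x, b, d, eta, C):
    """Positive part of f(y; x) divided by b(y): survival probability of the
    mutant lineage (equal to fixation under 'invasion implies fixation')."""
    return max(invasion_fitness(y, x, b, d, eta, C), 0.0) / b(y)


# Hypothetical ecological functions, for illustration only.
def birth_rate(x):
    return math.exp(-x * x / 2.0)


def death_rate(x):
    return 0.0


def competition_intensity(x):
    return 1.0


def competition_kernel(z):
    return math.exp(-z * z / 2.0)
```

With these choices a mutant identical to the resident has zero fitness, a mutant closer to the birth optimum x = 0 has positive fitness, and a mutant further away has negative fitness and thus zero success probability.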
For simplicity we work under the assumption of 'invasion implies fixation', but this assumption will be relaxed in Section 3.5. When a mutant trait appears, either its line of descent replaces the resident population or it disappears. As a consequence, two traits cannot coexist in the long term.

Assumption 3.1 ("Invasion implies fixation")
For all x ∈ X and for almost every y ∈ X ,

Remark 3.2
In the case of logistic populations with $C \equiv 1$, this assumption is satisfied as soon as $x \mapsto \hat n_x$ is strictly monotone.

Main theorem
Let us first give the definition of the Fleming-Viot process which will appear in our setting (see e.g. Dawson and Hochberg [13], Dawson [12], Donnelly and Kurtz [16], Etheridge [19]). We recall that the operator A has been introduced in (2.5).
In the sequel, we denote by P(U) and P(X ×U) the probability measure spaces respectively on U and on X × U.

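Definition 3.3 Let us fix $x \in \mathcal{X}$ and $u \in \mathcal{U}$. The Fleming-Viot process $(F^u_t(x, .), t \ge 0)$ indexed by $x$, started at time 0 with initial condition $\delta_u$ and associated with the mutation operator $A$ is the $\mathcal{P}(\mathcal{U})$-valued process whose law is characterized as the unique solution of the following martingale problem. For any $\phi \in D(A)$,
\[ M^x_t(\phi) = \langle F^u_t(x, .), \phi \rangle - \phi(u) - b(x) \int_0^t \langle F^u_s(x, .), A\phi \rangle\, ds \tag{3.4} \]
is a continuous square integrable martingale with quadratic variation process
\[ \langle M^x(\phi) \rangle_t = \frac{2 b(x)}{\hat n_x} \int_0^t \Big( \langle F^u_s(x, .), \phi^2 \rangle - \langle F^u_s(x, .), \phi \rangle^2 \Big)\, ds. \tag{3.5} \]
Let us now state our main theorem, which describes the slow-fast dynamics of adaptive traits and neutral markers at the (trait) evolutionary time scale.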
Theorem 3.4 We work under Assumptions 2.1 and 3.1. The initial conditions are
The convergence holds in the sense of finite dimensional distributions on $M_F(\mathcal{X} \times \mathcal{U})$.
In addition, the convergence also holds in the sense of occupation measures, i.e. the measure $\nu^K_{Kt}(dy, dv)\, dt$ on $\mathcal{X} \times \mathcal{U} \times [0, T]$ converges weakly to the measure $\hat n_{Y_t}\, \delta_{Y_t}(dy)\, F^{U_t}_t(Y_t, dv)\, dt$ for any $T > 0$.
We observe that the Substitution Fleming-Viot Process includes the three qualitative behaviors due to the three different time scales: deterministic equilibrium for the transitory size of the population (driven by the ecological birth and death events), transitory diffusive behavior for the marker distribution (driven by marker mutation), jump process for the trait distribution (driven by adaptive trait mutation).
Remark 3.6 Equations (3.4)-(3.5) have important biological implications regarding neutral genetic diversity. Once the fixation of a favorable mutation has occurred and the population is monomorphic for the selected trait, the evolution of the neutral marker distribution is described by a Fleming-Viot process whose law is given by the martingale in (3.4). The bracket of the martingale in (3.5) shows that the stochastic fluctuations in time of the marker distribution are due to randomness in births, deaths and mutations. The multiplicative factor $2b(x)/\hat n_x$ in (3.5) depends on the trait value $x$ and on the assumed ecological model, which determines the relationships between $x$, the death and birth rates and the competition kernel. Notice that $2b(x)/\hat n_x$ corresponds to the ratio of the variance (here $2b(x)$) to the effective size $N_e$ (here $\hat n_x$) that appears in the usual Wright-Fisher equation. The quantity $\hat n_x$ corresponds to the mass of the population when there is an infinite number of small individuals; if the size of the population is of order $K$, it means that there are approximately $\hat n_x K$ individuals of weight $1/K$. The right-hand term in (3.4) (i.e. the drift term in a mathematical sense) involves the generator $A$ and is associated with the mutation model as seen in Assumption (2.5). The generator $A$ describes the speed at which the neutral diversity is restored. For instance in a continuous state model, if $A\phi = \frac{\sigma^2}{2} \phi''$, we recover the heat equation, whose solutions have a variance proportional to $t$. In a discrete state model similar to (2.8), this equation gives the growth of the support. In short, (3.4)-(3.5) show that the distribution of the neutral marker depends on ecological processes and their parameters: every change in $x$ will result in changes in the distribution of the neutral marker, through changes in birth, death and mutation rates, in competition and in equilibrium population size.
This result is biologically relevant and important since it differs from the assumptions of classical genetic hitchhiking models, in which selection and population size remain constant, so that the restoration of neutral diversity does not depend on the trait substitution and its history. In the examples below, we give more detailed results on how the distribution of the neutral marker changes.
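As a worked illustration of this remark, the following sketch computes how fast neutral variance is restored after a sweep in the continuous-state case $A\phi = \frac{\sigma^2}{2}\phi''$. It assumes that the martingale problem has drift $b(x)\int_0^t \langle F^u_s(x,.), A\phi\rangle\, ds$ and a bracket with multiplicative factor $2b(x)/\hat n_x$, that boundary effects of the compact marker space can be ignored, and that $u \mapsto u$ and $u \mapsto u^2$ may be used as test functions. The symbols $m_t$ and $v_t$ (mean and variance of the marker distribution) are introduced here only for the illustration.

```latex
\begin{align*}
m_t &= \langle F^u_t(x,.), \mathrm{id} \rangle, \qquad
v_t = \langle F^u_t(x,.), \mathrm{id}^2 \rangle - m_t^2, \qquad v_0 = 0, \\
\phi(u) = u: \quad & A\phi = 0 \;\Rightarrow\; m_t \text{ is a martingale with }
  \frac{d}{dt}\,\mathbb{E}[m_t^2] = \frac{2b(x)}{\hat n_x}\,\mathbb{E}[v_t], \\
\phi(u) = u^2: \quad & A\phi = \sigma^2 \;\Rightarrow\;
  \frac{d}{dt}\,\mathbb{E}\,\langle F^u_t(x,.), \mathrm{id}^2 \rangle = b(x)\,\sigma^2, \\
\frac{d}{dt}\,\mathbb{E}[v_t] &= b(x)\,\sigma^2 - \frac{2b(x)}{\hat n_x}\,\mathbb{E}[v_t], \\
\mathbb{E}[v_t] &= \frac{\sigma^2\,\hat n_x}{2}\,\Big(1 - e^{-2b(x)\,t/\hat n_x}\Big).
\end{align*}
```

Under these assumptions, the expected within-population marker variance starts from 0 at the bottleneck, grows initially like $b(x)\sigma^2 t$, and saturates at $\sigma^2 \hat n_x / 2$ on a time scale of order $\hat n_x / (2b(x))$. Both the restoration speed and the plateau therefore depend on the resident trait $x$ through $b(x)$ and $\hat n_x$.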
The proof of Theorem 3.4 is the subject of Section 4. The trait dynamics in the limit of Theorem 3.4 is the Trait Substitution Sequence obtained in Champagnat [4, Theorem 1], whose assumptions are satisfied. Our main contribution in Theorem 3.4 is to prove that at the adaptive trait mutation time scale, a homogenization phenomenon takes place. There is a deterministic limit for the fastest process (the births and deaths leading to $\hat n_x$), and stochastic limits for the two slower processes. The limiting process $(V_t, t \ge 0)$ is a measure-valued process with jumps (corresponding to trait mutations) and diffusion (corresponding to marker dynamics). If the population is trait-monomorphic with trait $x$, the jump rate is
When a jump occurs at $t$, the process jumps from $(x, u)$ to $(x + k, v)$ where $k$ is chosen in $m(x, k)dk$ and $v$ is chosen at time $t$ in the marker distribution $F^u_t(x, dv)$. The marker distribution is the second fastest-evolving component, but marker mutations are assumed small (2.5), which allows us to recover a non-degenerate Fleming-Viot superprocess parameterized by the trait of the population but with jumps. Between the jumps, this superprocess is the pathwise limit of the marker dynamics where traits are fixed. The jumps are hitchhiking events due to the trait mutations (see in another context Etheridge, Pfaffelhuber and Wakolbinger [20]). There is a bottleneck at each successful invasion-fixation of mutant traits. Indeed, the individuals present at the fixation time are all descendants of the successful initial mutant. The trait and marker of the latter alone determine the state of the new mutant population, hence creating the bottleneck for the whole population genealogy. This result is biologically intuitive since we assume that the neutral marker and the trait are completely linked, but the mathematical proof of these phenomena is the hardest part of the proof of Theorem 3.4, and we will show that our results still have biological interest.
Extending this model to the case of recombination is a challenging problem for future work (see [45] in this direction). It is also worth noting that, contrary to other extensions of the TSS (e.g. the TSS with age-structure of [34] or the Polymorphic Evolution Sequence for a multi-resource chemostat in [6]) that usually jump from one equilibrium to another, the marker distribution is here described by a stochastic process and not by an equilibrium measure. This is due to the fact that the time scales of the trait and marker mutations are assumed different: on the time scale of marker mutations, trait mutations are too rare to be seen.
An illustration of the invasion and fixation phenomena is summed up in Fig. 3.1.

An example from Dieckmann and Doebeli
Let us first illustrate our model by simulations based on an example inspired by Roughgarden [41] and Dieckmann and Doebeli [14]. Here
Let $v$ be the marker of the mutant individual. As in Champagnat et al. [5], the fluctuations of the resident population can be neglected in first approximation and the mutant population evolves as a birth and death process with rates $b(x + k)$ and $d(x + k) + \eta(x + k)\, C(k)\, \hat n_x$, independent of the marker distribution. When the mutant population reaches a sufficient size $\varepsilon$ at time $t_2$, with probability $[f(x + k; x)]_+ / b(x + k)$, the 'invasion implies fixation' assumption leads to the replacement of the former population in a time $t_K$ such that $t_K / \log(K) \to \infty$. This time interval is too short to allow other marker mutants to appear in non-negligible proportion, with large probability. Thus, when the mutant population has fixed, at time $\sigma_1$, it is close to $\hat n_{x+k}\, \delta_{(x+k, v)}$. Before the next adaptive trait mutation occurs, the marker mutates many times, since marker mutations happen on a faster scale. The dynamics of the marker distribution is then that of a Fleming-Viot superprocess started at $\delta_v$ and with statistics depending on $x + k$.
Here, the 'optimal trait' is x = 0 where the birth rate has its maximum and the population is governed by local competition. We start with the initial condition:
The simulations (see Fig. 3.2) illustrate Theorem 3.4. They show the replacement of a resident population by a mutant population. In Fig. 3.2 (a), the dynamics of the support of the marker distribution is represented. The mutant and resident populations are pictured together and separately to better observe the extinction of the resident population (black) and the expansion of the mutant population from one individual (light). The invasion, which starts around time 3175, is quick, and after time 3250 the mutant population has totally replaced the resident one. In Fig. 3.2 (b), the histograms of traits and markers at three times during the invasion are represented simultaneously, to underline the hitchhiking effect of the marker during the 'invasion implies fixation' phase. We can see that the distribution of the marker during the fixation remains close to a Dirac mass at the marker value of the initial mutant, which is indicated by the red line. When the mutant trait appears, the resident population is quickly invaded by the mutant population during a transition period. In (b), we can see that although the support of the marker distribution for the resident population remains wide (see also (a)), the size of the resident population decreases quickly. In the second column of (b), we see that the marker distribution in the mutant population remains spiked at the marker value of the first mutant individual during the whole transition period. After invasion (see (a)), the spread of the marker distribution follows the Fleming-Viot process (3.4)-(3.5). In (a), we see that for the Fleming-Viot process, the support of the marker distribution spreads slowly.
This illustrates the bottleneck phenomenon, the existence of which we prove rigorously (Equation
the trait $x$ approaches the evolutionarily stable strategy (ESS, see [32]) and $b(x)$ increases, since the equilibrium size is greater and the diffusion coefficient is lower. The drift term is $b(x) \frac{\sigma^2}{2} \int_0^t \langle F^u_s(x, .), \phi'' \rangle\, ds$ and thus the multiplicative factor $b(x)$ increases when approaching the ESS, contrary to the multiplicative factor of the bracket (3.5). In the case $d \equiv 0$, the Fleming-Viot process has a constant diffusion coefficient and the bracket (3.5) does not depend on $x$. The Fleming-Viot process then depends on the trait $x$ only through the drift term. Notice that this is true for any mutation model satisfying (2.5). This simple result illustrates how the ecological processes can shape the neutral diversity.

Corollary: the Wright-Fisher Evolutionary Process
There exists a version of the SFVP in the case when the marker space U is discrete. Assume for instance that there exist only two alleles of the marker, denoted by a and A, so that U = {a, A}. In this case, we apply Theorem 3.4 with the mutation kernel $G_K$ defined in (2.7) and $r_K/K \to \bar r > 0$ when K → +∞.

Proposition 3.7
We work under Assumptions 2.1 and 3.1, with probabilities $q_A$ and $q_a$ to mutate from marker A to marker a and from marker a to marker A, respectively. Moreover, we consider initial conditions $\nu^K_0$ similar to those of Theorem 3.4. Then, the population process [...] where $(Y_t, t \ge 0)$ is the TSS process that jumps from x to x + k in X with the jump measure $b(x)\,\hat n_x\, m(x,k)\,dk$ and where $(W^a_t, t \ge 0)$ is the following Wright-Fisher jump process, which represents the proportion of alleles a in the population of trait $Y_t$ at time t. Between jumps, it satisfies the usual Wright-Fisher equation with mutations [...], $(B_t, t \ge 0)$ being a standard Brownian motion. It jumps with the TSS: at a jump time t, the process $(W^a_t, 1 - W^a_t)$ goes to (1, 0) with probability $W^a_t$ and to (0, 1) with probability $1 - W^a_t$.
An illustration of this proposition is given in Fig. 3.3.
This result can be generalized to discrete marker spaces $U = \{a_1, \dots, a_m\}$, by introducing the transition probabilities $q_{ij}$ to mutate from $a_i$ to $a_j$, $i, j \in \{1, \dots, m\}$. An application is when the marker corresponds to a genetic sequence of n nucleotides (A, T, G or C at each position). In this case, $m = \mathrm{Card}\, U = 4^n$.
Traditionally, in a population genetics framework, the evolution of the diversity at a neutral marker in a finite population is described as a diffusion process with two fixed parameters: the population size and the mutation "rate" (e.g. Crow and Kimura 1970). The population size is related to what is called "genetic drift", which generally refers to the random sampling of gametes for reproduction at the beginning of each generation: the larger the population, the weaker the genetic drift. Under this framework, genetic drift induces stochastic fluctuations in the frequencies of the alleles A and a and can decrease the neutral genetic diversity when an allele is randomly lost. On the other hand, mutation continuously reintroduces the alleles A and a in the population and thus allows the restoration and the maintenance of neutral genetic diversity. It is important to note that under the population genetics framework, mutation rates and population size are fixed and depend neither on the ecological processes and their parameters, nor on the trait value when the population is monomorphic for the trait under selection. As a consequence, those parameters do not change as successive selective sweeps occur, in particular during the adaptation process. Here we use Equation (3.8) to compare the classical population genetics results on the distribution of neutral diversity with those of our model. In an eco-evolutionary framework, (3.8) first shows that mutation rates and population size, i.e. the genetic drift, are not fixed and depend on the ecological processes and on the trait value x. The mutation rates are $\bar r\, b(Y_t)\, q_A$ and $\bar r\, b(Y_t)\, q_a$ in our framework, while they are only $q_A$ and $q_a$ under a population genetics framework (e.g. Crow and Kimura 1970). The strength of genetic drift, governed by the equilibrium population size, is given by $1/\hat n_{Y_t}$, while it is a constant 1/n in the population genetics framework.
Second, (3.8) shows that extra ecological processes affect the distribution of the neutral marker, since in the left-hand side there is the term $2b(Y_t)$. This term can be interpreted as the effect of demographic stochasticity, which is not taken into account in population genetics.
A trait mutant appears around time 18290, invades and fixes in the population. Before the appearance of this mutant trait, fluctuations in the marker distribution are due to (fast) marker mutation and to stochastic birth and death events. At the time when the mutant trait appears, the A-allele frequency is 85%, giving a high probability that the A allele is hitchhiked. This is what happens in the simulation. After the fixation time (around time 18490), the a-allele population is extinct. It is regenerated by marker mutations but goes extinct three times before taking off around time 19600.
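The Wright-Fisher jump process of Proposition 3.7 can be explored numerically. The following is a minimal Euler-Maruyama sketch, not the scheme used for Fig. 3.3. It assumes that, between trait jumps, (3.8) has the form suggested by the discussion above, namely drift $\bar r\, b(x)\,(q_A(1-w) - q_a w)$ and diffusion coefficient $\sqrt{2b(x)\,w(1-w)/\hat n_x}$. The functions `b` and `n_hat`, all numerical values, and the prescribed jump time are hypothetical choices made for illustration only.

```python
import math
import random


def b(x):
    """Hypothetical birth rate, maximal at the 'optimal trait' x = 0."""
    return 4.0 - x * x


def n_hat(x):
    """Hypothetical equilibrium density (case d = 0, constant competition)."""
    return 2.0 * b(x)


def simulate_wf_jump(w0, x0, r_bar, q_A, q_a, jumps, t_end, dt=1e-3, seed=0):
    """Euler-Maruyama sketch of the proportion w of allele a.

    Assumed dynamics between trait jumps (our reading of (3.8)):
        dw = r_bar*b(x)*(q_A*(1-w) - q_a*w) dt
             + sqrt(2*b(x)/n_hat(x) * w*(1-w)) dB.
    `jumps` is a time-sorted list of (time, new_trait). At each trait jump,
    w is reset to 1 with probability w and to 0 otherwise (hitchhiking).
    """
    rng = random.Random(seed)
    pending = list(jumps)
    t, x, w = 0.0, x0, w0
    path = [(t, x, w)]
    while t < t_end:
        if pending and pending[0][0] <= t:
            _, x = pending.pop(0)
            w = 1.0 if rng.random() < w else 0.0
        drift = r_bar * b(x) * (q_A * (1.0 - w) - q_a * w)
        diff = math.sqrt(max(2.0 * b(x) / n_hat(x) * w * (1.0 - w), 0.0))
        w += drift * dt + diff * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        w = min(1.0, max(0.0, w))  # crude clipping at the boundaries
        t += dt
        path.append((t, x, w))
    return path


# One trait substitution at time 1.0, from trait 1.0 to trait 0.5.
path = simulate_wf_jump(w0=0.85, x0=1.0, r_bar=1.0, q_A=0.01, q_a=0.01,
                        jumps=[(1.0, 0.5)], t_end=2.0, seed=1)
```

Right after the jump, the recorded proportion is close to 0 or 1, and diversity is then rebuilt only through the mutation drift, which is the qualitative behaviour described for the simulation above.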

Extensions to co-existing traits
The work of Champagnat and Méléard [7] generalizes the TSS to the case of coexisting trait values, when Assumption 3.1 is relaxed. They define a polymorphic TSS, called the polymorphic evolutionary sequence (PES) and denoted by $(X_t)_{t \ge 0} \in D(\mathbb{R}_+, M_F(\mathcal{X}))$. When a mutant trait y appears in a resident population of trait $x_0$ at time $t_1$, either its descendant line is killed, with probability $1 - [f(y; x_0)/b(y)]_+$, or it survives. In the latter case, y and $x_0$ can coexist when the Lotka-Volterra system defined in (A.4) admits a positive, globally stable, non-trivial equilibrium $(n^*_{x_0,y}, n^*_{y,x_0})$. Therefore the population jumps from $X_{t_1^-} = \hat n_{x_0}\delta_{x_0}$ to [...] For a probability π, a trait measure X and $x \in \mathcal{X}$, let us denote by $F_t(\pi, x, X, du)$ the Fleming-Viot process started at π, evolving in the trait distribution X and parameterized by x.
Let $\pi_0$ be the initial marker distribution of the monomorphic population of trait $x_0$. Before the time $t_1$ of appearance of the first mutant, the marker distribution evolves as $(F_t(\pi_0, x_0, \hat n_{x_0}\delta_{x_0}, du))_{t \ge 0}$. Let $\pi_{t_1} = F_{t_1}(\pi_0, x_0, \hat n_{x_0}\delta_{x_0}, du)$ be the marker distribution at $t_1$ and let $V_1$ be a random variable drawn from the distribution $\pi_{t_1}$. After $t_1$ and before the occurrence of the second trait mutation at $t_2$, the population evolves as [...] The processes $F_t(\pi_{t_1}, x_0, X_{t_1}, du)$ and $F_t(\delta_{V_1}, y, X_{t_1}, du)$ are generalizations of the Fleming-Viot process defined in Definition 3.3. Indeed, their semimartingale decompositions are respectively [...] where $M_1(\varphi)$ and $M_2(\varphi)$ are independent square integrable martingales such that [...] At time $t_2$, when a third trait appears in the population, the system can evolve to three, two or just one coexisting traits, depending on the new equilibrium of the Lotka-Volterra equations that is reached. For each of the traits, the marker distribution evolves as a generalization of the Fleming-Viot processes above.
Remark 3.8 The above equations show that, when two traits coexist in the population, the markers in the subpopulations defined by the two traits are driven by independent martingales, but with parameters depending on both coexisting traits. Thus, after a diversification event, the neutral diversity in one of the two subpopulations does not evolve as if the other subpopulation were absent, as is usually assumed. The parameters of the underlying Fleming-Viot process depend on the complete trait distribution.
We present in Figure 3

Proof of Theorem 3.4
Let us sketch the proof. In this section, we suppose that Assumptions 2.1 and 3.1 are satisfied and that the initial conditions are $\nu^K_0(dy, dv) = n^K_0\, \delta_{(x_0, u_0)}(dy, dv)$ with $\lim_{K \to \infty} n^K_0 = \hat n_{x_0}$ and $\sup_{K \in \mathbb{N}^*} E((n^K_0)^3) < +\infty$. First, we recall results due to Champagnat et al. [5] that provide the convergence of the finite-dimensional marginals of the trait process $(X^K_{Kt}; t \ge 0)$. We extend these results to obtain the weak convergence of the measures $(X^K_{Kt}(dx)\,dt; K \ge 0)$ in $M_F(\mathcal{X} \times [0, T])$ endowed with the weak convergence topology. This corresponds to the convergence of $(X^K_{Kt}; t \ge 0)$ in the sense of occupation measures, as developed by Kurtz [29]. Secondly, we include the fast component (the marker) and prove the tightness of the sequence $(\nu^K_{Kt}(dx, du)\,dt; K \ge 0)$ in $M_F(\mathcal{X} \times \mathcal{U} \times [0, T])$. We then consider a subsequence, again denoted by $(\nu^K_{Kt}(dx, du)\,dt; K \ge 0)$ with an abuse of notation, that converges to a limit $\Gamma(dt, dx, du) \in M_F([0, T] \times \mathcal{X} \times \mathcal{U})$ that we have to identify. This identification is done in several steps. When a successful mutant appears in the monomorphic population with trait x, the transition period to fixation has to be considered carefully. It has been proved in [4] that these transitions have a duration of order log(K). We prove that during this time interval, the marker distribution in the mutant subpopulation remains a Dirac mass at the marker value of the initial mutant. This results from the combined effects of small or rare marker mutations, a large population and the slow take-off of the new mutant population. Then, we show that in a trait-monomorphic population with trait value x, the marker distribution converges to a Fleming-Viot superprocess parameterized by x.

Semimartingale decomposition of ν K
Let us introduce some notation to keep the forthcoming formulas simple. For $\nu \in M_F(\mathcal{X} \times \mathcal{U})$ and $\varphi(x, u) \in C(\mathcal{X} \times \mathcal{U}, \mathbb{R})$, we define the (nonlinear) generators $B^K$ and $D^K(\nu)$ such that [...] The process $\langle \nu^K_\cdot, \varphi \rangle$ is a square integrable semimartingale and we give its characteristics.
Proposition 4.1 For a continuous bounded function $\varphi(x, u)$ on $\mathcal{X} \times \mathcal{U}$, the process [...] is a square integrable martingale with predictable quadratic variation [...]

Convergence of the trait-marginal in the trait mutation time scale
As previously emphasized, the trait dynamics is described by the measure-valued process X K which does not depend on the markers. This process has been fully studied in [4,5].
In this section, we recall the convergence result for finite-dimensional marginals obtained in these papers, and we give some additional properties concerning the topology involved. This result shows a time scale separation with successive fixations of successful mutants, under Assumptions 2.1 and 3.1. Notice that the time scale assumption is [...], which is realized in our case for $p_K = 1/K^2$.
Theorem 4.2 Under Assumptions 2.1 and 3.1, let us also assume that the initial population is trait-monomorphic: $X^K_0 = n^K_0 \delta_x$ for $x \in \mathcal{X}$, with $n^K_0 \to \hat n_x$ in probability and $\sup_{K \in \mathbb{N}^*} E((n^K_0)^3) < +\infty$. Then, the sequence $(X^K_{Kt}; t \ge 0)$ converges to the pure-jump, singleton-measure-valued Markov process $(\hat n_{Y_t} \delta_{Y_t}; t \ge 0)$ defined as follows: $Y_0 = x$, and the process Y jumps from [...] The convergence holds in the sense of finite-dimensional distributions on $M_F(\mathcal{X})$ equipped with the topology of total variation.
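For orientation, the classical time scale assumption of Champagnat [4] can be written as below. This is a hedged restatement, assuming that $p_K$ plays the role of the trait mutation probability of [4]; the exact form used in the present paper may differ.

```latex
% Time scale separation (form of Champagnat [4]; p_K assumed to be the
% trait mutation probability):
\forall V > 0, \qquad \log K \;\ll\; \frac{1}{K p_K} \;\ll\; e^{VK}
\qquad (K \to +\infty).
% Check for p_K = 1/K^2:  1/(K p_K) = K,  and indeed
% \log K \ll K \ll e^{VK} for every V > 0.
```

The lower bound says that mutations are rare enough for a sweep, of duration of order log K, to be completed before the next mutant appears. The upper bound keeps mutations frequent enough to occur before the resident population drifts away from its equilibrium.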
This theorem has been proved in Champagnat [4] for the logistic case and generalized in [5].
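The limiting jump process of Theorem 4.2 is easy to simulate by thinning. The sketch below assumes the classical TSS mechanism: mutants appear at rate $b(x)\hat n_x$ and a mutant y fixes with probability $[f(y; x)]_+/b(y)$, which is the survival probability quoted for the PES above. The model functions `b`, `n_hat`, `fitness` and the Gaussian mutation step are hypothetical and are not the ones used in the paper's simulations.

```python
import random


def b(x):
    """Hypothetical birth rate on (-2, 2), maximal at x = 0."""
    return 4.0 - x * x


def n_hat(x):
    """Hypothetical equilibrium density: d = 0, constant competition 1."""
    return b(x)


def fitness(y, x):
    """Hypothetical invasion fitness f(y; x) = b(y) - n_hat(x)."""
    return b(y) - n_hat(x)


def simulate_tss(x0, mut_step, t_end, seed=0):
    """Sketch of a trait substitution sequence obtained by thinning.

    Candidate mutants appear at rate b(x)*n_hat(x); a candidate
    y = x + mut_step(rng) replaces the resident x with probability
    max(fitness(y, x), 0) / b(y), otherwise it is discarded.
    """
    rng = random.Random(seed)
    t, x = 0.0, x0
    times, traits = [t], [x]
    while True:
        t += rng.expovariate(b(x) * n_hat(x))  # waiting time to next mutant
        if t > t_end:
            return times, traits
        y = x + mut_step(rng)
        if b(y) <= 0.0:
            continue  # mutant outside the viable trait range
        if rng.random() < max(fitness(y, x), 0.0) / b(y):
            x = y
            times.append(t)
            traits.append(x)


times, traits = simulate_tss(1.5, lambda rng: rng.gauss(0.0, 0.1), 50.0, seed=3)
```

With these hypothetical choices, $f(y; x) = x^2 - y^2$, so every accepted jump moves the trait closer to the optimum x = 0, and the substitution rate changes along the way through $b(x)\hat n_x$.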
The trait-marginal process $(X^K_{Kt}; t \in [0, T])$ does not converge in $D([0, T], M_F(\mathcal{X}))$ endowed with the Skorokhod topology. Indeed, the size of the jumps is bounded above by 1/K while the limiting total mass process has jumps, which prevents trajectorial tightness (at least in the $J_1$-topology). Following the idea of Kurtz [29], as developed in Méléard and Tran [35] and Gupta et al. [23], a weaker topology consists in forgetting the process point of view and considering the measure $X^K_{Kt}(dx)\,dt$ in $M_F([0, T] \times \mathcal{X})$ endowed with the topology of weak convergence. This convergence in the sense of occupation measures strengthens the result of Theorem 4.2, but in a topology weaker than the Skorokhod topology. To achieve this, as in Collet et al. [10], we first introduce the $M_1$-topology on $D([0, T], \mathbb{R}_+)$. It is weaker than the usual $J_1$-topology and allows monotone processes with jumps tending to 0 to converge to processes with jumps (see Skorokhod [44]). For a càdlàg function h on [0, T], the continuity modulus for the $M_1$-topology is given by [...] Note that if the function h is monotone, then $w_\delta(h) = 0$.
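To make the role of the $M_1$ modulus concrete, the sketch below evaluates it on a sampled path. It assumes Skorokhod's standard definition, $w_\delta(h) = \sup d(h(t), [h(t_1), h(t_2)])$ over $t_1 \le t \le t_2$ with $t_2 - t_1 \le \delta$, where $d(v, [a, c])$ is the distance from v to the segment between a and c. Definition (4.6) is taken here to be of this form, which is an assumption. The function names are ours, and the supremum is taken over grid points only.

```python
def dist_to_segment(v, a, c):
    """Distance from the real number v to the segment with endpoints a, c."""
    lo, hi = min(a, c), max(a, c)
    if v < lo:
        return lo - v
    if v > hi:
        return v - hi
    return 0.0


def m1_modulus(times, values, delta):
    """M1 oscillation of a sampled path, restricted to the sampling grid.

    times must be increasing; values[i] is the path value at times[i].
    """
    best = 0.0
    n = len(times)
    for i in range(n):                      # index of t1
        for k in range(i, n):               # index of t2
            if times[k] - times[i] > delta:
                break
            for j in range(i, k + 1):       # index of t in [t1, t2]
                d = dist_to_segment(values[j], values[i], values[k])
                best = max(best, d)
    return best


# A monotone path with a jump: the M1 modulus vanishes.
step = m1_modulus([0, 1, 2, 3, 4, 5], [0.0, 0.0, 0.0, 1.0, 1.0, 1.0], 2)
# A spike (up then down): the M1 modulus sees the excursion.
spike = m1_modulus([0, 1, 2], [0.0, 1.0, 0.0], 2)
```

Here `step` equals 0.0 and `spike` equals 1.0: a monotone approach to a jump costs nothing in the $M_1$ modulus, whereas a non-monotone excursion does. This is the property exploited in the tightness argument that follows.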
converges in law in the sense of the Skorokhod $M_1$-topology to the process [...]
Proof Assume that g is non-decreasing. From Theorem 4.2, the finite-dimensional distributions of $(R^K_t, t \in [0, T])$ converge to those of $(\hat n_{Y_t}\, g(Y_t), t \in [0, T])$. By [44, Theorem 3.2.1], it remains to prove that for all η > 0, [...] where $w_\delta$ has been defined in (4.6). The mutation rate in $(R^K_t, t \in [0, T])$ being bounded, the probability that two mutations occur within a time less than δ is o(δ). It is therefore enough to study the case where there is at most one mutation in the time interval [0, δ]. Following Champagnat [4], the path of $R^K$ can be decomposed into several subpaths, each of them being close to a large population deterministic measure-valued function ξ (see Proposition A.1 in the appendix) with a probability tending to 1. Away from invading mutations and for a trait-monomorphic population with trait x, $\langle \xi_{Kt}, g \rangle = g(x)\, n_{Kt}(x)$, where $n_\cdot(x)$ is the solution of the logistic equation (3.1). We can easily check that $t \mapsto n_t(x)$ converges monotonically to its stable equilibrium $\hat n_x$, so that $\langle \xi_{Kt}, g \rangle$ is monotone and the modulus of continuity tends to 0. Around an invading mutant y, $\langle \xi_{Kt}, g \rangle$ is close to $n_{Kt}(x)\, g(x) + n_{Kt}(y)\, g(y)$, where $(n_t(x), n_t(y))$ is the solution of the Lotka-Volterra system (A.4) with an initial condition close to $(\hat n_x, 0)$. The mutant y invades if the fitness function f(y; x) is positive (and f(x; y) is negative). From Assumption 3.1, an easy study of the Lotka-Volterra system (see for example the appendix in Champagnat [3], Figure (b) p.187) shows that either $n_t(x)$ and $n_t(y)$ are both increasing, or $\dot n_t(x) < 0$ and $\dot n_t(y) > 0$. In that case we can write $\frac{d}{dt}\big(n_t(x)\, g(x) + n_t(y)\, g(y)\big) = (g(y) - g(x))\, \dot n_t(y) + g(x)\,(\dot n_t(y) - \dot n_t(x)) \ge 0$ since g is monotone. Therefore the function $\langle \xi_{Kt}, g \rangle$ is increasing for K large enough and the same conclusion holds.
Proof It is enough to prove the convergence in law of $h(t)\, e^{-qx}\, X^K_{Kt}(dx)\,dt$ to $h(t)\, e^{-qx}\, \hat n_{Y_t}\, \delta_{Y_t}(dx)\,dt$ for a measurable bounded function h and $q \in \mathbb{Q}$. In [44], it is proved that if $x_K$ converges to x in $D([0, T], \mathbb{R})$ endowed with the $M_1$-topology, then for t outside a countable set, $x_K(t)$ converges to x(t). It then follows from Lebesgue's theorem that $\int_0^T H(t, x_K(t))\,dt$ converges to $\int_0^T H(t, x(t))\,dt$, as soon as H is bounded and continuous. We apply this result to the process $\big(\int_{\mathcal{X}} e^{-qx}\, X^K_{Kt}(dx), t \ge 0\big)$ and the function $H_M(t, y) = h(t)\,(y \wedge M)$, for any M > 0. Estimate (2.9) (with p = 1) allows us to conclude.

Marker distribution in a new adaptive trait mutant population
In this section, we study the transition of the marker distribution when a new mutant adaptive trait appears in a monomorphic population with trait x 0 . We consider this phenomenon at the ecological time scale and we prove that the fixation of the mutant trait creates a genetical bottleneck.
Let K be fixed. Initially we have a trait-monomorphic population with trait $x_0$ and a marker distribution $\pi^K(x_0, du)$. Then an individual $(x_0, v)$ from this population gives birth to an individual with mutant trait y and marker v (v has been chosen according to $\pi^K(x_0, du)$). We consider the process $(\nu^K_t; t \ge 0)$ started at [...]
Proposition 4.5 Under Assumptions 2.1 and 3.1, let us consider a mutant (y, v) appearing in a monomorphic population with trait $x_0$ and marker distribution $\pi^K_0(x_0, du)$. Let us assume that $f(y; x_0) > 0$, where the fitness function has been defined in (3.3). There exists ε > 0 such that for any sequence $(t_K; K \in \mathbb{N}^*)$ with $\lim_{K \to +\infty} t_K/\log K = +\infty$ and $\lim_{K \to +\infty} t_K/K = 0$ (for example $t_K = (\log K)^2$), we have [...] Further, for the marker distribution, we can prove that [...]
Equation (4.8) tells us that when the mutant trait survives in the resident population of trait $x_0$, then by the time $t_K$ that it needs to reach a non-negligible size, its marker distribution is still a Dirac mass at v. Additional comments are given after the proof.
where $M^{K,g}$ is a square integrable martingale with predictable quadratic variation: [...] (4.10) The third term in the right-hand side of (4.9) is of order $t_K/K$. Indeed, thanks to (2.4) and (2.5), it is bounded above by [...] Similarly, the second term of (4.10) tends to 0 as $t_K/K$. The first term needs more attention. As soon as the mass $\langle \nu^K_s, \mathbb{1}_y \rangle$ of the mutant population is of order 1, the variance of $M^{K,g}_{t_K}$ is of order $t_K/K$, which tends to zero when K → +∞. However, since we start from one individual, we have to split the time interval $[0, t_K]$ into two parts. Let us introduce a sequence $(s_K)$ such that $s_K \le t_K$ for any K and [...] Notice that $s_K$ can be equal to $t_K$. Using Assumption 3.1, we can prove as in [4, Lemma 3] that there exists $\varepsilon_0 > 0$ such that [...] It follows immediately that [...] Before time $s_K$, the population size with trait y is not large enough and $\frac{1}{K \langle \nu^K_s, \mathbb{1}_y \rangle}$ can only be bounded above by 1. Therefore we have to control the expectation of the variance of g under $\pi^K_s$. The expected number of marker mutations up to time s along a lineage is $s\, q_K$ and the variance of such a mutation is bounded by $\|g\|_\infty^2 = \sup\{g(h)^2, h \in \mathcal{U}\}$. Then [...] and [...] This concludes the proof.
Remark 4.6 For $q_K = 1/\sqrt{K}$, let us notice that the rate of appearance of mutant markers in a population of size K is of order $K q_K = \sqrt{K}$, which does not tend to zero. This means that many mutant markers appear in the population of trait y during the time interval of length $t_K$ following the appearance of the first mutant (y, v). However, heuristically, since in a tree the mass is concentrated around the leaves, the mutants do not appear with the same probability along the time interval, and mutations are mostly observed after the time $s_K$, when the mutant population (y, v) is already large. Moreover, using that the marker mutation step and/or the marker mutation frequency is small, we obtain that the mutant markers remain in negligible proportion between $s_K$ and $t_K$.
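The orders of magnitude in Proposition 4.5 and Remark 4.6 can be checked numerically. The short sketch below tabulates, for the example sequence $t_K = (\log K)^2$ and the scaling $q_K = 1/\sqrt{K}$ of the remark, the three quantities whose limits matter; the function name and the values of K are arbitrary choices.

```python
import math


def time_scales(K):
    """Return (t_K / log K, t_K / K, K * q_K) for t_K = (log K)^2, q_K = K^(-1/2)."""
    log_K = math.log(K)
    t_K = log_K ** 2              # example sequence of Proposition 4.5
    q_K = 1.0 / math.sqrt(K)      # marker mutation scaling of Remark 4.6
    return t_K / log_K, t_K / K, K * q_K


for K in (10**2, 10**4, 10**6, 10**8):
    ratio_log, ratio_K, marker_rate = time_scales(K)
    print(f"K={K:>9}  t_K/logK={ratio_log:6.2f}  "
          f"t_K/K={ratio_K:.2e}  K*q_K={marker_rate:8.1f}")
```

The first column grows without bound and the second tends to 0, as required of $t_K$, while the population-wide marker mutation rate $K q_K$ grows like $\sqrt{K}$. The bottleneck therefore cannot be explained by an absence of marker mutations, which is the point of the remark.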

Convergence of the marker distribution process in a trait-monomorphic population
For $K \in \mathbb{N}^*$, we introduce, as in [4], the following sequences of stopping times $\tau^K_k$ and $\theta^K_k$: [...] The times $\tau^K_k$ are the times of appearance of the successive mutant traits in the population and the times $\theta^K_k$ are the times at which the population returns to a monomorphic state. These times are possibly infinite, if the corresponding sets are empty. It has been proved in [4] that for $t_K$ such that $\lim_{K \to +\infty} t_K/\log(K) = +\infty$ and $\lim_{K \to +\infty} t_K/K = 0$, [...]
Proposition 4.7 Take the process $(\nu^K_{Kt}; t \in [0, T])$ started with the monomorphic initial condition $\nu^K_0(dx, du) = n^K_0\, \delta_{(x_0, u_0)}(dx, du)$, where $\lim_{K \to +\infty} n^K_0 = \hat n_{x_0} > 0$ and $\sup_{K \in \mathbb{N}^*} E((n^K_0)^3) < +\infty$. (i) In the trait-mutation time scale, the time of first mutation converges in distribution as follows: [...] where $\tau_1$ is an exponential time with parameter $b(x_0)\, \hat n_{x_0}$.
(ii) Let us consider the processes (π K K(t∧τ K 1 ) ; t ∈ [0, T ]) stopped at the time of first mutation. When K → +∞, this sequence converges in distribution in D([0, T ], P(U)) to the Fleming-Viot process F u 0 (x 0 , du) (see Definition 3.3) and stopped at the independent exponential time τ 1 .
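The restoration of marker diversity described in (ii) can be visualised with a Moran-type particle system, a standard way of approximating Fleming-Viot processes. This is not the individual-based model of the paper. The sketch assumes that the limiting process has mutation generator $b(x_0)\frac{\sigma^2}{2}\Delta$ and a resampling bracket with factor $2b(x_0)/\hat n_{x_0}$, which is our reading of (3.4)-(3.5). Under that assumption each ordered pair of particles resamples at rate $b(x_0)/\hat n_{x_0}$. All parameter values are hypothetical.

```python
import math
import random


def moran_fleming_viot(u0, n_particles, b_x, n_hat_x, sigma, t_end, seed=0):
    """Moran particle sketch of a Fleming-Viot process started at a Dirac mass.

    Between resampling events every particle performs a Brownian motion with
    variance b_x * sigma**2 per unit time (marker mutation). Each ordered pair
    (i, j) resamples at rate b_x / n_hat_x: particle j copies particle i.
    Returns the list of marker values at time t_end.
    """
    rng = random.Random(seed)
    markers = [u0] * n_particles
    total_rate = (b_x / n_hat_x) * n_particles * (n_particles - 1)
    t = 0.0
    while True:
        wait = rng.expovariate(total_rate)
        last = wait >= t_end - t
        step = (t_end - t) if last else wait
        sd = sigma * math.sqrt(b_x * step)
        markers = [u + rng.gauss(0.0, sd) for u in markers]
        if last:
            return markers
        t += step
        i, j = rng.sample(range(n_particles), 2)
        markers[j] = markers[i]


# Diversity regenerated from a single marker value u0 = 0 (post-sweep state).
markers = moran_fleming_viot(u0=0.0, n_particles=30, b_x=1.0, n_hat_x=2.0,
                             sigma=0.1, t_end=0.5, seed=1)
```

Varying `b_x` and `n_hat_x` in this sketch shows the two opposing effects discussed earlier: a larger birth rate speeds up the spread of markers through mutation, while a smaller equilibrium size strengthens resampling and keeps the marker distribution concentrated.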
Notice that no mutation can be seen at this scale. To see the trait mutations, we have to consider the process at the mutation scale Kt (cf. [5]).
It can be seen that the conditional distributions of the marker, given the trait x 1 or x 2 remain constant.
Proof The convergence in large population of (ν K ; K ∈ N * ) is a consequence of Proposition A.1.