The Origin of Species by Means of Mathematical Modelling

Darwin described biological species as groups of morphologically similar individuals. These groups of individuals can split into several subgroups due to natural selection, resulting in the emergence of new species. Some species can stay stable without the appearance of a new species, some others can disappear or evolve. Some of these evolutionary patterns were described in our previous works independently of each other. In this work we have developed a single model which allows us to reproduce the principal patterns in Darwin’s diagram. Some more complex evolutionary patterns are also observed. The relation between Darwin’s definition of species, stated above, and Mayr’s definition of species (group of individuals that can reproduce) is also discussed.


3 Evolution of Species and Darwin's Diagram
Emergence and evolution of biological species continue to attract attention from both the biological point of view and from the modelling point of view. Modern theory distinguishes three main mechanisms of speciation. Allopatric speciation implies the existence of geographic or genetic isolation where there is no gene exchange between different taxa; parapatric speciation admits partial exchange; and sympatric speciation occurs without geographic or genetic barriers (Coyne and Orr 2004;Gavrilets 2004). It is generally accepted that allopatric speciation is biologically realistic, though the mechanism which leads to the appearance of isolating barriers may sometimes be unclear [Coyne and Orr 2004 (Chapter 2)]. Sympatric speciation continues to instigate intensive discussions. In spite of the big body of experimental data, observations in nature, and theoretical models, this question remains controversial [ (Coyne and Orr 2004 (Chapter 4)]. Various models of sympatric speciation are developed in Desvillettes et al. (2008), Dieckmann and Doebeli (1999), Doebeli and Dieckmann (2000), Genieys et al. (2008), Genieys et al. (2006a, b) and Méléard (2011).
Competition for resources and selection of the most successful in this competition are among the fundamental properties of biological systems. Once the elements of such a system begin to compete, they try to get as many resources as possible. For this purpose they consume resources not only at the spatial location, where they are located, but also in some areas around it. This results in nonlocal consumption of resources and intraspecific competition. Nonlocal consumption of resources, reproduction, and variation of functional traits (diffusion) determine the process of speciation (Genieys et al. 2006a).
In this work we will address the question about the emergence and evolution of biological species from the point of view of Darwin's diagram. It is the only figure featured in his famous book "The origin of species" (Darwin 2004), and it is the quintessence of his theory, which corresponds, in the modern terminology, to sympatric speciation. We will attempt to give a more precise mathematical interpretation of his theory and to point out the assumptions which are necessary in order to reproduce Darwin's diagram. He described this figure as follows. "Let A to L represent the species of a genus large in its own country; these species are supposed to resemble each other in unequal degrees, as is so generally the case in nature, and is represented in the diagram by the letters standing at unequal distance ... The little fan of diverging dotted lines of unequal length proceeding from (A), may represent its varying offspring. The variations are supposed to be extremely slight, but of the most diversified nature, they are not supposed all to appear simultaneously, but often after long intervals of time; not are they all supposed to endure for equal periods. Only those variations which are in some way profitable will be preserved or naturally selected. And here the importance of principle of benefit being derived from divergence of character comes in; for this will generally lead to the most different or divergent variations (represented by the outer dotted lines) being preserved and accumulated by natural selection. ... After a thousand generations, species (A) is supposed to have produced two 1 3 The Origin of Species by Means of Mathematical Modelling fairly well-marked varieties, namely a 1 and m 1 . These two varieties will generally continue to be exposed to the same conditions which made their parents variable, and the tendency to variability is in itself hereditary, consequently they will tend to vary, and generally to vary in nearly the same manner as their parents varied. Moreover, these two varieties, being only slightly modified forms, will tend to inherit those advantages which made their common parent (A) more numerous than most of the other inhabitants of the same country" (Darwin 2004, p. 104).
Thus, each line corresponds to one species or subspecies. The horizontal axis measures their resemblance; that is, some morphological characteristics or phenotype. The vertical axis is time. Different species are represented as lines (trajectories) in the (x, t)-space.
The appearance of new species is shown as several lines starting from the same point. In the beginning, these are small variations inside the same species. After some time they can become different species. Among many possible variations, only favorable ones are chosen and reinforced by natural selection. The others will disappear. One more important feature of the diagram is that the lines (species) can disappear when two of them approach each other. We can suppose that this happens because the corresponding species begin to compete for the same resources.
We will model the main elements of Darwin's diagram: existence of species as localized groups of individuals, their evolution (motion in the morphological space), emergence of new species, preservation of the most fitted species and disappearance of others due to natural selection, and competition of species for resources. We will introduce the model in the next section and will analyze it in Sect. 3. We will then discuss the results in Sect. 4.

Model
We consider the diagram in Fig. 1 (left) from the bottom upwards (in the direction of growing time). It begins with 11 lines noted by the letters A, B, ..., L. We will mainly consider the left part of the diagram (A, B, ..., F), though some other parts will also be mentioned when it is appropriate. Following Darwin's definition, we understand biological species as groups of morphologically similar individuals. A typical population density distribution for one species is shown in Fig. 1 (right). The variable x here is a morphological characteristic or a functional trait of the population (height, weight, beak size, and so on) and the function u(x, t) describes the density distribution with respect to this characteristic at some given moment of time t. The species are characterized by narrow distributions, such that they can be described by the coordinate x m of the maximum of the function u(x, t). This coordinate x m (t) , considered as a function of time, represents a line on the (x, t) plane. Darwin's diagram corresponds to the ensemble of such lines for different species.
Thus, our aim is to develop a model such that a) it has solutions in the form of pulses (or peaks), which describe population density distributions, and b) the coordinates of the maxima of these pulses on the (x, t)-plane form the patterns qualitatively similar to those in the diagram.
In Bessonov et al. (2014) we reproduced some parts of the diagram independently of each other. In this work we will develop a new model which will allow us to describe these different parts together and to reproduce the left half of the diagram. We consider 6 species in accordance with the letters A, B, ..., F and introduce the corresponding population densities u i (x, t) , i = 1, 2, ..., 6 . In order to describe the desired behavior of solutions, we consider the integro-differential system of equations: Here, the space variable x corresponds to some morphological characteristics of the populations, t is time, and d i , i = 1, ..., 6 are the diffusion coefficients, a i and are some positive constants. The first terms in the right-hand side of these equations describe small variations of the morphology due to random mutations, the second terms describe the reproduction of the population and the last terms species' mortality. Let us note that the reproduction terms are proportional to the square of the population density (Supplementary materials) and to available resources W i , considered in the following general form: Here, b ij are non-negative numbers. The coefficients s ij ≥ 0 depend, in general, on the space variable x, though we will begin below with the simpler case of constant coefficients. The kernel (x) of the integral determines the efficacy of this consumption as a function of the distance x − y . We take it as a piece-wise constant function, In the limit of small N, we get the case of local consumption. In the case of global consumption, the consumption of resources is proportional to the total population density (integrals I 0 i ) (Bessonov et al. 2014). In conventional models of population dynamics, available resources are considered in the form (1 − u i (x, t)∕K) , where K is carrying capacity. Consumption of resources is proportional to the population density u i (x, t) taken at the space point x.
In the case of nonlocal consumption of resources, it is replaced by the integral J i (u) (Genieys et al. 2006a). This implies that the individuals of the population consume resources in some areas around their average location. If there are several species, then all of them (or only some of them) can contribute to the consumption of some particular resources.
The model considered in this work contains two resources: one with nonlocal consumption and another with global consumption. We will see below that both are necessary in order to describe the patterns shown in the diagram. It is a generic model of the interaction of 6 species. It generalizes the previously considered models with local and nonlocal consumption of resources (see Supplementary materials).

Modelling of the Diagram
Darwin's diagram illustrates different evolution patterns, such as speciation, survival of the better adapted, competition of species, among others. In the model presented above, we set b 11 = b 1 , b 1j = 0, j = 2, ..., 6 , b ij = b 2 , i = 2, ..., 6, j = 1, ..., 6 , s 11 = s 1 , s ii = s 2 , i = 2, ..., 6 , s ij = 0 for i ≠ j . These assumptions on the parameters signify that the first equation for the variable u 1 (x, t) is independent of the other equations but it does influence the other variables. Other values of parameters will be specified below.
Speciation The main issue of Darwin's theory is the emergence of new species. This is shown in the diagram on many occasions: points A, a 1 , a 2 , ..., m 1 , m 2 , ..., I, z 1 , z 2 , ... For each point, there are several lines starting from a single point. These lines do not correspond to fully formed species, but only to small variations in the existing species, which can lead to new species in the future.
Among the species A, B, ..., F, only the first one manifests speciation and leads to the appearance of new species. The corresponding function u 1 (x, t) satisfies the equation (2) In order to simplify the model and the analysis of the results, we chose this equation to be independent of other variables. Clearly, similar results can be obtained if the dependence on other variables is weak enough. As discussed above, we characterized solutions of this equation by the positions of their maxima at the (x, t)-plane. They correspond to the lines (species) in Darwin's diagram. A typical branching pattern obtained in numerical simulations of Eq.
(2) is shown in Fig. 2 (the left part of the graph). The initial condition for the first species, u 1 (x, 0) , is a piece-wise constant function different from zero in a narrow interval at the center of the computation domain. This initial population splits into two sub-populations (Fig. 2, right), then again and again, expanding in space as a periodic travelling wave and leaving stationary localized peaks behind it. These peaks correspond to the new species.
Speciation can be described by the model with nonlocal consumption of resources (see Supplementary materials) and by the model with two resources: nonlocal and global. The question about the possibility of sympatric speciation was intensively debated in the literature, and various mathematical models were developed (Atamas 1996;Desvillettes et al. 2008;Dieckmann and Doebeli 1999;Doebeli and Dieckmann 2000;Genieys et al. 2006a, b).
Survival of the better adapted We observe in the diagram that among several new sub-species only one or two survive, while the others disappear after a relatively short time. The model of speciation with nonlocal consumption does not describe this effect: once a species appears, it persists. This contradicts Darwin's description that only better adapted species survive. Usually, the outer lines (species) persist, while the species located inside become extinct (cf. Sect. 1).
If we accept the assumption that some species (or sub-species) are better adapted than the others, then their natality and mortality rates depend on the phenotype, that is on the space variable x. Hence the coefficients of the equations depend on the space variable. This approach was suggested in Bessonov et al. (2014). We will use it here in a more complete model. We illustrate modelling with space-dependent coefficients for another part of the diagram starting with z 6 Bessonov et al. (2014). Figure 3 shows an example of numerical simulations of equation (2) with a space dependent function s 1 (x) . The initial condition is a narrow pulse located at the center of the interval. Due to speciation, it gives six pulses. The outer two of these pulses reach the points where the function s 1 (x) decreases (step-wise constant function in the same figure). As a consequence, these two outer pulses begin to grow while the others disappear. A similar behavior repeats several times at each step of the function s 1 (x) . We obtain a specific V-pattern, which is one of the main repeating patterns in the diagram. Similar patterns start at the points A, a 3 , a 5 , m 4 , z 4 , z 6 .
Let us emphasize the total size of the population is limited due to global consumption. Therefore, if one of the species grows in size, then the others will decline. This is the mechanism which ensures development of the most adapted to the prejudice of the others. If we consider for example only nonlocal consumption, where the size of the population is not limited, then this mechanism does not work.
The absence of speciation Species B, E, ..., F persist without speciation. Such behavior is observed in the case of global consumption (see Supplementary materials). Hence, the behavior of the variables u 2 , ..., u 6 is primarily determined by global consumption while nonlocal consumption takes into account competition with other species (the next paragraph).
Competition of species Another repeating pattern in the diagram is extinction of one of the species when two of them approach each other. Among them, species B, C and D disappear when they meet the line m descendant from A. It also happens for G and u and in several other cases. Thus, we need to describe the situation where two species coexist if they consume different resources and one of them disappears when they begin to compete for the same resources. Since we associate species to the coordinates of the maxima of the corresponding density distributions, two species do not compete if these coordinates are different, and they begin to compete when they are close to each other.
Thus we need to describe (a) the displacement of species in the morphological space, (b) their competition when two species (coordinates of the maxima) are close and the absence of competition when they are far from each other in this space, (c) the disappearance of some of the species due to this competition. Clearly, these properties are not described by the conventional model of competition of species which does not take into account their location in the space of phenotypes.
Let us begin with the case where the coefficients s i are constant. The first component u 1 of the solution expands in space and begins to influence other components of the solution (Fig. 2). First, it meets the second component u 2 , which is bounded from the other side by the third species and is rapidly going to extinction. Similarly, the third, fourth and fifth species disappear one after another when pushed from one side by the expanding first species and from the other side by the remaining species. The last, sixth species persists during the time interval shown in the diagram.
The introduction of space dependent coefficients allows us to take into account the survival of the better adapted discussed above. Fig. 4 (upper left) shows the initial conditions and the function s 1 (x) . The results of the numerical simulation are shown in Fig. 4 (upper right). Time-dependent coefficients In order to finish modelling Darwin's diagram, we need to introduce time dependent coefficients. Let us explain why it is necessary. The pulse moves in such a way that it tends to improve living conditions of the population: decrease the mortality rate, increase natality and the total size of the population. These conditions determine the direction of its motion, and it cannot move in the opposite direction because it would be unfavorable for the population. These observations are in agreement with the biological meaning of the considered models, and they are confirmed by all obtained results.
Consider the line from the point A to a 5 of the diagram which moves to the left. However, the line f, which starts at the point a 5 , turns to the opposite direction. If the parameters of the model remain the same, this should not happen because either the line a or the line f would deteriorate the life conditions of the corresponding species. Similar behavior is shown in other parts of the diagram ( m 4 , z 4 , z 6 ). Hence we need to conclude that at the moment when the curve a reaches the point a 5 , the coefficients of the equation change. Biologically, this means that the new species possess new properties which are reflected through the corresponding changes in the coefficients of the model. Figure 4 shows the results of numerical simulations of system (1). The initial functions u i (x, 0) and the initial function s 1 (x) are shown in the upper left image. The result of this first stage of the simulation is shown in the upper right image together with the function s 1 (x) , which is used in the next stage. Further results and the function s 1 (x) are shown in the other images. The final result is shown in Fig. 5 (left). It qualitatively repeats the main features of this part of the diagram (Fig. 5, right).

Discussion
Darwin's diagram represents the essence of his theory, and it contains several different evolutionary patterns. First of all, the emergence of new species, the question largely discussed in the modelling literature, and also several other models  Darwin's diagram (right) and the numerical simulations (left) show its main patterns: coexistence of several species, V-pattern, extinction of species due to competition, time dependent coefficients allowing the species f to return back in the space of phenotypes including survival of the most adapted, existence of species without speciation, and competition of species. Even the question of existence of species, which might seem obvious, appears to be quite involved as far as mathematical models are concerned. All these questions stimulated the development of the new models that were brought together in this work. We discuss some of these below as well as some further possible developments of these models.
Existence of species Biological species correspond to localized population density distributions in the form of pulses. Reaction-diffusion equation (1) (Supplementary materials) with a bistable nonlinearity has such stationary solutions. However, they are unstable. They become stable in the case of global consumption of resources (3). The bistable case corresponds to sexual reproduction where the reproduction rate is proportional to the square of the population density u 2 . The existence of such solutions is also proved for the system of two equations with different densities of males and females (Volpert et al. 2015). In the monostable case, where the reproduction rate is proportional to the first power of u, single stationary pulses do not exist. However, there are periodic travelling waves which can leave multiple pulses after propagation. Thus, the existence of biological species, considered as particular types of solutions, imposes certain conditions on the model and on the parameters.
Emergence of species and survival of most adapted One of the main issues of Darwin's theory concerns sympatric speciation. In our previous work (Genieys et al. 2006a) we have developed a minimal reaction-diffusion model that describes this process. It includes nonlocal consumption of resources, which leads to intraspecific competition, reproduction, and small random mutations. In the case of sexual reproduction, the phenotypes, or the functional traits, related to consumption of resources of males and females are supposed to be the same. Their difference can influence the emergence of species (see the next paragraph).
The minimal model of the emergence of species is not sufficient to describe some properties of this process stressed by Darwin; namely, survival of the most adapted subspecies and disappearance of the others. In order to describe this effect, we introduce a model with two resources: nonlocal and global. It is interesting that, similar to Darwin's description, we observed that "the most different or divergent variations" survive while the others go to extinction.
Darwin versus Mayr Darwin considered biological species as groups of morphologically similar individuals. According to Mayr, these are groups of individuals that can reproduce among themselves and have fertile offspring. We will briefly discuss here how these two definitions are related to each other from the point of view of the modelling presented in this work.
The reproduction terms in Eq.
(1) are proportional to the second power of the population density. This term appears in the case of sexual reproduction where the reproduction rate depends on male and female densities which are supposed to be equal to each other (see Supplementary materials). Therefore, it is implicitly assumed that males and females having offspring possess the same functional trait that is responsible for consumption of resources.
Suppose now that males and females having common offspring can possess different phenotypes. Then, instead of u 2 , we have the nonlocal term 1 3 The Origin of Species by Means of Mathematical Modelling u(x, t) ∫ ∞ −∞ (x − y)u(y, t)dy , where the kernel (x − y) shows the probability of phenotypes x and y to have common offspring. This model was studied in , and it was shown that the new species will appear if the support of is small enough, such that the phenotypes x and y with common offspring belong to the same localized solution (species). If we allow inter-breeding, that is, the support of is larger than the distance between the species in the space of phenotypes, then new species do not appear.
Hence, Mayr's condition is necessary for the emergence of species that are understood in the sense of groups of morphologically similar individuals. Indeed, if individuals from different species can have common offspring, then the groups of morphologically similar individuals will not emerge.
Multi-level speciation Darwin's diagram contains many important elements of his theory. However, it is only an illustration, and many other cases and generalizations are possible. We will present here one of them in order to show an example of multi-level speciation.
Biological species are organized in genera, families, and so on. In the modelling presented above, we considered only species and not higher levels of this hierarchy. In order to introduce two levels of speciation, we need to consider double nonlocal consumption of resources with two different kernels. In this case, the modified model is as follows: where Figure 6 shows an example of numerical simulations of equation (3). The maxima of solutions, which correspond to different species, are organized in groups that correspond to genera. Genera have a larger range of consumption of resources than species because of a greater variation of the total phenotype.
Nonlocal prey-predator models Competition of species leads to a quite simple dynamic where one of the species disappears or the two species coexist. It is interesting to note that the coexistence of species is not shown in Darwin's diagram. The situation is quite different if the interaction of species is determined by prey-predator dynamics with nonlocal consumption of prey by predator (Banerjee and Volpert 2016) or with nonlocal consumption of resources by prey . In these cases, numerous stationary, time periodic, or aperiodic regimes are observed.