An a posteriori species clustering for quantifying the effects of species interactions on ecosystem functioning

Quantifying the effects of species interactions is key to understanding the relationships between biodiversity and ecosystem functioning but remains elusive due to combinatorics issues. Functional groups have been commonly used to capture the diversity of forms and functions and thus simplify the reality. However, the explicit incorporation of species interactions is still lacking in functional group‐based approaches. Here, we propose a new approach based on an a posteriori clustering of species to quantify the effects of species interactions on ecosystem functioning. We first decompose the observed ecosystem function using null models, in which species diversity does not affect ecosystem function, to separate the effects of species interactions and species composition. This allows the identification of a posteriori functional groups that have contrasting diversity effects on ecosystem functioning. We then develop a formal combinatorial model of species interactions in which an ecosystem is described as a combination of co‐occurring functional groups, which we call an assembly motif. Each assembly motif corresponds to a particular biotic environment. We demonstrate the relevance of our approach using datasets from a microbial experiment and the long‐term Cedar Creek Biodiversity II experiment. We show that our a posteriori approach is more accurate, more efficient and more parsimonious than a priori approaches. The discrepancy between a priori and a posteriori approaches results from the way each clustering is set up: a priori approaches are based on ecosystem or species properties, such as ecosystem size (number of species or functional groups) or species’ functional traits, whereas our a posteriori approach is based only on the observed interaction and composition effects on ecosystem functioning. 
Our findings demonstrate that an a posteriori approach is highly explanatory: it identifies who interacts with whom, and quantifies the effects of species interactions on ecosystem functioning. They also highlight that a combinatorial modelling of ecosystem functioning can predict the functioning of an ecosystem without any hypothesis about the biotic or environmental determinants or any information on species functional traits. It only requires the species composition of the ecosystem and the observed functioning of others that share the same assembly motif.


Accepted Article
This article is protected by copyright. All rights reserved.
Garnier et al. 1997; Kirwan et al. 2009). These methods have successfully explained biodiversity effects in several biodiversity-ecosystem functioning experiments (Reich et al. 2012), but they have neither quantified the net effect of species interactions nor identified the species that actually interact. Quantifying all possible pairwise species interactions is practically infeasible in species-rich ecosystems because of the curse of dimensionality (McGill et al. 2006). The difficulty lies in the fact that myriad types of positive and negative species interactions can occur simultaneously within ecosystems. Several attempts have been made to frame a pragmatic and operational approach to quantifying ecosystem-wide species interactions. The trait-based characterization of species' competitive abilities, i.e. the search for organism traits that can explain how species' functions influence the performance of competing neighbours (Grace 1990), paved the way to a generic description of species interactions. In the last decade, this approach has been widely used to study species interactions (the "interaction milieu" sensu McGill et al. 2006) through the systematic evaluation of the community-level distribution of interaction traits. However, interaction traits are still unknown in most taxa (Violle et al. 2014).
Even if functional-group approaches simplify reality, they offer an operational and parsimonious way to analyse and model ecosystem functioning. Nevertheless, up to now this approach has mostly relied on a priori clustering based on expert knowledge, as illustrated by the widespread classification of plants into three groups: legumes, graminoids and forbs. An a posteriori clustering may be promising since it will identify functional groups based on realized effects, for instance based on the way interacting species modulate ecosystem functioning. Such clustering has hardly been applied so far. A notable exception is the study of Wright et al. (2006) which searched for the best species clustering to predict ecosystem biomass. More recently, Jaillard et al. (2014)

relationships reported in the literature but its flexibility remained low because it is based on a priori prevalent assembling rules that determine ecosystem functioning. Here we extend the model of Jaillard et al. (2014) by proposing an a posteriori clustering approach to the effects of ecosystem-wide species interactions on a given ecosystem function.
The aim of this paper is to quantify the net effects of species interactions on a given ecosystem function. As suggested by Wilson (1988), we first decompose the observed ecosystem function using null models in which species diversity does not affect ecosystem functioning. This allows us to separate the effects of species interactions from the effects of species composition on ecosystem functioning. This separation makes it possible to identify a posteriori functional groups (sensu Diaz & Cabido 2001 and Garnier 2002) that have contrasting interaction and composition effects on the ecosystem function. Then we propose a formal combinatorial framework to describe an ecosystem as a combination of co-occurring functional groups, which we call an assembly motif, echoing the network motifs of network theory (Milo et al. 2002). Each assembly motif parsimoniously accounts for the observed effects of species interactions and composition on ecosystem functioning. We test this approach using two datasets: one microbial diversity experiment (Langenheder et al. 2010) and the Cedar Creek Biodiversity II experiment (Tilman et al. 2001). We demonstrate that our approach can identify the functional structure of ecosystems, i.e. who is interacting with whom, and quantify the effects of species diversity on ecosystem functioning.

Separating the effects of species interactions and species composition on ecosystem functioning
Consider a set $S = \{1,\dots,s\}$ of $s$ species $i$ (Figure 1a). An ecosystem is defined as an assemblage $A$ of individuals that belong to different species of $S$. A cumulative function $F_{\mathrm{observed}}(A)$, such as primary production, respiration or nutrient recycling, is observed for each ecosystem $A$. The function $F_{\mathrm{observed}}(\{i\})$ (with $i = 1,\dots,s$) of the monoculture of each species is also observed.
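We set two null hypotheses, named $H_0$ and $G_0$ respectively (Figure S1). The null hypothesis $H_0$ describes a situation where the species interactions do not affect the ecosystem function. Under $H_0$, the expected function $F_{\mathrm{expected}/H_0}(A)$ of an ecosystem $A$ is the mean of the functions of its component species in monoculture (Wilson 1988):

$$F_{\mathrm{expected}/H_0}(A) = \frac{1}{|A|} \sum_{i \in A} F_{\mathrm{observed}}(\{i\}). \tag{1}$$

The null hypothesis $G_0$ describes a situation where neither the species interactions nor the species composition affect the ecosystem function. Under $G_0$, the expected function $F_{\mathrm{expected}/G_0}(A)$ is the mean of the functions of all the species of $S$ in monoculture:

$$F_{\mathrm{expected}/G_0}(A) = \frac{1}{s} \sum_{i \in S} F_{\mathrm{observed}}(\{i\}). \tag{2}$$

The function $F_{\mathrm{observed}}(A)$ of an ecosystem can then be decomposed as follows:

$$F_{\mathrm{observed}}(A) = \frac{F_{\mathrm{observed}}(A)}{F_{\mathrm{expected}/H_0}(A)} \cdot \frac{F_{\mathrm{expected}/H_0}(A)}{F_{\mathrm{expected}/G_0}(A)} \cdot F_{\mathrm{expected}/G_0}(A),$$

or:

$$F_{\mathrm{observed}}(A) = \alpha(A)\,\beta(A)\,F_{\mathrm{expected}/G_0}(A) \tag{3}$$

with:

$$\alpha(A) = \frac{F_{\mathrm{observed}}(A)}{F_{\mathrm{expected}/H_0}(A)} \tag{4}$$

and:

$$\beta(A) = \frac{F_{\mathrm{expected}/H_0}(A)}{F_{\mathrm{expected}/G_0}(A)}. \tag{5}$$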
The quotient $\alpha(A)$ corresponds to the effect of species interactions on the ecosystem function. This interaction effect is specific to an ecosystem. It equals one under $H_0$, is lower than one when the species interactions decrease the ecosystem function, and higher than one when they increase it. The normalized remainder $\beta(A)$ corresponds to the effect of the species composition of the ecosystem on its function. This composition effect characterizes the subset of species that belong to the ecosystem, relative to the whole set of species. It equals one under $G_0$, is lower than one when the species composition decreases the ecosystem function, and higher than one when it increases it. The composition effect is one on average when all the possible ecosystems are tested. Finally, the remainder $F_{\mathrm{expected}/G_0}(A)$ is by definition constant: it is a scale factor specific to the species set used in the experiment, i.e. the function expected with neither interaction nor composition effects.

Postprint version. How to cite this document: Jaillard, B., Richon, C., Deleporte, P., Loreau, M., . An a posteriori species clustering for quantifying the effects of species interactions on ecosystem functioning. Methods in Ecology and Evolution, 9 (3), 704-715. DOI: 10.1111/2041-210X.12920
The scale factor integrates all the variations in species functions in the experiment (e.g., variation in environmental conditions across time). It has the dimension of the ecosystem function. Interaction and composition effects are dimensionless.
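The decomposition can be illustrated with a minimal Python sketch. It takes the interaction effect as the observed function divided by the function expected under H0, and the composition effect as the function expected under H0 divided by the function expected under G0. The sketch assumes that both expectations are arithmetic means of monoculture functions; the function name `decompose`, the species labels and all numerical values are hypothetical.

```python
from statistics import fmean


def decompose(f_observed, composition, monocultures):
    """Split an observed ecosystem function into an interaction effect,
    a composition effect and a scale factor."""
    # Expected function under H0 (assumption: arithmetic mean of the
    # monoculture functions of the species present in the ecosystem).
    f_h0 = fmean(monocultures[i] for i in composition)
    # Expected function under G0 (assumption: arithmetic mean of the
    # monoculture functions over the whole species pool).
    f_g0 = fmean(monocultures.values())
    alpha = f_observed / f_h0  # interaction effect, dimensionless
    beta = f_h0 / f_g0         # composition effect, dimensionless
    return alpha, beta, f_g0   # f_g0 plays the role of the scale factor


# Hypothetical monoculture functions for a pool of three species.
monocultures = {"sp1": 2.0, "sp2": 4.0, "sp3": 6.0}
alpha, beta, scale = decompose(4.5, {"sp1", "sp2"}, monocultures)
```

In this made-up example the mixture performs better than its monocultures would suggest (interaction effect above one) although it is drawn from the less productive part of the pool (composition effect below one), and the product of the three terms returns the observed function.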

Analysing the diversity effects induced by each species on ecosystem functioning
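The idea is now to cluster ecosystems on the basis of the ecosystem interaction and composition effects. We denote by $\mathcal{A}$ the set of ecosystems $A$ observed in the experiment.
We define $\mathcal{A}_i$ (with $i = 1,\dots,s$) as the cluster of ecosystems $A$ that contain at least one individual of the species $i$ of $S$:

$$\mathcal{A}_i = \{A \in \mathcal{A} \mid i \in A\}. \tag{6}$$

The interaction and composition effects $\alpha(\mathcal{A}_i)$ and $\beta(\mathcal{A}_i)$ of the ecosystems $A$ of cluster $\mathcal{A}_i$ are estimated by:

$$\alpha(\mathcal{A}_i) = \frac{1}{|\mathcal{A}_i|} \sum_{A \in \mathcal{A}_i} \alpha(A) \quad \text{and} \quad \beta(\mathcal{A}_i) = \frac{1}{|\mathcal{A}_i|} \sum_{A \in \mathcal{A}_i} \beta(A). \tag{7}$$

Introducing the concept of assembly motif: a simple and operational descriptor of the composition of an ecosystem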
Each ecosystem is a subset of individuals that belong to different species: it can also be described as a subset of individuals that belong to different functional groups of species. We cluster the species set $S = \{1,\dots,s\}$ into $n$ functional groups $S_j$ (with $j = 1,\dots,n$) on the basis of their interaction and composition effects $\alpha(\mathcal{A}_i)$ and $\beta(\mathcal{A}_i)$ (Figure 1a). The $n$ functional groups $S_j$ allow the assembly of $m = 2^n - 1$ non-empty combinations of functional groups.
We term assembly motif $M_k$ (with $k = 1,\dots,m$) each combination of functional groups, so that $M_k \subseteq \{S_1,\dots,S_n\}$ with $M_k \neq \emptyset$. We associate each ecosystem $A$ with an assembly motif $M_k$ by assuming that each species $i$ of $A$ belongs to a functional group $S_j$ of $M_k$, and that each functional group $S_j$ of $M_k$ is represented by at least one individual of a species $i$ of $A$. Then, we define $\mathcal{A}_k$ (with $k = 1,\dots,m$) as the cluster of ecosystems $A$ described by the assembly motif $M_k$: $\mathcal{A}_k = \{A \in \mathcal{A} \mid A \text{ is described by } M_k\}$.
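The mapping from a species composition to its assembly motif can be sketched in a few lines of Python. The sketch only illustrates the definition given above: the motif of an ecosystem is the set of functional groups represented by at least one of its species. The helper names `assembly_motif` and `count_motifs`, the species labels and the group labels are hypothetical.

```python
def assembly_motif(composition, group_of):
    """Return the assembly motif of an ecosystem, i.e. the set of
    functional groups represented by at least one of its species."""
    return frozenset(group_of[species] for species in composition)


def count_motifs(n_groups):
    """Number of non-empty combinations of n_groups functional groups."""
    return 2 ** n_groups - 1


# Hypothetical clustering of five species into three functional groups.
group_of = {"a": "G1", "b": "G1", "c": "G2", "d": "G3", "e": "G3"}

motif_ab = assembly_motif({"a", "b"}, group_of)        # both species in G1
motif_bcd = assembly_motif({"b", "c", "d"}, group_of)  # one species per group
n_motifs = count_motifs(3)
```

Two ecosystems with different species but the same represented groups, such as {a, c} and {b, c} here, receive the same motif, which is what allows ecosystems to be clustered by motif.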


Modelling the function of an ecosystem according to its assembly motif
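An assembly motif characterizes a particular biotic environment. Next, we cluster ecosystems by assembly motifs, i.e. we cluster ecosystems that share a similar biotic environment (Figure 1b). We use the ecosystem clustering by assembly motifs to model the ecosystem function. As in equation 6, we define $\mathcal{A}_{i,k}$ (with $i = 1,\dots,s$) as the cluster of ecosystems $A$ of $\mathcal{A}_k$ that contain at least one individual of the species $i$ of $S$:

$$\mathcal{A}_{i,k} = \{A \in \mathcal{A}_k \mid i \in A\}.$$

According to equation 7, the interaction and composition effects $\alpha(\mathcal{A}_{i,k})$ and $\beta(\mathcal{A}_{i,k})$ of the ecosystems $A$ of clusters $\mathcal{A}_{i,k}$ are estimated by:

$$\alpha(\mathcal{A}_{i,k}) = \frac{1}{|\mathcal{A}_{i,k}|} \sum_{A \in \mathcal{A}_{i,k}} \alpha(A) \quad \text{and} \quad \beta(\mathcal{A}_{i,k}) = \frac{1}{|\mathcal{A}_{i,k}|} \sum_{A \in \mathcal{A}_{i,k}} \beta(A).$$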

assembly motifs are not observed. When an assembly motif is represented by only one ecosystem, the ecosystem function cannot be independently predicted. We define the predicting ratio as the number of ecosystems for which the function can be predicted by the clustering model, divided by the number of all observed ecosystems.
All computations were performed with the R software (R Development Core Team 2009).
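The predicting ratio can be illustrated with a short sketch, written in Python for illustration only rather than in the R code used for the study. It assumes that an observed ecosystem counts as predictable when at least one other observed ecosystem shares its assembly motif, and that the ratio is taken over whatever list of ecosystems is passed in. The function name `predicting_ratio` and the motif labels are hypothetical.

```python
from collections import Counter


def predicting_ratio(motifs):
    """motifs: one assembly motif label per observed ecosystem.
    An ecosystem is counted as predictable when its motif is shared
    with at least one other observed ecosystem."""
    counts = Counter(motifs)
    predictable = sum(1 for motif in motifs if counts[motif] > 1)
    return predictable / len(motifs)


# Hypothetical motif labels for five observed ecosystems.
motifs = ["G1", "G1", "G1+G2", "G1+G2", "G2"]
ratio = predicting_ratio(motifs)
```

Here the ecosystem with motif "G2" is alone in its cluster, so four ecosystems out of five can be predicted from the others.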

Biodiversity datasets
We used two datasets to test our approach. The first dataset is based on the observation of all the possible ecosystems assembled from an initial pool of microbial species (Langenheder et al. 2010). Here we analyse the xylose oxidation (ecosystem function) after 48 hours for the 63 bacterial ecosystems, of which 6 are mono-species and 57 pluri-species cultures.
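The counts quoted for the microbial dataset are consistent with a full combinatorial design over six species, a pool size inferred here from the six mono-species cultures. The following Python sketch enumerates every non-empty combination of six placeholder species; the species names are hypothetical.

```python
from itertools import combinations

# Placeholder names for a pool of six species (inferred pool size).
species = ["s1", "s2", "s3", "s4", "s5", "s6"]

# Every non-empty combination of species, from monocultures to the full mixture.
ecosystems = [
    combo
    for size in range(1, len(species) + 1)
    for combo in combinations(species, size)
]

n_total = len(ecosystems)
n_mono = sum(1 for combo in ecosystems if len(combo) == 1)
n_pluri = n_total - n_mono
```

The enumeration gives 2^6 - 1 = 63 ecosystems, of which 6 contain a single species and 57 contain several.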
The second dataset is the Cedar Creek Biodiversity II experiment dataset (Tilman et al. 2001). This experiment was dedicated to the analysis of the biodiversity-productivity relationship in grasslands. Here the studied ecosystem function is the yearly plant aboveground biomass per unit area. The experiment contained 88 ecosystems from a pool of 16 prairie species (Reich et al. 2012). Here we analyse these ecosystems in August 2004, of which 35 are mono-species and 53 pluri-species plots. Each biomass is the average value of the harvests of three "unsorted" strips.

Species clustering based on the diversity effects on ecosystem functioning
The

inside the largest functional group, and with species of other functional groups, they induce similar diversity effects on the ecosystem functioning. Except for the C4-grasses, the functional groups defined a priori by Tilman et al. (2001) are split into several clusters: C3-grasses and forbs into two, and the four legumes into three different clusters.

Modelling the ecosystem functioning based on the ecosystem clustering
Next we focus on the modelling of ecosystem function based on the ecosystem clustering by assembly motifs, i.e. the combinations of functional groups of species (see Figure 1). We recall that each clustering of species into functional groups is a model that generates a new set of assembly motifs. Consequently, we explore the model quality, accuracy and efficiency by increasing the number of functional groups step by step (Figure 4). The goodness-of-fit of the model increases from a low value up to one when there are as many functional groups as species (Figure 4a and 4b). The model robustness is evaluated by its efficiency: it also increases, but always remains lower than the goodness-of-fit of the model. In contrast, the predicting ratio decreases with the number of functional groups, from one when all the ecosystems are clustered together in a single cluster (all ecosystem functions are predicted by the mean function of all ecosystems), down to zero when there are as many functional groups as species (no ecosystem function can be predicted because each ecosystem cluster is a singleton) (Figure 4c and 4d). The best number of functional groups results from a trade-off between the accuracy (R²), the efficiency (E) and the predicting ability (predicting ratio) of the model. In the microbial experiment, R² increases from 1 to 5 functional groups (Figure 4a). However, E reaches a maximum for 4 clusters of species, which corresponds to a predicting ratio of 53/57 ecosystems (Figure 4c). In the Biodiversity II experiment, R² increases, with local maxima for 3, 6 and 8 functional groups of species, and the predicting ratio decreases quickly from 1 to 16 functional groups. A species clustering in 3 functional groups allows the prediction of the functions of 52 out of 53 ecosystems.
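The contrast between goodness-of-fit and efficiency can be illustrated with a simplified Python stand-in for the model. Two assumptions are made here that are not taken from the text: each ecosystem is predicted by the mean function of the ecosystems sharing its motif, and the efficiency E is a leave-one-out analogue of R², computed only over ecosystems that have at least one co-member. The function name `fit_and_loo` and the data are hypothetical.

```python
from collections import defaultdict
from statistics import fmean


def fit_and_loo(values, motifs):
    """Return (R2, E) for a model predicting each ecosystem by the mean
    of the ecosystems that share its assembly motif."""
    members = defaultdict(list)
    for idx, motif in enumerate(motifs):
        members[motif].append(idx)

    fitted, left_out = {}, {}
    for idxs in members.values():
        cluster_mean = fmean(values[i] for i in idxs)
        for i in idxs:
            fitted[i] = cluster_mean  # fit uses the ecosystem itself
            others = [values[j] for j in idxs if j != i]
            if others:  # singletons get no independent prediction
                left_out[i] = fmean(others)

    def score(predictions):
        observed = [values[i] for i in predictions]
        mean_observed = fmean(observed)
        rss = sum((values[i] - p) ** 2 for i, p in predictions.items())
        tss = sum((o - mean_observed) ** 2 for o in observed)
        return 1.0 - rss / tss

    return score(fitted), score(left_out)


# Hypothetical functions of six ecosystems spread over three motifs.
values = [1.0, 1.2, 3.0, 3.4, 5.0, 5.2]
motifs = ["M1", "M1", "M2", "M2", "M3", "M3"]
r2, efficiency = fit_and_loo(values, motifs)
```

Because the leave-one-out prediction never uses the ecosystem being predicted, the efficiency obtained this way stays below the goodness-of-fit, which mirrors the behaviour reported above.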
For a given species clustering, each ecosystem can be described by a unique assembly motif, and its function can be evaluated by modelling (see Figures S3-S4 for the independent modelling of both diversity effects). In the microbial experiment, all the 63 species combinations are observed. Species richness is the most frequently used metric of diversity in biodiversity-ecosystem functioning research (Tilman et al. 1997; Hooper et al. 2005; Langenheder et al. 2010; Reich et al. 2012). As a consequence, here we use species number (ecosystem size) as a reference (Figure 5a and 5b). This modelling leads to dispersed and overlapping diversity effects (Figure 5a), but goodness-of-fit (R² = 0.438, p = 9.1 × 10⁻⁷) and efficiency (E = 0.245, p = 1.8 × 10⁻³) are significant (Figure 5b). Based on a 4-group species clustering, the resulting ecosystem clustering by assembly motifs is much more structured (Figure 5c and 5d). The diversity effects of ecosystems that share an assembly motif are close to each other, confirming the functional redundancy of clustered species (Figure 5c). The goodness-of-fit is high (R² = 0.892, p < 10⁻¹⁶) and the efficiency of this modelling remains high and highly significant (E = 0.828, p < 10⁻¹⁶) (Figure 5d).

6e and 6f). Moreover, the function of 52/53 ecosystems can be predicted independently.

Separating the interaction and composition effects: an alternative proposal
The decomposition of ecosystem functions into diversity effects has been successfully used in the past to analyse biodiversity-ecosystem functioning experiments (Reich et al. 2012). This method follows an integrative path by computing the covariance between the observed and expected contributions of each species to ecosystem functioning. We followed the same philosophy here but used the ratio between observed and expected yield (the so-called "relative yield of the mixture", RYM) instead of the covariance, as suggested by Wilson (1988).

demonstrate that the interaction and composition effects are independent in both experiments, but the two diversity effects contribute significantly to ecosystem functioning.

The emerging concept of assembly motifs
We propose to quantify the net effect of species interactions on ecosystem functioning through the lens of assembly motifs, i.e. a combination of co-occurring functional groups within an ecosystem. An assembly motif is a pattern of species composition that reflects a given biotic environment. Our results show that ecosystems described by the same assembly motif display diversity effects on ecosystem functioning similar to each other in terms of interaction and composition effects. An assembly motif therefore describes the biotic environment that regulates the effects of species interactions on a given ecosystem function.
The underlying condition of our modelling framework is that the presence/absence of at least one species from each functional group changes the ecosystem-wide species effect in a similar way (Jaillard et al. 2014). This condition is the simplest and the most commonly used in ecology (Diaz & Cabido 2001;Hooper et al. 2005). It implicitly assumes that the presence of particular species, thus of particular interactions, is more important than the number of interactions, the number of species or the number of functional groups. Some authors have long pointed out that the composition of an ecosystem, i.e. the identity of species or functional groups, affects its functioning more than diversity per se (Hooper & Vitousek 1997). Our results confirm the validity of this hypothesis in both studied experiments.
Our modelling framework is based on combinatorics through combinations and clustering of species, functional groups and ecosystems. Clustering of species or ecosystems is a simplification of the real world. It assumes that clustered species are functionally redundant, which means in our case that they induce roughly the same effects on the ecosystem functioning. However, the functional redundancy, and thus the species clustering, is specific to the considered experiment, to the set of species used and to the ecosystem function observed in the experiment (Loreau 2004

Identifying and quantifying the interacting roles of species within ecosystems
We showed that an ecosystem clustering based on assembly motifs is much more suitable than the classical clustering based on ecosystem size, such as the number of species or functional groups. Langenheder et al. (2010) showed experimentally that the species SL104 plays a key role in xylose oxidation. Our modelling approach retrieves this key role of SL104, and specifies that the species increases the ecosystem function through a composition effect rather than an interaction effect. It also shows that the three species SLWC2, SL106 and SL197 are clustered in the same functional group: they thus induce similar diversity effects by interacting among themselves and with others.
In the Biodiversity II experiment, we highlight that only three a posteriori functional effect

(2001) are both based on four groups: they explain and predict the ecosystem biomass less accurately than our a posteriori approach. Our a posteriori approach is thus more accurate, more efficient and more parsimonious than both a priori approaches, based on ecosystem size and on functional groups defined a priori. Surprisingly, the four legumes classically clustered into a single functional group (legume) were found in different functional groups based on our a posteriori clustering. Lupinus perennis belongs to the ecosystem cluster with the highest interaction effects. Lupinus perennis is a legume whose leaves and fine roots have a very short longevity, less than 4 weeks, and thus release large amounts of nitrogen into the soil (Craine et al. 2002). In Europe, the Lupinus genus is also known among legumes to produce cluster roots that release citrate and protons and mobilize poorly available nutrients such as phosphorus (Hinsinger et al. 2002; Jaillard et al. 2003; Lambers et al. 2013). The Lupinus species from the New World are less studied, but Lambers et al. (2012) reported that Lupinus lepidus also produces cluster-like roots. Lespedeza capitata is clustered with a forb (Achillea millefolium) and three C3-grasses (Koeleria cristata, Agropyron smithii and Poa pratensis). This species cluster has high interaction effects but low composition effects. Note that Lupinus perennis and Lespedeza capitata are always isolated in small functional groups and have strong structuring effects on the ecosystem function across years (from 2001 to 2007, data not shown). Amorpha canescens belongs to ecosystems with the highest composition effects but low interaction effects. Our results confirm that legumes play a key role in ecosystem-level biomass production, as previously highlighted by many authors (e.g., Tilman et al. 1997, 2001), but, as suggested by Craine et al. (2002) and Wright et al. (2006), they also indicate that the 'legume' functional group is not homogeneous in the ability of legumes to affect plant biomass.


Biodiversity effects: a new conceptual framework
The approach developed here first separates the interaction effect from the composition effect of diversity on an ecosystem function, then clusters species into functional groups on the basis of these diversity effects on ecosystem functioning. However, the novelty lies mainly in combining functional groups of species and in assuming that any species assemblage, matching any possible combination of functional groups, i.e. any possible assembly motif, can significantly affect, positively or negatively, the interaction and composition effects. We here demonstrate that a combinatorial approach also allows the modelling of ecosystem functioning without any hypothesis about its biotic or environmental determinants, and without any information on species functional traits. We only use the available experimental data, i.e. the function and species composition of ecosystems, and the function of species in monoculture. The study of Wright et al. (2006) is the only one to assess random species clustering based on its ability to predict an ecosystem function. The main result obtained by Wright et al. (2006) was that the explanatory powers of a priori and random clustering were not significantly different. In contrast to the conclusion of Wright et al. (2006), our results show that an a posteriori species clustering can be highly explanatory. It even makes it possible to predict accurately the functioning of an ecosystem on the sole basis of its species composition.
A posteriori approaches such as ours aim at describing the structure of raw data. In our case, the idea is to determine the functional structure of ecosystems that best accounts for the observed data without attempting to explain the underlying biological processes. This approach allows testing some hypotheses, in particular it allowed us to challenge the

the former to explore data and reveal real patterns, the latter to determine the functional traits or biological processes responsible for the observed species patterns. This combination is best suited to providing robust, general, explanatory and anticipatory predictions of the effects of species diversity on ecosystem functioning (Mouquet et al.

Conclusion
We propose a combinatorial model to quantify the net effects of species interactions on ecosystem functioning and an efficient method to fit species clustering to experimental datasets. When applied to two datasets, a microbial experiment and the Biodiversity II experiment, we show that assembly motifs, i.e. patterns of ecosystem assembly that reflect
