Multiscale modeling for bioresources and bioproducts

Designing and processing complex matter and materials are key objectives of bioresource and bioproduct research. Modeling approaches targeting such systems have to account for their two main sources of complexity: their intrinsic multi-scale nature; and the variability and heterogeneity inherent to all living systems. Here we provide insight into methods developed at the Food & Bioproduct Engineering division (CEPIA) of the French National Institute of Agricultural Research (INRA). This brief survey focuses on innovative research lines that tackle complexity by mobilizing different approaches with complementary objectives. On the one hand, cognitive approaches aim to uncover the basic mechanisms and laws underlying the emerging collective behavior of these systems; on the other hand, predictive approaches based on machine learning aim to optimize processes and predict their outcome.


Introduction
Designing and processing complex matter and materials are key objectives in bioresource and bioproduct research. The complexity in this context emerges from two main sources.
One source of complexity is that producing foods or biobased products involves many different length scales, from the molecules to the macroscopic behavior and overall environmental conditions, through intermediate mesoscopic or granular particles (Fig. 1). In many cases these length scales are coupled: for instance, the microstructure of raw materials strongly influences their mechanical properties, while processing conditions impact the rheological properties of the resulting product. These mutual dependences between length scales can be modeled in different ways and with different purposes in mind, but the key objective when modeling bioresources, foods and bioproducts is to capture the length scales relevant to the target question.

Another source of complexity stems from the biological origin of the raw and processed materials: bioresources (or bioproducts at a later stage of processing) show prominent variability (resulting from genetics, crop management, environmental conditions, etc.), which affects their chemical composition, their architecture at different scales, and consequently their processability. Moreover, these systems are often heterogeneous, compartmented, multiphasic, or structured by interfaces. This variability and heterogeneity are challenging for basic disciplines such as physics, mechanics or mathematics, which are commonly used in modeling approaches. Indeed, the strongly reductionist approach they classically use is unable to capture the features responsible for the peculiar behavior of bioresources or bioproducts. Modeling approaches therefore need to bridge the gap between the well-defined, monodisperse and pure systems for which well-established models exist, and real biobased materials, which strongly depart from this idealized representation [1,2].
Approaches to modeling biobased systems thus need to account for both their intrinsic multiscale nature and their variability and heterogeneity. Here we focus on two classes of approaches that account for these elements of complexity using very different methods but with complementary objectives.
The first type of approach aims at discovering the basic mechanisms and laws underlying the physical and mechanical behavior of biological materials, whether raw, during processing or after it, and can uncover the key parameters responsible for the observed behaviors. For this purpose, different methods can be used, each focusing on a specific length scale, and they can be combined to span the whole range of length scales. On the one hand, this kind of approach can be based on the description of the individual physical objects (such as macromolecules or grains) from which collective properties emerge [3,4]; on the other hand, it can explore the behavior of macroscopic systems from which the mechanisms responsible for the measured phenomena can be inferred [5]. This approach borrows tools from physics and mechanics and can be either experimental (examples of phenomenological and heuristic approaches can be found in [6,7]) or numerical (see for instance [8,9]). By contrast, the second type of approach tackles questions dealing with the optimization of processes or the prediction of their outcome, and includes most relevant length scales in the representation of the system. This kind of approach is based on machine learning methods, which synthesize huge datasets from experiments at different scales (with factors such as genetics, physical phenomena, chemical reactions and living organisms, hence accounting for the variability of the materials) combined with expert knowledge [10].
Note that both the cognitive and predictive modeling approaches give insight into certain aspects of biobased systems that would otherwise be inaccessible. For instance, it is possible to deduce the physical mechanism responsible for a given phenomenon by designing a model that can represent the experimentally-obtained data. The reverse can also be done: simulation of macromolecules or grains with features known from the experiments can be used to test the experimental hypotheses and to numerically explore the collective behavior of similar systems. Moreover, the machine learning methods combining experimental and expert knowledge can effectively circumvent the fact that obtaining data in the food industry is no easy task, as experiments are expensive and many factors are involved in the processes [11,10].
In the following we highlight and illustrate three approaches dealing with modeling at different length scales and with various objectives. The intent of this paper is not to review the field of modeling in food science and biobased products but to give insight into some of the methods we currently use in this context. Section 2 introduces computational physics approaches used to gain insight into the physics of soft-matter systems and granular media, modeled as collections of generic, interacting objects from which a collective behavior emerges. We present case studies using this approach that deal with i) the self-assembly of biopolymers in solution and at interfaces, ii) the evolution of polymer solutions at the supramolecular scale, and iii) the packing, rheology and fragmentation of granular materials. Section 3 addresses the modeling of multiphasic and multiscale granular food systems, and Section 4 presents interactive machine-learning approaches for multiscale food systems.

In discrete approaches, the system is represented as a collection of interacting objects. These objects may be simple (e.g. spheres) or composite, and they may deform under load, their internal bonds being stretched and eventually broken (Fig. 2d).
The collective behavior of the system also depends on the interactions between objects, which act as forces and moments on the particles. These interactions can be attractive or repulsive, short- or long-ranged. Computationally, they are described as forces (such as friction or gravity), some of which derive from potentials (e.g. electrostatic, van der Waals or covalent-bond potentials). Figure 3 shows three examples of classical pair interaction potentials. Potentials may also rule the interactions between more than two objects, as in bond-angle potentials (three objects) or dihedral-angle potentials (four objects). Note that a tangential force or a torque may also be considered when friction is present.
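As an illustration, the minimal Python sketch below evaluates the three classical pair potentials of Fig. 3; all parameter values are arbitrary and chosen for illustration only.

```python
import numpy as np

def repulsive(r, eps=1.0, sigma=1.0):
    """Purely repulsive spheres (Fig. 3a): inverse-power core."""
    return eps * (sigma / r)**12

def lennard_jones(r, eps=1.0, sigma=1.0):
    """van der Waals-type potential (Fig. 3b): repulsive core plus a
    short-range attractive well, with a minimum at r_eq = 2**(1/6)*sigma."""
    sr6 = (sigma / r)**6
    return 4 * eps * (sr6**2 - sr6)

def harmonic_bond(r, k=100.0, r_eq=1.0):
    """Covalent-bond-like potential (Fig. 3c): harmonic well at r = r_eq."""
    return 0.5 * k * (r - r_eq)**2

r = np.linspace(0.8, 2.5, 200)
# Pair forces follow as F = -dU/dr, e.g. by numerical differentiation:
f_lj = -np.gradient(lennard_jones(r), r)
```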
In problems where grains are in contact with interstitial phases (considered as continuous at the scale of the grain), the forces are transmitted through the interfaces, and effective pair interactions are not realistic in such cases. The rheological behavior of these phases can be either liquid-like, as in suspensions [12], or solid-like, as in composites [13]. Discrete modeling of the grains thus has to be coupled to a specific description of the interstitial phases.
For example, the Lattice Boltzmann Method [12,14] provides a versatile framework for the computation of hydrodynamic properties [15] at the subparticle scale (Fig. 4c), the simulation of capillary rise in wood [16] or droplet spreading [17,18] on a powder bed (Fig. 4d).
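The principle of such a scheme can be sketched in a few lines. The following is a minimal D2Q9 lattice Boltzmann (BGK) example for body-force-driven flow in a periodic channel, with no-slip walls treated by full-way bounce-back; all parameter values are illustrative, and real applications such as [15,16,17,18] use far more elaborate schemes.

```python
import numpy as np

# D2Q9 lattice: discrete velocities, weights and opposite directions
c = np.array([[0,0],[1,0],[0,1],[-1,0],[0,-1],[1,1],[-1,1],[-1,-1],[1,-1]])
w = np.array([4/9,1/9,1/9,1/9,1/9,1/36,1/36,1/36,1/36])
opp = np.array([0,3,4,1,2,7,8,5,6])

nx, ny, tau, g = 100, 40, 0.8, 1e-6   # grid, relaxation time, body force

solid = np.zeros((nx, ny), dtype=bool)
solid[:, 0] = solid[:, -1] = True     # channel walls; grains = extra solid nodes

def equilibrium(rho, ux, uy):
    cu = c[:, 0, None, None] * ux + c[:, 1, None, None] * uy
    usq = ux**2 + uy**2
    return w[:, None, None] * rho * (1 + 3*cu + 4.5*cu**2 - 1.5*usq)

f = equilibrium(np.ones((nx, ny)), np.zeros((nx, ny)), np.zeros((nx, ny)))

for step in range(5000):
    rho = f.sum(axis=0)
    ux = (c[:, 0, None, None] * f).sum(axis=0) / rho
    uy = (c[:, 1, None, None] * f).sum(axis=0) / rho
    # collision: BGK relaxation; body force applied as a velocity shift
    f += -(f - equilibrium(rho, ux + tau * g, uy)) / tau
    # streaming: shift each population along its lattice velocity
    for a in range(9):
        f[a] = np.roll(f[a], (c[a, 0], c[a, 1]), axis=(0, 1))
    # full-way bounce-back on solid nodes (no-slip boundaries)
    f[:, solid] = f[opp][:, solid]

# ux now approximates a parabolic (Poiseuille) profile across the channel
```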
In the simulations, the interacting objects move in a domain of space on which boundary conditions are often imposed: they can be periodic in the three dimensions when the focus is on bulk properties (see the snapshots in Fig. 6), they can account for the presence of a surface, or they can result from mechanical constraints (Fig. 4b,c). Figure 4a, for instance, depicts the case of a polymer solution in contact with a hydrophobic surface, while in Figure 4b shearing and a normal confining pressure are applied at the top and bottom surfaces of the domain.

Figure 3: Examples of classical potentials. a) Potential for purely repulsive spheres; b) van der Waals potential consisting of a repulsive part (particle core) and a short-range attractive well for r ≈ r_eq; c) Harmonic or higher-degree potential classically used for the description of covalent bonds, with an equilibrium length r = r_eq.
Whenever Brownian motion is negligible with regard to the inertia of the particles (i.e. when particle size is typically larger than 1 μm) or to the cohesive and confining forces, the evolution of the system will be strongly influenced by its initial state. For example, the flowing and compaction properties or gas permeability of a packing of particles are strongly affected by its initial density.
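A quick order-of-magnitude check shows why roughly 1 μm is the usual crossover: below it, the thermal energy k_BT competes with the gravitational energy needed to lift a particle by its own diameter. The values below (density of water-like particles, room temperature) are illustrative.

```python
import numpy as np

kT = 1.38e-23 * 300           # thermal energy at room temperature (J)
rho, g = 1000.0, 9.81         # particle density (kg/m3) and gravity (m/s2)

for d in [0.1e-6, 1e-6, 10e-6]:              # particle diameters (m)
    m = rho * np.pi / 6 * d**3               # particle mass
    ratio = m * g * d / kT                   # gravitational vs thermal energy
    print(f"d = {d*1e6:4.1f} um -> m*g*d / kT = {ratio:.2g}")
# ratio << 1: Brownian motion dominates; ratio >> 1: gravity/inertia dominates
```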

Computation
Simulations give access to the equilibrium states of a system and/or to its evolution. For this purpose different algorithms can be used that perform iterative calculations of the positions of the objects and usually their velocities and accelerations. The Monte Carlo algorithm, for example, gives access to a set of configurations that represent the thermodynamic equilibrium of the simulated system [19]. Other methods such as molecular dynamics simulations [20], event-driven simulations or contact dynamics [21], consist in solving Newton's equations of motion and give access to the forces between the objects and to their individual trajectories. Whatever the simulation method used, computing time strongly depends on the number of interactions computed at each timestep, which is influenced by their spatial range, the complexity of the objects, and the interaction models.
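The core of a molecular dynamics code can be sketched with the standard velocity Verlet scheme; the `force` function below is a placeholder for any of the interaction models discussed above.

```python
import numpy as np

def velocity_verlet(x, v, force, m, dt, nsteps):
    """Integrate Newton's equations for particles of mass m.
    x, v: (N, 3) arrays of positions and velocities;
    force(x): user-supplied function returning the (N, 3) force array."""
    f = force(x)
    for _ in range(nsteps):
        v += 0.5 * dt * f / m          # first half kick
        x += dt * v                    # drift
        f = force(x)                   # recompute forces at new positions
        v += 0.5 * dt * f / m          # second half kick
    return x, v
```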
It is important to note that the details in the shape of real objects and their interactions are often less relevant to the quality of the simulation than the number of simulated objects. Hence it is often necessary to simulate systems consisting of thousands or tens of thousands of objects, which means much effort has to be put into developing efficient algorithms that can run on parallel architectures (multicores, multiprocessors, GPGPU or HPC clusters) [22,23,24].
Computing time also depends on temporal discretization. If the timestep is too long, the calculation is unstable, but if the timestep is too short it may unnecessarily increase computing time. The choice of timestep is guided by the physics of the problem, and should be short enough to accurately describe the fastest motions in the system, which depend on the mass of the objects and on the stiffness of the interactions. For example, in atomistic molecular dynamics simulations, the high energy of covalent bonds requires the use of a timestep of the order of a femtosecond (10 −15 s) which limits the total simulated time to a few nanoseconds. In contrast, when millimeter-sized objects are used as the basic units, the timestep can be of the order of a microsecond.
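This rule of thumb can be made concrete: the timestep must be a small fraction of the vibration period τ = 2π√(m/k) of the stiffest interaction. The approximate stiffness and mass values below recover the femtosecond and microsecond orders of magnitude quoted above.

```python
import numpy as np

def stable_timestep(m, k, fraction=0.1):
    """Timestep as a fraction of the fastest vibration period 2*pi*sqrt(m/k)."""
    return fraction * 2 * np.pi * np.sqrt(m / k)

# Atomistic case: hydrogen atom (~1.7e-27 kg) in a stiff covalent bond (~500 N/m)
print(stable_timestep(1.7e-27, 500.0))   # ~1e-15 s, i.e. a femtosecond

# Granular case: millimetric glass bead (~1.3e-6 kg), contact stiffness ~1e5 N/m
print(stable_timestep(1.3e-6, 1e5))      # ~1e-6 s, i.e. a microsecond
```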

Analysis of the results
Numerical modeling and simulation of discrete objects can be seen as numerical experiments, and simulations usually produce a huge amount of data.
Their processing is a central part of numerical modeling, and generally requires specific algorithms. A key advantage of the numerical approach is that it can decouple the mechanisms that are intertwined in real experiments. Indeed, discrete modeling deliberately discards part of the complexity of materials, so the mechanisms can be ranked according to their importance in the considered phenomenon. This type of modeling approach also allows the local-scale self-assembly properties to be interpreted, their time evolution to be followed, and the different regimes of structuring or destructuring of matter to be understood and mapped as a function of parameters such as temperature, polymer density, pressure or humidity [25,26]. In the following the discrete modeling and simulation approaches are illustrated through examples.

Selected case studies

Self-assembly and phase-space exploration
The self-assembly of biopolymers plays a major role in a number of applications where they are used as gelling, stabilizing or emulsifying agents.

Figure 4: (color online) Some case studies. a) Solution of amphiphilic copolymers (red beads ≡ hydrophobic monomers; grey beads ≡ hydrophilic ones) near a hydrophobic surface; b) Shearing of a granular material between two planes: a normal force (F_n) is applied and opposite planes slide in opposite directions at a fixed velocity (v_x); c) Streamlines in a packing of elongated particles (fluid modeled using a lattice Boltzmann approach): a pressure gradient is applied (P_in > P_out) and periodic boundary conditions are imposed in all directions; d) Spreading of a droplet on a granular layer (fluid modeled using a lattice Boltzmann approach).
Modeling proteins or polysaccharides as amphiphilic block copolymers can provide information on how the chemical nature, distribution and interactions of monomers orchestrate their self-assembly. The simulated structures can be characterized by computing any relevant structural property available from the coordinates of the objects, such as the average distributions of core/cluster sizes (see Inset of Fig. 5), the pair distribution functions [20] or the total and partial structure factors that give access to the spatial correlations and can also be obtained from scattering experiments [27]. From these structural properties, different regimes can be determined and rationalized as a function of the parameters explored in the simulations.
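As an illustration, the isotropically averaged structure factor can be estimated directly from particle coordinates. The sketch below uses a brute-force sum over random wavevector directions; production codes typically restrict q to values commensurate with the periodic box and use binning or FFTs.

```python
import numpy as np

def structure_factor(pos, qmax=10.0, nq=50, ndir=20, seed=0):
    """Estimate S(q) = <|sum_j exp(i q.r_j)|^2> / N from coordinates pos (N, 3),
    averaging over random wavevector directions at each modulus q."""
    rng = np.random.default_rng(seed)
    n = len(pos)
    qs = np.linspace(0.5, qmax, nq)
    dirs = rng.normal(size=(ndir, 3))
    dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)   # unit vectors
    sq = np.empty(nq)
    for i, q in enumerate(qs):
        phases = pos @ (q * dirs.T)                       # (N, ndir) q.r terms
        amp2 = np.abs(np.exp(1j * phases).sum(axis=0))**2
        sq[i] = amp2.mean() / n
    return qs, sq
```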
For instance, using Monte Carlo simulations of multiblock copolymers (described as bead-spring chains of hydrophilic and hydrophobic monomers), we can rationalize the role of energy and entropy contributions on the equilibrium structures, in dilute and semidilute solutions or close to a surface.
In dilute solutions, we demonstrated the formation of a range of intramolecular structures (characterized by their structure factors) as a function of the solvent quality and the hydrophobic-to-hydrophilic balance of the copolymers [28]. In more concentrated solutions, decreasing the quality of the solvent and increasing the concentration of polymers give rise to different structural regimes, from a homogeneous solution to a percolating network of core-shell micelles [29] (see Fig. 5), which can be in equilibrium with a 2D percolating network of micelles when the solution is in contact with a hydrophobic surface [30] (see Fig. 4a).

Figure 5: (color online) Influence of polymer volume fraction and solvent quality on the self-assembly of multiblock copolymers in semidilute solution (blue beads ≡ hydrophilic monomers, red beads ≡ hydrophobic monomers). When core-shell micelles are present (blue area), the size distribution of the hydrophobic cores shows a peak (Inset, blue curve) that is absent in homogeneous solutions (Inset, black curve). The presence of a percolating cluster of polymers (red area) is the gelation criterion.

Figure 6: (color online) Simulation snapshots and partial structure factors showing the time evolution of a polymer solution under the action of enzymes (green) that convert repulsive A monomers (white) into attractive B ones (red). The system is initially homogeneous and purely repulsive (t_0). At long times, phase separation occurs at high temperature (t_H, coarsening shown by the increasing intensity at low q), while a structured gel is formed at low temperature (t_L, structuring shown by the correlation peak at q_p).

Evolution at the supramolecular scale
During processing, the structure of materials may evolve under applied constraints, such as temperature or pressure changes, or chemical or enzymatic reactions, which modify the interaction potentials governing the dynamics of the system. As an example, the case of polymer solutions subjected to the action of enzymes was studied using molecular dynamics simulations [31]. In this example, enzymes, modeled as diffusing spherical objects, convert initially repulsive monomers into attractive ones, thus making the structure of the system evolve. For a range of polymer densities, temperatures and enzyme concentrations, the simulations allow us to follow the fraction of converted monomers over time and the evolution of the spatial correlations by computing the structure factor of the attractive B monomers (see Fig. 6).
Dynamic quantities can also be computed that give access to the diffusion properties (e.g. mean-square displacements [25]) and to the aging of the system, both of which are relevant in gelling and glassy materials.
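The mean-square displacement, for instance, is obtained from stored trajectories; a simple sketch averaging over particles and time origins:

```python
import numpy as np

def mean_square_displacement(traj):
    """traj: (nframes, N, 3) array of unwrapped particle positions.
    Returns MSD as a function of lag time, averaged over all particles
    and all time origins."""
    nframes = traj.shape[0]
    msd = np.zeros(nframes)
    for dt in range(1, nframes):
        disp = traj[dt:] - traj[:-dt]           # displacements over lag dt
        msd[dt] = (disp**2).sum(axis=2).mean()  # average over origins & particles
    return msd
```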

Packing and rheological properties

In many applications, controlling the micro-scale fabric is a major challenge for optimizing and designing new materials: in nanoparticles, complex properties arise from grain shapes and their capacity to self-assemble; in catalysts, the specific surface and the accessibility of the pore network are major features; in the taste perception of powder-based foods, the mechanical stimuli largely depend on granulometry and apparent density. As an example, Figure 7 shows the computation by the Discrete Element Method of the evolution of the particle solid fraction of packings of dendritic particles with different shapes. This kind of abacus was used to optimize tablet density in tablet production [32,33].
This type of analysis was also used to investigate the rheology of powders as a function of their formulation. For example, the grading curve has a major influence on the processability of powders (flowability, reactivity, etc.). The effect of the particle size distribution was studied in detail [34,35], both for purely frictional powders and for cohesive powders, in which attractive interactions due to capillary, van der Waals or electrostatic forces can drastically change the energy required to structure or shape the materials under shear and confining stresses [36]. More generally, understanding the hydrotextural properties of particle-liquid-gas mixtures remains a topical issue of great importance for many industrial processes involving paste or aggregate formation [37,38].
Finally, many foods, cosmetics and other products derived from living systems embed highly deformable particles. Hybrid numerical approaches have recently been developed to take advantage of both discrete and continuum descriptions, used respectively for the contacts between particles and for their deformation [39].

Fragmentation of materials
Comminution by cutting, crushing, grinding and so on is one of the most important and energy-consuming unit operations in process and chemical engineering. One of the key issues is to understand how mechanical stresses are transmitted to particles in crushers, and how energy is dissipated to open and propagate cracks in complex raw materials embedding defects, inclusions or histological heterogeneities.
Different numerical methods have been employed to tackle this problem.
Lattice-based approaches, such as the Lattice Element Method [40] or the Peridynamics method [41], are well suited to the computation of highly heterogeneous biomaterials. Such methods are able to simulate wave propagation through elastic brittle materials and to capture the details of crack initiation and propagation. Just as in experiments, structural defects are the preferential paths for cracks, so during a grinding process the population of embedded defects in each particle decreases with its size. Consequently, the failure stress (yield force divided by the cross-section area) of a particle increases as its size decreases (Fig. 8b). This type of scaling law is important to better understand the kinetics of comminution, especially when the objective is fine-particle production.
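Such size effects are classically rationalized with Weibull-type statistics, where the failure stress follows a power law of particle size. The sketch below, on synthetic (size, failure stress) data, shows how the size-effect exponent can be extracted from simulated or measured crushing results.

```python
import numpy as np

# Synthetic (size, failure stress) data illustrating a Weibull-type size effect
d = np.array([0.5, 1.0, 2.0, 4.0, 8.0])          # particle sizes (mm)
sigma_f = np.array([12.0, 8.3, 6.0, 4.1, 2.9])   # failure stresses (MPa)

# Fit sigma_f ~ d**a on log-log axes; the slope a is the size-effect exponent
slope, intercept = np.polyfit(np.log(d), np.log(sigma_f), 1)
print(f"failure stress ~ d^{slope:.2f}")          # negative: smaller is stronger
```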

Modeling multiphasic and multiscale granular food systems
Food elaboration from hydrated granular media (pasta, couscous, bulgur, rice, Ebly™...) requires two specific features of these dispersed systems to be accounted for: (i) the solid-solid, solid-liquid and solid-liquid-gas interactions induced by the surface properties of the grains (friction, capillarity...) and by the thermodynamic properties of the fluids, and (ii) the 'athermal' mobility governed by the rheology of granular matter and obtained through a mechanical energy input. The food, pharmaceutical and cosmetics industries, whose many wet processing operations involve at least one such elaboration step [42], therefore regard the control of the mixing, granulation or compaction processes applied to these media as a key challenge for optimum realization of the products.
The challenge is to combine the best initial abilities of the components with optimal transformation and processing capabilities, so as to meet quality criteria linked to texture (hardness, heat transfer capacity, release of active molecules...) or to the induced rheological behavior (yield strength, flow properties) [43]. In particular, these objectives require mastering the expression of capillary forces within the granular matrices during their hydration (agglomeration, shaping, texturization) or dehydration (drying, shrinkage), by promoting contacts between grains and thus their mobility.
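The dominance of capillary forces in wet granular media can be checked with the classical estimate F_cap ≈ πγd·cosθ for the maximum force of a liquid bridge between two equal grains of diameter d; the values below are illustrative.

```python
import numpy as np

gamma, theta = 0.072, 0.0        # water surface tension (N/m), contact angle
rho, g = 1500.0, 9.81            # grain density (kg/m3), gravity (m/s2)

for d in [1e-4, 1e-3, 1e-2]:     # grain diameters (m)
    f_cap = np.pi * gamma * d * np.cos(theta)     # max capillary bridge force
    weight = rho * np.pi / 6 * d**3 * g           # grain weight
    print(f"d = {d*1e3:5.1f} mm -> F_cap / weight = {f_cap/weight:.3g}")
# For (sub)millimetric grains, capillary cohesion far exceeds gravity
```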
From a process point of view, i.e. at pilot scale, the technological implementation of these systems often relies on dimensional analysis with operating and block diagrams [7]. Such diagrams allow the thermodynamic states of a system to be localized during its evolution under thermo-hydro-mechanical loads. They constitute an operating aid for the elaboration step, supporting process control, formulation optimization and/or energy regulation.
This approach, which is well developed in chemical and product engineering, can be transposed to granular food media, where each thermo-hydro-mechanical load (mixing, wetting, drying, osmotic dehydration...) contributes to the formation of a granular matrix. It is therefore relevant to achieve a quantitative description of these systems that integrates multidimensional and multifactorial aspects, including grain structure, intergranular arrangement, fluid phase distribution, mixing rheology, evolution with applied loads, etc. These multiple physical states can be charted on a single phase diagram called a hydrotextural diagram (Fig. 9).
This diagram, which represents the phases at the scale of a representative elementary volume, is comparable to the 'jamming' phase diagrams of soft-matter systems [46]. The hydrotextural diagram is usually represented with water content (w) and compactness (Φ) axes [37,47], plus an additional axis describing the applied thermo-hydro-mechanical load. It is bounded on its upper part by the liquid saturation curve, and by the maximal and minimal compactness of the granular medium (random close and random loose packing, respectively). We showed that the connectivity threshold of each phase is essential for understanding the transitions between the different states of the medium, and that the saturation curve sets the relationship between liquid content, degree of saturation and compactness [48]. It is therefore possible to draw the transformation path of a product during its elaboration. This point is illustrated below in the paragraph on humid agglomeration.
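The saturation curve bounding the diagram follows from a standard phase-volume balance. Assuming w is the water-to-solid mass ratio, the degree of saturation reads S = w(ρ_s/ρ_w)·Φ/(1-Φ), and setting S = 1 gives the maximum water content at a given compactness; the sketch below uses illustrative densities.

```python
import numpy as np

def saturation_degree(w, phi, rho_s=1450.0, rho_w=1000.0):
    """Degree of saturation S from water content w (kg water / kg solid)
    and compactness phi (solid volume fraction)."""
    return w * (rho_s / rho_w) * phi / (1.0 - phi)

def saturation_curve(phi, rho_s=1450.0, rho_w=1000.0):
    """Water content w at full saturation (S = 1) for a given compactness."""
    return (rho_w / rho_s) * (1.0 - phi) / phi

phi = np.linspace(0.4, 0.64, 50)   # from loose to close random packing
w_sat = saturation_curve(phi)      # upper boundary of the hydrotextural diagram
```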

Mobilities promoted by a blade in an ensiled granular medium
Most powder processing operations set the particles in motion through an external energy input. Particle flow is quite well described for mechanical tests such as simple shear, inclined plane or rotating drum configurations [49,50]. The inertial and Froude numbers make it possible to classify the flow typologies, (i) into quasi-static, frictional and collisional regimes at the scale of grain-layer motion and (ii) into slumping, rolling, cascading, cataracting and centrifuging regimes at the reactor scale [49], respectively.
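For instance, the inertial number I = γ̇d/√(P/ρ) compares the grain rearrangement time under confining pressure P with the macroscopic shear time 1/γ̇, and delineates the quasi-static (roughly I < 1e-3), dense frictional and collisional regimes; the values below are illustrative.

```python
import numpy as np

def inertial_number(shear_rate, d, pressure, rho):
    """I = shear_rate * d / sqrt(P / rho) for grains of diameter d (m),
    density rho (kg/m3) under confining pressure P (Pa)."""
    return shear_rate * d * np.sqrt(rho / pressure)

# Millimetric grains under moderate confinement, slow stirring
print(inertial_number(shear_rate=1.0, d=1e-3, pressure=1000.0, rho=1500.0))
# ~1e-3: borderline between quasi-static and dense frictional flow
```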
These phenomenological descriptions, based on physical experiments, have however been less investigated in the case of particle flows in semi-confined geometries driven by a blade or an intruder [51,52]. These flows, commonly encountered in food engineering during elaboration in mixers, remain relatively uncontrolled. As a rule, humid agglomeration operations, and more generally food powder hydration, are still the limiting steps in granular food elaboration industries.
In our case, the experimental characterization of these products is achieved by joint measurement of (i) the force exerted on a blade by a laterally confined granular bed (Fig. 10a) and (ii) the stress distribution inside the bed [52]. The latter can be modeled by the Janssen model [53] if the granular bed is structured in successive layers, i.e. if the angle of repose of the granular medium is low enough [54]. Such an analysis highlights the existence of two characteristic lengths: λ_M, which corresponds to the lateral percolation length of the intergranular force network up to the cell walls, and λ_B, which corresponds to the screening of forces by the bottom of the cell (Fig. 10b). When the system is set in motion by a blade, particle image velocimetry (PIV) is used to deduce the associated flows (Fig. 10c). Close to the bottom of the cell and just above the blade, the mobility of the powder grains is reduced and homogeneous, owing to powder compaction and to void emergence just below the blade; under these conditions, the flow regime is almost quasi-static. In the central area, grain mobility is heterogeneous and spatially limited, and the flow regime is rather frictional [55]; there, grain mobility is characterized by localized grain micro-avalanches on both sides of the blade. Close to the bed surface, stress relaxation induces a better homogeneity of the flow, so this last zone is well suited for mixing. It is delimited by the characteristic length λ_M, which thus provides a rational tool for reactor design [53].
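The Janssen model mentioned above predicts that the vertical stress in an ensiled bed saturates with depth over a characteristic screening length, because the walls progressively carry the weight; a minimal sketch with illustrative parameter values:

```python
import numpy as np

def janssen_stress(z, rho=1500.0, g=9.81, D=0.1, mu=0.5, K=0.8):
    """Vertical stress at depth z (m) in a cylindrical cell of diameter D,
    wall friction coefficient mu and Janssen redirection coefficient K."""
    lam = D / (4 * mu * K)                    # screening length
    return rho * g * lam * (1 - np.exp(-z / lam))

z = np.linspace(0.0, 0.5, 100)
sigma = janssen_stress(z)   # saturates with depth instead of growing linearly
```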
Spatio-temporal correlation analysis of the PIV images, currently under way, will qualify particle mobility more precisely, and will probably also help design and drive mixing processes.

Humid agglomeration: example of couscous formation
The particle agglomeration process allows composite structures to be designed from assemblies of smaller individual elements that are brought into contact and stabilized by attractive interactions [56]. This type of elaboration process, called 'bottom-up', structures the matter from the smallest to the largest sizes. It is radically different from 'downsizing' processes, where the final product is obtained by fragmentation of larger dough pieces, either by erosion and breakage [57] or by local deagglomeration [58]. We studied the bottom-up route in the case of couscous formation by humid agglomeration of durum wheat semolina (Fig. 11). Whatever the mixing speed, the type of granular medium and the physicochemical properties of the binder [59,60], we observe that the agglomeration process corresponds to an expansion of the agglomerates, i.e. an increase of porosity. One of the outstanding features of powder agglomeration is the nucleation/growth process [61]. We showed that, in the case of a low-shear mixer, the nucleation/growth process follows a fractal structuration [59], where the association mechanism is mainly driven by the evolution of the hydrotextural state of the agglomerates [62]. This morphogenesis process highlights the link between the local arrangement of the phases and the emergence of the characteristic size of the product in this free-shape elaboration process.
We believe that better control of particle flow inside reactors and a unified framework of hydrotextural parameters for granular media make it possible to build operating diagrams that will help innovate or improve product elaboration with a high yield and a high degree of energy efficiency.

Interactive modeling for multiscale food systems
When dealing with meaningful representations of multiscale food systems, several important issues have to be considered: data can be plagued by uncertainty, particularly when chemical, physical, and biological phenomena concur to define the process; heterogeneity of available information is also likely, as a vegetable involved in a process can be characterized by more than 40,000 genes, whereas the quality of the final product can be assessed using just a few sensory features; qualitative and quantitative information, from expert perception of food quality, to nano-properties of ingredients, may also coexist in the same process. Consequently, effective models have to account for variance, manage heterogeneous data, and be able to include both qualitative and quantitative values.
Moreover, as gathering data on bioresources and bioproducts is an expensive and time-consuming process, available datasets are often sparse and incomplete, which poses a challenge to both human modeling practitioners and machine learning algorithms. In order to obtain reliable models, it thus becomes necessary to acquire additional information from external sources.
Experts in a specific domain can provide invaluable insight into products and processes, but this precious knowledge is often available only in the form of intuition and non-coded expertise. Including expert insight in a model is not a straightforward process, but it can effectively be tackled by having humans interacting with a machine learning process, through visualization or via specialists in encoding implicit domain knowledge [63]. In the following, three selected case studies portray different ways of combining machine learning with expert interaction.

Dynamic Bayesian network model for Camembert ripening
The apparent simplicity of everyday food processes often hides complex systems, where physical, chemical and living-organism processes co-exist and interact to create the final product. Cheese ripening is a major example of a process that human practitioners can achieve with success but for which several scientific details remain poorly understood. Nevertheless, even for these processes it is possible to create effective models by harnessing knowledge from experts in the domain and coupling it with experimental data by using an appropriate machine learning framework able to take into account such heterogeneous information. The work presented in [11] shows how the methodology described can be applied to the case of Camembert, a popular French cheese, creating a model that goes from micro-scale properties such as concentration of lactose and bacteria, to macro-scale properties such as color and consistency of the crust.
The approach used in this experiment is a dynamic Bayesian network (DBN) [64], a variation on a classical Bayesian network [65]. A DBN is a graph-based model of a joint multivariate probability distribution that captures properties of conditional independence between variables. In the graph, nodes X_i(t), i = 1, ..., N, represent random variables indexed by time t. Writing X(t) = (X_1(t), ..., X_N(t)) for the "slice" of all variables indexed by the same time t, and Pa(X_i)(t) for the parents of X_i(t), the DBN provides a compact representation of the joint probability distribution over a finite time interval [1, τ]:

P(X(1), ..., X(τ)) = ∏_{t=1}^{τ} ∏_{i=1}^{N} P(X_i(t) | Pa(X_i)(t)),

where P(X_i(t) | Pa(X_i)(t)) denotes the conditional probability function associated with the random variable X_i(t) given its parents Pa(X_i)(t).

Figure 12: Final DBN model for the Camembert cheese ripening process. The part denominated M1 represents the variables taken mainly from experimental data, whereas part M2 represents variables derived from expert knowledge and assessment. Grey nodes represent constraints defined by experts.

Figure 12 shows the resulting model, and Figure 13 presents an example of predictions of the dynamics of the process. From this example, it is noticeable how the model satisfyingly reproduces the dynamics of variables tied to microbial growth, substrate consumption and sensory properties, for different temperature conditions.
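A toy illustration of the factorization above, with two hypothetical binary variables and hand-set conditional probability tables (not the Camembert model itself): X1(t) depends on X1(t-1), and X2(t) on X1(t) and X2(t-1).

```python
import numpy as np

p_x1_init = np.array([0.7, 0.3])              # P(X1(1))
p_x2_init = np.array([0.6, 0.4])              # P(X2(1))
p_x1 = np.array([[0.9, 0.1], [0.2, 0.8]])     # P(X1(t) | X1(t-1))
p_x2 = np.array([[[0.8, 0.2], [0.5, 0.5]],    # P(X2(t) | X1(t), X2(t-1))
                 [[0.4, 0.6], [0.1, 0.9]]])

def trajectory_probability(x1, x2):
    """Probability of a trajectory as the product over slices and variables."""
    p = p_x1_init[x1[0]] * p_x2_init[x2[0]]
    for t in range(1, len(x1)):
        p *= p_x1[x1[t-1], x1[t]] * p_x2[x1[t], x2[t-1], x2[t]]
    return p

print(trajectory_probability(x1=[0, 0, 1], x2=[0, 1, 1]))
```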

Decision support system for grape maturity prediction
Predicting the right moment to harvest grapes intended for wine production is a task traditionally left to specialists in the field. Still, as the repercussions of climate change make local weather less predictable, experts can use machine learning techniques as a decision support tool to help them deal with modified conditions. As the human knowledge gained over years of wine production is invaluable, and often covers conditions that have not been measured in recent times, it is only sensible to include it as much as possible in the target framework. The work presented in [67] describes such a decision support system, combining machine learning with expert knowledge.

Interactive model learning with LIDeoGraM

The third case study concerns LIDeoGraM, an interactive modeling tool applied to a dataset describing the production and stabilization of bacteria, with measured variables ranging from acidification activity to viability [68]. From the vast number of possible dependencies between the measured variables, an automatic methodology can identify the most relevant ones and combine them to obtain a global model. However, for any single variable it is always possible to find a polynomial equation that perfectly fits the data points, for example an equation featuring as many parameters as there are data points available; such an equation could overfit the dataset, fail to represent the underlying relationship between the variables, and ultimately predict unseen data poorly.
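The overfitting issue can be demonstrated in a few lines on synthetic data: a polynomial with as many parameters as points fits the training data exactly but degrades badly on held-out data.

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0, 1, 10)
y = 2 * x + rng.normal(scale=0.1, size=10)     # noisy underlying linear law
x_new = np.linspace(0, 1, 50)
y_new = 2 * x_new                              # unseen data from the true law

for degree in [1, 9]:          # degree 9: 10 parameters for 10 points
    coeffs = np.polyfit(x, y, degree)          # may warn about conditioning
    train_err = np.abs(np.polyval(coeffs, x) - y).max()
    test_err = np.abs(np.polyval(coeffs, x_new) - y_new).max()
    print(f"degree {degree}: train error {train_err:.3f}, test error {test_err:.3f}")
# The degree-9 fit nails the training points but oscillates between them
```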
To avoid this issue, every variable in LIDeoGraM is associated with a set of candidate equations obtained through symbolic regression [69]. Eureqa [70], a commercial software package specialized in symbolic regression, is used to obtain a set of possible equations for every variable in a given dataset. LIDeoGraM then displays the resulting global model as a graph with one node per variable. This view enables users to rapidly assess which variables in the global model are poorly predicted, but also which ones may be responsible for the poor predictions of their dependent nodes. Some results obtained for the previously described dataset [68] are presented in Figure 15.
LIDeoGraM offers several ways to add expert knowledge. First, it is possible to attribute a category to each variable and to specify the allowed dependencies between categories for the symbolic regression; a category of nodes can represent a step in the process or a scale of information. After obtaining a set of equations for every node, experts can filter it by specifying that certain kinds of node-to-node dependencies are impossible. They can also manually add new equations to the set of candidate local models for a node, and finally restart the search for a global model once all their constraints are in place. With LIDeoGraM, it is thus possible to learn global models for the production and stabilization of bacteria. Such models can then be used to better understand how to preserve the quality of the culture during the process, to foster the emergence of new hypotheses, and to design new experiments whose data can in turn be used to further improve the global model.

Conclusion
This article introduced methods and case studies intended to give insight into the capabilities of modeling approaches in the context of bioresources and bioproducts. Choosing the appropriate method with respect to the length scale and question of interest is a crucial issue. For prediction and optimization purposes, machine learning methods can be very efficient: by combining experimental data and expert knowledge, they are able to account for a large number of factors affecting the processes and for a number of context-relevant length scales. In contrast, the objective of physical and mechanical modeling is to reveal the mechanisms underlying the evolution of biobased systems during processing at a given length scale; here again, several methods should be combined to span the range of relevant length scales.
The case studies presented show that modeling can usefully help not only to control processes and the features of the resulting materials, but also to propose novel paths for forming and processing materials. For instance, understanding the phase transitions in granular media or revealing the dynamics of structuring of polymeric materials can serve as levers for designing original biobased materials with controlled properties, and for inventing new ways of processing bioresources.

Highlights
• Modeling approaches help understand properties of bioresources and bioproducts.
• Modeling can account for multi-scale phenomena, heterogeneity and variability.
• Physical modeling gives access to collective properties of soft-matter and granular systems.
• Interactive machine-learning approaches can help model complex food systems.