Toward High Fidelity Materials Property Prediction from Multiscale Modeling and Simulation

The current approach to materials discovery and design remains dominated by experimental testing, frequently based on little more than trial and error. With the advent of ever more powerful computers, rapid, reliable, and reproducible computer simulations are beginning to represent a feasible alternative. As high performance computing reaches the exascale, exploiting the resources efficiently presents interesting challenges and opportunities. Multiscale modeling and simulation of materials are extremely promising candidates for exploiting these resources based on the assumption of a separation of scales in the architectures of nanomaterials. Examples of hierarchical and concurrent multiscale approaches are presented which benefit from the weak scaling of monolithic applications, thereby efficiently exploiting large scale computational resources. Several multiscale techniques, incorporating the electronic to the continuum scale, which can be applied to the efficient design of a range of nanocomposites, are discussed. Then the work on the development of a software toolkit designed to provide verification, validation, and uncertainty quantification to support actionable prediction from such calculations is discussed.


Introduction
The properties of any complex material are typically dependent on interactions and processes occurring across a large range of time and and length scales, from the chemical bonding at the scale of electronic structure to the macroscopic level stresses experienced during the material's use. Although these interactions essentially act across a continuum, it is conceptually useful to think in terms of a set of distinct scales, each capturing a DOI: 10.1002/adts.201900122 process that occurs at that characteristic length and time, and to consider the information that is passed "upward" or "downward" to adjacent scales. This approach is essential to make the computational prediction of a material's properties tractable, owing to the multiple orders of magnitude difference in the time and length scales of important processes that need to captured.
Experimentally, the design and manufacture of advanced multi-functional materials remain a slow, uncertain, expensive, and time-consuming process. [1,2] It can take 20 or more years to move a material from initial discovery to the market; [3] by offering an opportunity to access a wide range of materials configurations rapidly, computational models offer a serious alternative to the traditional experimental trial-and-error process currently used in industry [4,5] and will surely increase in importance as computational power increases.
Indeed, with the increasing prevalence of ever larger and more powerful computational facilities across the globe, computational materials research stands to make significant advances through exploitation of High Performance Computing (HPC). However, this increase in computational power is not coming primarily from faster processors but rather from more efficient computer infrastructures capable of coordinating a larger number of these processors, in a parallel or distributed fashion. Choosing and developing algorithms with optimal scaling capabilities on such infrastructures is key to their efficient exploitation, but a monolithic approach will sooner or later become impractical, either limited by inter-core communication costs (in the case of strong scaling, with a fixed problem size) or unreachable characteristic time scales (in the case of weak scaling, with a proportionally increasing problem size). [6] In this Progress Report, we describe procedures to exploit these advances in computational power to predict the properties of materials. In particular, we focus on multiscale workflows consisting of several computational solvers, each capturing a different aspect of the material, and what information must be passed to couple these solvers together. Such a workflow may be used to implement a multiphysics application, in which each model simulates a different physical process occurring within the system [7,8] (e.g., thermomechanics and fluid-structure interaction), or a domain decomposition, [9][10][11] in which the system is split into individual spatial subdomains which interact continuously with their neighboring subdomains. In both approaches, different models or subdomains may be run on different computational resources or on different parts of an emerging exascale machine, providing that the communication with other models does not become a bottleneck.
In the sense we use the term here, multiscale simulations refer to multi-model workflows, [12][13][14] for which we exploit separation of time and spatial scales to allow the dynamics of different processes to be simulated largely independently, with heavily reduced communication between models. For example, we may exploit the fact that the characteristic time and length of processes in the nanostructure and macrostructure of a given material are sufficiently different that their respective dynamics may be simulated separately.
For hierarchical multiscale approaches in the simplest multiscale workflows, the scales are executed sequentially, each using the output from the previous scale as input to the current one. For example, simulations at the lower scale model uses a single initial configuration (boundary condition, microstructure) from the upper scale, [15] or alternatively, the upper scale model uses initial parameters computed at the lower scale. We term this coupling topology "acyclic." On the other hand, "cyclic" coupling requires a continuous back and forth exchange of information between the scales, frequently referred to as a concurrent multiscale approach. [16][17][18] For example, at each cycle, a simulation of the lower scale model is performed for each unique configuration found at the upper scale, increasing significantly the demand for independent simulations of the lower scale model. In this case, we term the coupling topology as "cyclic" with "dynamic instances".
Our interest lies in the use of multiscale modeling to determine how the micro-mechanical behavior of constituent materials determines the properties of the overall material, particularly for nano-composites. A nano-composite is defined by IUPAC as a multiphase solid material where one of the phases has one, two, or three dimensions measuring less than 100 nm; this phase is commonly referred to as a "nano-particle." Nanocomposites differ from conventional composite materials due to the exceptionally high surface to volume ratio of the reinforcing phase and/or its exceptionally high aspect ratio. As a result of the large interfacial area, nanoparticles can provide significant reinforcement through the addition of small volume fractions (typically less than 5%). [19,20] However, if the extensive surface area of the nanoparticle is not exposed, for example, due to aggregation, the reinforcement effect is diminished. The assembly and dispersion of the nanoparticles is controlled by their shape, interfacial behavior, and the properties of the surrounding matrix. Careful tuning of these factors may produce nanocomposites that facilitate dispersion, forming materials with improved properties, such as being lightweight yet strong. There is a large market for such materials; for example, advanced aircraft in service today use composite structures rather than metal: the Boeing 787 Dreamliner is 80% composite by volume and each 787 aircraft contains approximately 35 metric tons of carbon fibre reinforced plastic. Further research into nanocomposites could produce even greater weight savings. O'Donnell et al. conducted a mass-analysis on a notional carbon-nanotube reinforced polymer-structured aircraft. [21] Each airframe modeled saw a 17.32% weight reduction and average

Maxime Vassaux is a Research
Associate at the Centre for Computational Science at UCL developing multiscale simulation methods applicable in material science. His background in computational mechanics helped him tackle problems in a broad range of domains from stem cell mechanics in Aix-Marseille University to concrete nanocomposites fracture at Ecole Normale Superieure de Cachan. He now focuses on an intersection of these different fields designing new biomaterials for improved tissue engineering. fuel savings of over 10%. There is a clear potential here for computational modeling to discover new composite materials, but effective integration within existing design processes will require computational models to be certified via appropriate validation and verification processes, and their predictions rendered actionable via uncertainty quantification. There are a number of challenges with multiscale simulations. The coupling between single-scale codes is not always straightforward and can require complex transformations to generate usable information for other scales in the workflow. These scalebridging transformations often contain assumptions that are not valid in every instance, limiting the transferability or applicability of the overall workflow. Similarly, the assumption of scale separation is not always valid; for example, when nanoparticles have one very small dimension and one very large dimension, and both cannot be accurately captured in the same single-scale model. When information is exchanged between models uncertainties can be propagated which may affect the accuracy or stability of the multiscale simulation. These challenges may be domainspecific, scale-specific, or reflective of information exchange of multi-models workflows in general. Again, rigorous analysis is required to understand errors in multiscale simulations and is discussed in this Progress Report.

Robert C. Sinclair
Our research in computational materials modeling of nano-materials is built upon three multiscale approaches (see Figure 1). We present each approach and a related application in the following sections. First, the interactions of graphene with itself are determined by bridging from quantum mechanics, via density functional theory, to all-atom molecular dynamics (MD). Second, large-scale exfoliation dynamics of clay in polyethylene nanocomposites is analyzed through bridging from all-atom to coarse-grained MD. Finally, the mechanical reinforcement of epoxy resins using graphene sheets is explored via bridging from all-atom MD to continuum mechanics. The two first applications employ acyclic, hierarchical multiscale approaches, while the third employs a cyclic, concurrent multiscale approach. Figure 1 illustrates the coupling topologies between the single-scale models and their associated length-and time-scales. In each case, we describe the domain-specific challenges in coupling the two levels of resolution, both theoretical and practical, and opportunities resulting from extending modeling of these materials to larger scales, without encountering prohibitive computational cost. We conclude with a discussion on validation, verification and uncertainty quantification of such multiscale applications, and how this can efficiently exploit emerging massively parallel computing resources.

From Density Functional Theory to All-Atom Molecular Dynamics
MD simulations require forcefields to describe the interactions between particles so as to capture the desired physics. The functional form, and associated parameters that describe all the interactions including bonded and non-bonded interactions are collectively referred to as the forcefield. To validate a forcefield, researchers would like to refer to experimental values; but experimentally accessible values (such as density, solvation energy and diffusion coefficients) are, in almost all cases, a function of several forcefield parameters. This means there are often degenerate sets of parameters that fit the experimental data equally well. As a result, fitting a forcefield using only experimental data is very challenging. It is therefore common to rely heavily on theories at lower scales to design and fit the parameters in the functional form of a forcefield. Where possible, ab initio techniques are used to provide this data. In the following section of this progress report we show that, by using information at the electronic structure level to overcome deficiencies in traditional forcefields, we can parameterise additional forcefield terms that substantially alter behavior on much larger scales, more closely matching behavior seen in experiment.
Lennard-Jones (LJ) potentials are ubiquitous in MD forcefields, where they describe the dispersion forces and hardcore repulsion between non-bonded particles. This type of potential can fail to capture the behavior correctly, for example in describing graphene sheets during exfoliation. [22] The spherical symmetry of typical dispersion potentials means that for flat, regular structures like graphene, there is very little variation in the interaction potential when two sheets slide over each other. This results in negligible friction between sheets. Experimental observations have shown this to be incorrect; for example, for a 10 nm graphene sheet sliding around on a graphite substrate with only thermal energy at 5 K. [23] Simple Lennard-Jones forcefields are therefore inappropriate for simulations involving graphene friction, exfoliation, intercalation, and aggregation, which depend on sheets sliding past each other. This has not prevented their widespread use in the literature. [24][25][26] We conducted a review of the data that could be used to parameterise a forcefield for graphene-graphene interactions. [22] Many experiments have reported the absorption energy of graphene, but their values vary wildly: from −20 to −66 meV per atom, [27][28][29] and as such cannot be used as a starting point for forcefield parameterization. Density functional theory (DFT) may be used instead. We used two functionals recommended by Wang et al. [30] that gave values of −50.8 and 61.7 meV per atom for the adsorption energy of graphene. The choice of DFT functional is not a trivial one; as shown by Wang et al. [30] some common functionals predict next to no adsorption energy for graphene sheets.
While optimizing the parameters of Lennard-Jones forcefields may result in a satisfactory adsorption energy, such forcefields fail to correctly describe dynamics. When propelled in any given direction, graphene sheets can slide in a zigzag fashion so that they only encounter an energy barrier of 0.8 meV per atom. To Figure 2. A representative trajectory from a large of ensemble of simulations of a 10 nm graphene flake on a graphite surface after being pushed out of a commensurate position. In this instance, the trajectory lasted 500 ps until the flake was stationary. The flake travels from left to right, with color corresponding to time (red at the start, blue at the end). A snapshot of the flake is shown every 50 ps. The flake slides and rotates freely when unaligned with the surface lattice and is only deflected when it is aligned. Reproduced under the terms of the CC-BY 4.0 licence. [22] Copyright 2018, Wiley-VCH.
test the existing forcefields, we compared with the energy barrier to sliding given by Gao and Tkatchenko. [31] This is a small but significant energy barrier; many of the tested MD forcefields, however, have barriers which are far lower, accounting for the lack of friction when simulated graphene sheets slide over each other. In short, LJ type functions either overestimate the adsorption energy (such as with AMBER [32] ) or underestimate the friction felt between sheets (OPLS [33] and COMPASS [34] ). We designed a forcefield, GraFF, [22] to address this problem. GraFF is a three-body potential that adds an energy penalty to atoms in adjacent sheets that are not directly overlapping (1): where r C 1 C 2 is the distance between graphitic carbon atoms, is the angle between C 2 , C 1 , and another carbon atom bonded to C 1 . V 12−6 LJ (r) is a standard LJ potential: Fitting forcefields to such static calculations is insufficient; they must also be validated based on dynamical studies. A study by Feng et al. [23] showed graphene flakes sliding large distances on a graphite substrate after being given only a small impetus by a scanning tunneling microscopy (STM) tip, a behavior known as suberlubricity. Mimicking this with MD (see Figure 2), using GraFF to describe the graphene-graphene interactions, recreated the same observations seen by Feng et al. [23] where the preexisting LJ forcefields failed. STM can not resolve the motion of flakes after being pushed, whereas using MD, we were able to describe their motion and account for the mechanisms of friction on the nanoscale.
Propelling flakes in this way is a chaotic process in the technical sense and ensembles of 40 replicas were used to get reliable statistics on the process and compare with Feng's experiment. Our results compare very well with those in experiment, recreating the distance the flake traveled and its temperature dependence. The effective friction was caused by fleeting alignments between the flake and the substrate lattice.
GraFF now permits the simulation of graphene in scenarios that were previously inaccessible with MD. Exfoliating graphene on a large scale still poses a huge barrier to the production of polymer nanocomposites. We expect simulation to provide insight into the synthesis and production process. Simulating micromechanical exfoliation of graphene nano-flakes with MD has shown that there is much to learn about even the simple routes to graphene production.
To study the material reinforcement properties of graphene, as well as its dispersion and aggregation, methods capable of dealing with longer time and length scales are necessary. In the next section, we illustrate how this is possible with multiscale simulation, using the example of the dispersion of 2D clay mineral nano-flakes.

From All-Atom to Coarse-Grained Molecular Dynamics
In this section, we look at recent progress in simulating the behavior of nano-composites through coupling all-atom (AA) molecular dynamics to coarse-grained (CG) MD for 2D nanoflakes. CG-MD involves performing molecular dynamics with a reduced representation of the atomistic system: a single bead (or pseudoatom) often represents many atoms. This reduction in the degrees of freedom can lead to computational speed-ups of 10-100 times over AA-MD, [35] while retaining chemical specificity. We will also discuss the challenges in coupling AA and CG levels of representations and, finally, the opportunities for increasing the accuracy and applicability of AA/CG coupled simulations.
In a nanocomposite material, the surface area to volume ratio is higher for 2D layered nanoparticles compared to 1D structures such as carbon nanotubes. As a result, there has been much interest in producing composites containing highly dispersed nanoscaled sheets, such as graphene (the sheets that form carbon nanotubes) and naturally occurring minerals such as clays. The degree of aggregation of the sheets will be a function of the thermodynamics of the system, while processing conditions often cause the system to remain in kinetically trapped meta-stable states. A multiscale combination of AA and CG-MD can simulate the thermodynamic tendency of the nanoparticles, such as graphene and clays, to aggregate or disperse, and hence can be used to make predictions of the performance of nanocomposites.
One area of our research has focused on the use of atomistic MD to study clay mineral surfaces interacting with synthetic thermoplastic polymers and biomolecules. The length and timescales of AA-MD are ideal for studying the behavior of polymers on the surface of clay. We have used AA-MD to address the rational design of clay-swelling inhibitors for use in the oil industry, [5] which in turn led to the development of new biodegradable inhibitors. We also gained insight into the structure, conformation and stability of nucleic acids when interacting with clay surfaces (for applications in drug delivery and origins of life research). [36][37][38] While AA-MD is now a routine tool for understanding the interfacial behavior of clay surfaces, it soon became clear that to simulate the large-scale behavior of clay sheets, including their aggregation or dispersion within thermoplastic polymers, we needed to extend our simulations to much larger time and length scales. Clay sheets typically have lateral dimensions of 10 to 1000 nm, well beyond the practical limits of AA-MD. To that end, we now routinely use multiscale methods to model chemically specific combinations of clay and polymers, allowing predictions of the materials properties of clay-polymer nanocomposites for clay platelet sizes approaching those found in nature (diameter > 10 nm) at low clay volume fractions (5%), through coupling AA-MD with CG-MD.
Clays, such as montmorillonite (MMT), consist of nanometer scale sheets of magnesium aluminium silicate (see Figure 3). The sheets are stacked into platelets (often named tactoids) composed from a few to hundreds of individual sheets. The sheets possess dimensions of 1 nm thickness and range from 10 to 500 nm in diameter, which results in a very high aspect ratio. On the level of an individual platelet, the clay layers are very stiff: transfer of stress from the thermoplastic polymer to the mineral-in part due to the large interfacial area between the clay and polymerresults in increased mechanical strength, while the flexibility of the polymer prevents the composite from being brittle. [20] Our multiscale AA/CG-MD approach has enabled us to predict the melt intercalation behavior and final morphologies of MMT clay-polyvinyl-alcohol (PVA) and MMT clay-polyethylene-glycol (PEG) systems. [39] The CG representation of the clay-polymer system consists of CG pseudo-atoms, defined as the center of mass of 7-10 atoms of the polymer and single unit cells of the clay framework (with different pseudo-atoms types representing charged, non-charged and edge clay sites), see Figure 3. The CG interaction parameters were generated by matching to structural properties of the AA-MD system, calculated from numerous small AA-MD simulations. These properties included radial distribution functions (RDF) or density profiles perpendicular to the clay sheet (using Iterative Boltzmann Inversion IBI, discussed below) and through potential of mean force (PMF) calculations when an unbiased AA-CG simulation did not sample enough configuration space to define a potential using IBI. To examine intercalation and subsequent exfoliation of the clay by the surrounding polymer, our initial system consisted of eight tactoids of four sheets, of approximate diameter 10 nm, dispersed in a polymer melt simulation cell of 40.2 nm × 40.2 nm × 24.0 nm (see Figure 3). The CG system was run at elevated temperatures (500 K) and pressure (100 atms) of melt-processing. See ref. [39] for the technical details of the simulations. The longer timescales of CG-MD allowed us to observe the dynamical process of polymer intercalation into pristine clay tactoids over timescales above 100 ns and the ensuing aggregation of polymer-entangled tactoids into larger structures (over timescales of 500 ns). We concluded that the intercalation of molten polymers occurs only when there are favorable interactions with the clay surface (in the case of PVA, hydrogen bonding to the hydrophilic clay surface), otherwise the tactoids remain un-intercalated and self-assemble into larger, unintercalated tactoids.
Furthermore, through these large-scale CG simulations, we can probe the elastic properties of the self assembled claypolymer system. Suter et al. [39] examined the changes in elastic response to strain (i.e., the Young's modulus) for the intercalated PVA-clay composite and the un-intercalated PEG-clay composites. Coarse-grained interaction potentials generated through structural matching are not expected to reproduce elastic properties, so the resulting stress-strain plots are qualitative, but they clearly show that the stress transfer from polymer to the much stiffer clay is significantly less effective for unintercalated clay-platelets. We found that the in-plane Young's modulus increased from 0.372 GPa for neat polymer to 0.637 GPa for the intercalated PVA-clay system, but only increased The initial structure used in the uniaxial tension and compression calculations for the clay-PVA system displayed; the x-direction is along the horizontal axis, and z-direction along the vertical axis. Note that the platelets are mainly oriented in the xy plane. b) The corresponding stress-strain curves in the x (black)-and z-directions (cyan) for the system shown in (a). The neat polymer stress-strain curve is also shown (red). c) The initial structure for the uniaxial tension and compression calculations for the clay-PEG nanocomposite, also shown along the x-direction (horizontal axis) and the z-direction (vertical axis). d) The stress-strain curves for the PEG-clay nanocomposite shown in (c). Reproduced under the terms of the CC-BY 4.0 licence. [39] Copyright 2015, Wiley-VCH. from 0.127 to 0.137 GPa for non-intercalated PEG-clay (see Figure 4).
These CG simulations showed that, while controlling the dispersion of clay nanoparticles into polymer matrices is a challenge, it is required to achieve the pronounced property improvements theoretically promised by polymer nanocomposites. Favorable interactions between the clay surface and polymer were necessary to induce intercalation; for organophilic polymers the nanoparticles are typically immiscible with the organic phase and materials properties are much reduced as the clay sheets do not disperse. Experimentally, one approach to overcoming this difficulty is to modify the clay surface by grafting it with organophilic chains. This increases the attraction of the clay surface to the polymers, and pillars open the clay layers to allow access to the clay surface. [20] In our recent research, using CG-MD, we showed that by increasing the density of the surfactants on the clay surface, it was possible to make the dispersed morphology the most stable, and hence increase the final properties of the composite material. [40] The organophilic surfactants are based on a PEG backbone, with a quaternary ammonium head-group. Before mixing with polymers, the surfactants increase the separation of the clay layers to approximately 2.0-2.8 nm (depend-ing on the density of surfactants), and when mixed with PEG polymers-which did not intercalate with the pristine clay, as we showed in Suter et al. [39] -they fully extend and allow intercalation. With sufficient surfactant density, the intercalated PEG-clay system, now at a clay-layer separation of approximately 3.2 nm, eventually exfoliates due to thermal motion (see Figure 5 for a snapshot from simulation).
However, when clay nano-particles are mixed with highly hydrophobic polymers, such as polyethylene, the clay-tactoids remain stubbornly aggregated despite surfactant treatment. To overcome these residual adhesive forces, large processing forces are often used to break apart the clay platelets (through shear, for example). We have used coarse-grained simulations of polymerclay systems under shear forces to understand the mechanisms involved in breaking these adhesive forces. [41] To compute the stress required to exfoliate a clay sheet from a tactoid, we constructed a two-layer clay system, with clay lateral dimensions of 14 nm × 10 nm, immersed in a simulation box of molten polymers of size 20 nm × 20 nm × 20 nm. The CG simulations consisted of a variety of hydrophobic and hydrophilic polymers and surfactants, with simulation containing approximately 77 000 CG atoms, equivalent to 554 000 atoms. We found that, with The PEG polymer and surfactant molecules have been removed to aid visualization. The high-density surfactant system has fully dispersed. Reproduced under the terms of the CC-BY licence. [40] Copyright 2015, American Chemical Society.
hydrophobic polymer (PE) and chemically identical surfactants, there is no thermodynamic driving force for the polymer to intercalate, and the un-intercalated tactoid requires large shear stresses to overcome the adhesive force generated by the interacting surfactant molecules from both surfaces. For hydrophilic polymer and surfactant, the clay-layers expand and intercalation is promoted. The application of shear force can exfoliate the clay layer with less shear force required than for the un-intercalated clay-tactoid. This shear force is, however, larger than for noninteracting clay sheets. These simulations therefore demonstrate that, while surfactants promote the initial intercalation of polymer into the clay gallery, it is their interactions that also resist the shear-induced exfoliation.

Challenges and Opportunities
The investigations described above are only accessible due to the longer time and length scales possible with CG-MD compared to AA-MD. They have allowed us to probe mechanisms on micrometer length scales with resolution on Å length scales. To enable these large and accurate CG-MD production simulations using a AA/CG multiscale approach, we mapped the lower to the higher level of description in a systematic manner using the IBI method; however, depending on the system or property under consideration, these methods will not always reproduce the behavior of the underlying atomistic system. In the following, we describe the challenges in applying such AA/CG multiscale approach routinely; we have categorized these difficulties as a "representability problem," a "transferability problem" and an "error propagation problem." Hitherto, primacy has been placed on structural features, meaning that particular molecular properties are preserved in moving between the AA and CG descriptions, based on the Iterative Boltzmann Inversion (IBI) procedure, or similar iterative approaches. [42,43] In IBI, a numerical coarse-grained pairwise potential U(r, i) is iteratively refined until a target structure is reproduced within a predefined error: Here U(r, i + 1) is the new CG potential, U(r, i) the potential used in the previous iteration i, k B the Boltzmann constant, T the temperature, g(r, i) the RDF obtained from the CG run and g (r) the atomistic target RDF. This ensures that the free energy of the reduced variables is matched. However, while the IBI procedure can reproduce structural details at the molecular level, still unknown is the extent to which we can also match to response functions such as heat capacity, thermal expansion and isothermal compressibility, which are related to equilibrium fluctuations (the so-called "representability problem"). These functions are not expected to be well represented in a CG model due to the smoothing of the potential energy surface. However, thermodynamic properties and non-equilibrium properties are required to fully exploit the predictive capability of AA/CG multiscale simulation. As the whole system (thermodynamic and structural) is effectively included when performing IBI-as we are matching the free energy-the pair potentials optimized at one set of conditions will not generally be transferable to another (the "transferability problem"). [44] We have discussed above the use of CG-MD to determine materials properties as a function of nano-particle dispersion; however, it should also be noted that CG-MD can be used to examine many other thermodynamic properties of nano-composite materials, with varying degrees of accuracy. The glass transition temperature (T g ) and thermal expansion coefficient of a polymer bulk or polymer nano-composite can be estimated from the variation of inverse density (specific volume) with temperature. The specific volume is linear with decreasing temperature, until a change in the slope at T g (due to the change in the thermal expansion coefficient between the glass and rubber phases). To use CG-MD to determine T g , therefore, requires the CG-MD potentials to be transferable between different temperatures. An example illustrating how the "transferability problem" can affect these properties is shown by Carbone et al., [45] where they found that T g and thermal expansion coefficient of a CG-MD model of polyamide-6,6 agreed with experiment, but for a CG-MD polystyrene model with interaction parameters computed at 500 K, the dependence of density on temperature was unphysical. This was attributed to the phenyl ring reorientation that assumes more importance as the temperature decreases, and is not adequately captured by CG-MD parameterization at high temperature. For further discussion on the transferability of CG-MD potentials for polystyrene, see the review by Karimi-Varzaneh. [46] One avenue to improve AA/CG coupling is to use machinelearning techniques to perform general mappings, computing the landscape of possible properties at the lower resolution level by varying the features of interest. [47][48][49] Multi-objective cost functions are being investigated to investigate whether we can locate the optimal mapping. For example, Moore et al. [50] have devised a Multistate IBI method, where multiple thermodynamic states are used in the cost function.
Furthermore, when we coarse-grain, we eliminate significant components of entropy and dissipation. Where pair potentials cannot be optimized to give good enough agreement, it is possible to add frictional terms to perform Langevin dynamics, in an attempt to rescale the dynamics, but doing so in a consistent way remains a challenge. [51] Coarse-grained potentials are no more accurate than the underlying atomistic simulations they are matched to (the "error propagation problem"). For example, forcefields for organic molecules and mineral solids have to be combined, normally through heuristic combinations such as the Lorentz-Berthelot rules for Lennard-Jones parameters ( ij = ( ii + jj )∕2 and ij = ( ii jj )∕2), for atoms i and j. This is somewhat unsatisfactory as the chemical peculiarities of the non-bonded interactions such as the specific mutual polarization of atoms i and j are not fully taken into account. The individual forcefields have been parameterized to reproduce the properties of each phase individually, so there is no advantage to altering these. One approach to overcoming this difficulty could be to define directly the non-bonded interaction parameters between atoms of type i and j (where i is in the organic phase and j is in the mineral solid) through matching to high-quality adsorption energies, calculated from quantum mechanical simulations (such as DFT) [52] that include recently developed empirical van der Waals dispersion terms. Of course, such accuracy comes at considerable computational cost, as it requires numerous high accuracy DFT simulations. In the case of clay-polymer interactions, it would require the calculation of the adsorption energies of numerous small representative organic fragment molecules corresponding to parts of the polymer, on the clay mineral surface. The increase in accuracy may not be sufficient for the increase in time and computational cost required for the DFT simulations; it is an area of research to investigate where traditional combination rules for non-bonded interactions fail.
One of the reasons atomistic forcefield approaches in AA molecular dynamics have been successful is that, in general, the combination rules are flexible. Once we have parameterized for representative chemical environments (known as an "atom type"), these parameterizations can be reused for a huge variety of environments. If we want to use high throughput computing for coarse-grained simulation, that is, running many hundreds or thousands of automated simulations of differing chemical composition, we would need such an approach. However, we need to overcome one of the "transferability problems" described above. Along with the thermodynamic state-point dependence mentioned earlier, ideally, one would also increase the chemical transferability; that is, whether a coarse-grained potential defined for a fragment in one molecule is appropriate for the same fragment in a different chemical environment. For example, a widely used generic coarse-grained forcefield in biomolecular molecular dynamics simulations is MARTINI, [53] but it still lacks significant accuracy for quantitative purposes. There is ongoing research to develop a generic forcefield for mineral-organic interfaces, where the adsorption properties of molecules is of significant interest. One avenue to achieve this will be to use feature extraction methods to determine a (unbiased) combination rule to reconstruct an accurate interaction potential within a tolerance level. However, we have to sacrifice accuracy (i.e., parameterization for each new interaction) for speed (using a general forcefield and combination rules). It is the focus of sensitivity analysis efforts to determine whether a generic forcefield can be accurate for properties of interest.
Finally, there is considerable interest in "reverse mapping" using AA/CG coupling from CG to AA. [54] However, it is not straightforward to reintroduce the previously eliminated degrees of freedom. Methods which have been considered include using restraints and least-squares fitting to remove high energy over-laps when recreating the atomistic system, and invoking libraries of conformations to generate atomistic structures (for example, see Ghanbari et al. [55] ). It is an active area of research to generate these reverse-mapped structures quickly and efficiently, such that it would be possible to move from the CG level of representation to AA and back again. This would allow "on-the-fly" parameterization and refinement in a single, concurrently coupled simulation.
In summary, we have shown that the longer time and length scales of multiscale AA/CG-MD allow the study of phenomena, such as the aggregation and dispersion of clay nano-particles in a polymer medium, that are inaccessible through atomistic simulation alone. Parameterization is required for each coarse-grained system, as transferring coarse-grained interaction potentials to different thermodynamic states or chemical environments cannot be relied upon to produce accurate results. Similarly, coarse-grained potentials matched to certain characteristics (such as structural properties) cannot be guaranteed to quantitatively match other properties (such as thermodynamic response functions). Addressing these issues is an active area of research, and we have outlined some new and novel approaches. Finally, we have shown how error propagation caused by coupling AA and CG-MD is very important and must be addressed; this is the focus of Section 5.

Understanding the Interaction between Graphene and Epoxy Resins
Polymer thermosets, such as epoxy resins, are an attractive candidate for building nanocomposites from graphene. Exhibiting apparent ductility in nanoscale simulations, [56,57] epoxy resins consistently exhibit a brittle response in engineering tensile testing. [58][59][60] In the transition from the atomistic scale to the bulk, epoxy resins lose their strain hardening capabilities and much of their resilience to large strains. Increased fracture toughness at smaller scales is a common phenomenon, [61,62] characterized by the Griffith's criterion, [63] which states that larger samples will statistically contain larger defects, or weaker links, and consequently, smaller samples usually exhibit higher fracture strengths than larger ones. Observing the transition between these regimes with simulation will require techniques that bridge multiple length scales.
The introduction of 2D nanoparticles (such as graphene) in these crosslinked polymer networks is expected to limit the predominant consequences of void growth, bringing about crack pinning, deflection, and dispersion. [64,65] In other words, 2D nanoparticles are expected to impede strain localization mechanisms.
Graphene is generally mixed with the polymer precursor (e.g., DGEBA, TGMDA) before curing, in proportions varying from low 0.02% [66] to high 6% weight ratios. [60,67,68] The Young's modulus is systematically increased, which is seen as a consequence of the nanoscale constraints, imposed by the relatively rigid inclusions. In light of these mechanisms, some authors believe the large surface area of the 2D particles is not so much the deciding www.advancedsciencenews.com www.advtheorysimul.com factor [65] as the degree of spontaneous curling [69] (seen in flakes longer than 10 nm).
The enhancement of fracture related properties is less clear. Fracture toughness of the epoxy resin is initially improved, but may then reach a plateau or even decrease at weight ratios above 1-2%. [60,66,67] Two different, potentially complementary theories have been formulated in this regard: crack deflection [60,67] and microcrack dispersion. [66] The former relates to a lengthened main crack path, whereas the latter favors a more diffuse cracking pattern. The dispersed microcracking explains the observed plateau, for which the density of microcracks actually facilitates the propagation of the main crack. Consistent observations have been made with other 2D nanoparticles such as clay [70] or silica. [71] The crack deflection theory does not predict significant strength alteration since, at the time of propagation, the strain associated with the peak stress is already overcome. Conversely, with an increased dispersion of the microcracking the material's load bearing capability should decrease before attaining the peak stress. Experimentally, some studies observe increased peak stresses, [68] inconclusive variations, [66] or even weakening of the nanocomposites. [67] The observations of weakening support the dispersed microcracking theory.
Graphene-based enhancement of the intermediate mechanical properties of epoxy resins, such as yield and hardening, are rarely addressed experimentally. The response in a dogbone sample of epoxy resin, subject to tensile loading, instantly transitions from elastic to brittle fracture. The intermediate properties are therefore not observable through standard engineering testing procedures. However, yield and hardening are highly relevant in the service use of structural components. They express the materials aptitude to store and return strain energy, under cyclic loading, while remaining functional.
Atomistic simulations of the nanocomposite revealed little difference in the responses between the neat and the grapheneenhanced epoxy resins under uniaxial tension. [72] The Young's modulus, Poisson ratio, yield stress, hardening, and strength show surprising similarity in the two cases. Our atomistic simulations led to the conclusion that a graphene nanoparticle embedded in a crosslinked polymer network acts mostly as a defect.

Optimized Heterogeneous Multiscale Method for Time-Dependent Materials
Loosely coupled hierarchical schemes are a common trend in multiscale modeling. The continuum-level constitutive equations are parametrized beforehand using lower-level molecular or coarse-grained simulations. [73][74][75][76] However, in trying to reproduce the wide variety of lower-scale mechanisms occurring, these empirically formulated constitutive equations suffer vast increases in complexity, becoming ill-defined and difficult to integrate. [77,78] Consequently, constitutive equations are reduced massively, to a simplified description of the true behavior of the material, potentially missing out important phenomena that appear due to non-trivial mechanical states at the microscopic level.
As a (partial) remedy to these problems, computational schemes have been developed that seamlessly couple multiple scales in a semi-concurrent fashion, implemented through the heterogeneous multiscale method (HMM). [79][80][81] A similar scheme built on two finite element models is the finite element squared (FE2) method. [82,83] The scheme has been classified as semi-concurrent [15] because the multiple models are run in parallel but solved separately, in contrast to concurrent methods which embed a discrete and a continuum description within a single domain. [84,85] HMM and other semi-concurrent methodologies have the benefit of being extremely scalable, since the lowerscale models are independent of each other and can therefore be simulated in parallel (and across separate compute resources, if requested).
The proposed workflow [86] computes the dynamic equilibrium of mechanical forces in a continuum structure using the finite element method (FEM). In a classical FEM approach the local constitutive relation between stresses and strains is a series of phenomenological mathematical equations, but in the present case it is replaced by an MD simulation Figures 6 and 7 respectively show a illustrative and flowcart representation of the workflow. An MD simulation, with nanoscale detail of the material structure, is performed, whenever the stress resulting from an applied strain history is required. In a nutshell, the application consists of a macroscopic FEM model which synchronizes the simulation of a large number of microscopic MD models, and is performed iteratively as time advances.
Implementations of HMM bridging such scales [87][88][89][90] are rare and have to date been limited to elastic mechanics. In an attempt to analyze or predict mechanical properties related to failure, fatigue, energy dissipation (and many other situations), capturing these inelastic mechanisms is of the utmost importance. In our recent work, we introduced modifications to the standard HMM implementation to capture history dependent mechanical behavior, including non-linear and irreversible mechanisms. [86] In turn, our approach is able to track the evolution of the structure at the atomistic scale associated with each continuum location. Although, the initial atomistic systems are identical, they diverge rapidly due to different mechanical loading history. The proposed workflow handles the large amount of data www.advancedsciencenews.com www.advtheorysimul.com and computations induced by time-dependency of the material structure.
As versatile as the HMM workflow may be, it remains constrained by load balancing issues. In a naive scheduling approach, the total allocation is split in equal suballocations to which the microscopic simulations are assigned, a priori. The unknown and variable execution time of the microscopic simulations leads to certain suballocations to complete ahead of others. We propose to address this using two mechanisms: i) a PilotJob Manager (PJM) internal scheduler, and ii) optimization of suballocation size. These have been developed as part of a European initiative to optimize the computational efficiency of multiscale computing application called ComPat (Computing Patterns for High Performance Multiscale Computing, http: //www.compat-project.eu/). The benefits to the speedup of an HMM simulation by these two mechanisms are displayed in Figure 9.
The PJM mechanism essentially consists [91] of an internal job scheduler for the large allocation provisioned for the whole set of micromodel simulations of a given iteration. The execution order of the jobs is specified in a First In First Out (FIFO) manner. The main advantage of the PJM is that all the jobs are gathered in a single queue, allowing execution on any subset of cores from the large allocation, as soon as the required resource size is available. Moreover, the size of this suballocation can be easily specified independently for each job, which helps when size adjustment is needed to reduce execution time disparities.
In order to reduce the variability of the micromodel simulations execution time, the suballocation size can be adjusted individually. Acknowledging that micromodel simulations ought to scale strongly up to certain number of cores, this can be achieved without any loss of efficiency, as long as the suballocation size remain in the strong scaling domain. The upper limit of the strong scaling domain can easily be attained from a scaling plot, our MD simulations using the LAMMPS modeling package, [10] since about 40 000 atoms typically scale strongly up to 200 cores. Subsequently, a simulations allocation size can be rescaled proportionally to their estimated execution time. The longest simulations are performed with the number of cores associated with the upper boundary of the strong scaling domain, while the shortest are performed with the lower boundary, that is, on one core.

Simulation of Polymer Thermosets with Graphene Nanoparticles
We performed heterogeneous multiscale simulations of a 150 mm × 100 mm × 5 mm shell of the novel nanocomposite transiently oscillating following a ballistic (see Figure 10). Our approach was able to capture the heterogeneity of the slab of nanocomposite at the atomistic as well as the continuum scale. The computations enabled us to simulate the behavior of the shell for more than 0.2 ms using 150 000 core hours. Using the scale separation assumption to simulate the shell is 10 26 (10 23 in space b) Asynchronous coupling of atomistic and continuum time scales prevents to evolve the atomistic system long after reaching relaxation. Reproduced under the terms of the CC-BY licence. [86] Copyright 2019, The Royal Society. Figure 9. Benchmarking of the HMM with various internal scheduling methods. Comparison between naive (a priori scheduling assignment with multiple internal queues), pjm (single queue internal scheduler) and pjm+opt (single queue and variable resource allocation) configurations. The application is run for five iterations of the HMC workflow. a) Influence of the internal scheduling method on the total runtime of the Nanomaterials application on an allocation size of 20 nodes. b) Detailed analytics from the pjm+opt configuration, that is time evolution of total allocation usage (top) and number of running microscopic model simulations (bottom). and 10 3 in time, in the case where 10 replicas are used) times less expensive than a molecular dynamics simulation that explicitly considers the whole system.
Multiscale analysis of the nanocomposite uncovered enhanced elastic capabilities due to inclusion of graphene particles. At relatively low amplitude oscillations, the nanocomposite is shown to dissipate a smaller portion, up to 70% less, of the en-ergy brought in by the impact in the form of strain energy. While the properties of the epoxy resin appeared hardly modified with the presence of graphene in uniaxial tension, under these 3D anisotropic loading conditions the presence of graphene was in fact observed to extend the elastic regime and reduce hysteresis effects. We have thus established that the graphene particle acts as a nanoscale constraint preventing conformational changes of Figure 10. a) The finite element model of a shell is impacted vertically in its center by a rigid cylinder travelling at 400 m.s −1 . The edges of the shell are fixed. The shell is either composed entirely of neat epoxy (g 0 ) or a mixture of neat epoxy and nanocomposites/defective epoxy (g 0 −g +∕−1 ). Each cell of the FE mesh is associated with a system type, with a random distribution of the g 0 (blue) and g +∕−1 (gray) systems across the shell. b) The deformed shell 1.5 × 10 −4 s after impact. The color indicates the amplitude of the vertical displacement. The mechanical wave induced by the impact propagates throughout the shell and reflects on fixed boundary conditions showing some interference patterns. d) Evolution of the total energy in the shell during the simulation. In addition to the neat epoxy (blue, g 0 ), three mixture configurations are tested: i) the neat epoxy is mixed with nanocomposite epoxy having all the sheets aligned in the plane of the shell (green, g 0 -g +1 ∥), ii) all the sheets orthogonal to the plane of the shell (purple, g 0 -g +1 ⟂), and iii) with the defective epoxy having all the defects aligned in the plane of the shell (red, g 0 -g −1 ∥). The dashed black line indicates the end of contact between the shell and the cylindrical impactor. The quantification of the total energy transferred from the impactor to the shell is shown in the left insert, while the total energy dissipation at the end of the simulation is found in the right insert. The g 0 −g +1 shell is able to retain almost the entire impact elastic energy independently of the orientation of the sheet, while the g 0 dissipates more than a third. Interestingly, when the sheets are removed, the energy dissipation returns to that of the neat epoxy shell. Reproduced under the terms of the CC-BY 4.0 licence. [72] Copyright 2019, Wiley-VCH.
the polymer network, with direct improvement of the shear modulus and strain energy restitution of the nanocomposite.

Time and Size Limitations to Molecular Dynamics
HMM enables independent simulation of the models at the different scales assuming scale separation (see Figure 8), yet not all materials display mechanisms clearly isolated at single scales. Epoxy resins, and more generally thermoset polymers, constitute very practical materials for the application of HMM. Indeed, the high degree of crosslinking of epoxy resins causes their atomistic structure to be homogeneous over a few nanometers, and prevents diffusion of polymer chains, thus allowing fast relaxation in a few picoseconds. For epoxy resins time and space characteristic lengths at the atomistic and continuum scales are separated by several orders of magnitude.
Conversely, for graphene, the separation of scales presents more of an issue. In the present work, we had to limit the scope of our simulations to short graphene sheets, a few nanometers wide, but exfoliated sheets often span over a few micrometers. Considering small flakes could potentially be an approximation too far for both calculating the mechanical properties of polymer-nanocomposite, and studying the dynamic mechanisms of exfoliation. In the current literature, there are two ap-proaches to explain the mechanism of graphene reinforcement of a polymer matrix. First, Marom and Wagner [92] argue that nanocomposites should be considered as molecular composites or self-reinforced composites. This approach takes into account the nanoparticles role as a nucleation site for crystallization and polymer confinement. However, it cannot explain the often disappointing reinforcement ability of graphene compared to the predictions made by the "rule of mixtures." An alternative explanation is proposed by Young et al. [93] who acknowledge that in no reported experiment has the reinforcement contribution of nanocomposite achieved a Young's modulus close to that of graphenes 1.0 TPa. Effective transfer of this exceptional stiffness to the polymer network is impossible. Instead, the composites stiffness is a function of the matrix's stiffness and the aspect ratio of the sheet. This description is derived from continuum shearlag theory and neatly explains that the reinforcement is due to the filler particles aspect ratio, not its stiffness. It is noteworthy that by this explanation, few-layer graphene (e.g., 5 sheets thick) is a better reinforcement particle than graphene itself because it will not crumple in the polymer and so retains a higher aspect ratio. Young et al.'s [93] theory explains the elastic modulus at low strains but does not account for graphenes effect on the composites toughness.
Particle-based simulation methods here face a serious challenge: to simulate graphene with an aspect ratio capable of giving adequate reinforcement, accurate descriptions of atoms and above 10 µm are needed. Simulating aspect ratios of over 10 5 are out of reach of current methods. Therefore, insights into the reinforcement mechanisms, exfoliation, dispersion, and processing will require new approaches. Coarse-grained simulations are a typical way of accessing longer time and length scales (see Section 3. This coarsening of particles necessarily loses detail; however, novel-shaped particles or mixed precision within the simulation offer some ways to preserve the aspect ratio. These techniques often run into problems of load balancing as the computational demand varies between simulation domains, but progress has been made recently in tackling this problem. [94] Another complicated case is multiscale modeling of thermoplastic polymers. Reptation of single polymer chains, each hundreds of nanometers long, require several microseconds to relax. In both graphene and thermoplastics, scale separation is hardly feasible as atomistic detail needs to be preserved in models attaining microscale dimensions. Only simulating a few picoseconds per MD simulation, the total computational cost of one of our HMM simulations reaches hundreds of thousands of core hours to simulate the impacted nanocomposite shell. Overall computational cost would directly suffer from longer polymer relaxation times. As MD simulations can only scale strongly up to a certain point, larger supercomputers are not a practical solution. However, steered or accelerated MD [95] might be serious candidates to tackle this problem. Our current HMM framework simulates concurrently two scales, atomistic and continuum, as shown in the graphene/epoxy application. Yet biomaterials, such as bone, [96] whale baleen, [97] or meta-materials [98] display more than two characteristic length scales. Accurate modeling of bulk polyethylene would require an atomistic description of polymer chains, a microscopic description of the arrangement of crystalline and amorphous phases, and a continuous description the engineering testing conditions. Similar to two-scale HMM, homogenization is used to transfer data from the lower to the adjacent upper scale, and projection to transfer data the other way around, and this is repeated as many times as there are connected single scale models. Such a coupling method is easily implemented as long as one continuous description is involved, as homogenization and projection are then straightforward. However, for large, heterogeneous molecular structures, that is graphene [99] or clay sheets (see Section 3), an intermediate particle-based description may be preferable (see Section 2). While bottom-up homogenization can be replaced by fitting CG-MD inter-particles potentials from AA-MD, it is far more intricate to deal with top-down data transfer. Projection of coarse-grained trajectories onto all-atom structures [100] do not find a unique solution, and so currently remains an area of limited interest.
Modeling many classes of realistic materials will typically require resolving processes occurring on more than just these two (separated) scales, such as by using techniques to bridge time scales within a single spatial scale (e.g., through application of alternating Monte Carlo and Molecular Dynamics [101] ) or by requiring a mesoscopic spatial scale to be inserted between the existing two and coupled in the same manner (strain passed downward, stress passed upward). This is likely necessary for a proper treatment of polymeric materials, which have important processes oc-curring on many scales (for more on this see the review by Li et al. [102] ). However, as the number of scales integrated within the HMM scheme multiplies, the number of independent, single-scale simulations can increase dramatically. In the current state of our HMM, where the upper-scale model requires a unique simulation at the lower-scale for each spatial discretization and at each time increment, considering more than two scales may well appear not to be computationally tractable. In the following section, we address our current developments relative to surrogate modeling, aiming at reducing redundancies in lower-scale model simulations.

Increasing Efficiency by Comparing the Mechanical State of Time-Dependent Materials
Naively initiating microscale simulations from the macroscale model often results in redundant calculations, with very similar input parameters. In terms of our coupled FE/MD model, this arises from recalculation of the stress tensor, , from an applied strain tensor, , that has already appeared. For example, in a linear elastic continuum, where all the quadrature points have the same properties, the stress will only need to be calculated once. However, when it comes to nonlinear inelastic mechanics, the mechanical state is not only defined by the current strain tensor, but by the entire local strain history. Indeed, outside of the elastic regime the mechanical response of a material depends on its loading history, not just the current strain state. Consequently, the number of parameters needed to define a microstate of the material grows rapidly with time, so that comparing the deformation history of two quadrature points presents a challenging problem.
When the stress is calculated using phenomenological constitutive laws, internal variables are introduced to account for timedependency. But in multiscale applications, the stress is calculated using an expensive microscopic model and the mechanical/deformation history cannot be reduced to one or a few internal variables. We are currently investigating reduction of the dimensionality of the strain history based on spline modeling. The evolution of the strain tensor with time can be fitted with a spline in a 7 D space (6 strain tensor components and time, see Figure 11). The number of parameters needed to describe a mechanical state then becomes constant and independent of the time span simulated. The similarity between two quadrature points is then quantified as an inverse of the distance between their associated splines. Once the similarity between quadrature points is estimated, an algorithm based on graph theory can be used to optimally reduce the number of micromodel simulations to perform. A larger number of simulations can be avoided by analyzing the dependencies in the graph of similarities. Without any interpolation, the number of simulations to perform could be drastically reduced using clever data-driven and order reduction methods.

Interpolation and Surrogate Modeling
While the constitutive relation of the material remains unknown, it can be assumed to be piecewise linear for small strain Figure 11. Reducing order of strain history. a) The time evolution of the six components of the strain tensor are fitted with six independent splines using 10 control points for each spline. The colored lines represent the full data, the dots the splines control points and the black lines the fitted splines. The strain history corresponds to a uniaxial tensile test.
increments. This linear operator relating the variations of the stress and strain tensors is a fourth-order tensor in its most general form (4). In addition to , the tangent (or instant) stiffness C tensor for a given mechanical state must then be computed using an atomistic simulation. Within a limited range of deformation from that mechanical state,C can be used to compute , rather than performing an atomistic simulation. In short, we have introduced the simplest form of constitutive equation parametrized using an atomistic model. Such a linear constitutive equation, unlike more detailed relations based on plasticity theory (for example) has a much narrower range of validity.
The fourth-order tensorC (4): is defined as the second partial derivative of the Gibbs free energy with respect to the strain tensor .
The benefits of such an interpolation technique are limited. In the case of inelastic mechanics, the current state of the system is dependent on the entire mechanical history. Consequently, when we exceed the valid range ofC (and a new simulation of the atomistic model is required), the full succession of deformation increments since the previous simulation has to be applied to the system. A linear interpolation of does not remove the need to evolve the atomistic system for the quadrature point where the stress is interpolated. Indeed, it is a requirement for later stress estimations using the atomistic model. And as the atomistic strain rate is constant, this evolution cannot be accelerated, therefore the gains remain limited. Nevertheless, the computational costs associated with starting a new simulation from a previous state (allocation of the resources and re-initialization) and, more significantly, with the homogenization of are avoided. Therefore, the interest in using such an interpolation method lies in the comparison of the cost of computing the instant stiffness tensor with the costs of restarting and sampling the atomistic simulations that were skipped.
Computing the homogenizedC tensor can be achieved in two different ways, either using its continuum mechanics definition, or from the pressure or deformation fluctuations of the atomistic system. [103][104][105] We shall focus briefly on the first approach as its computational cost can be quickly evaluated and related to the cost of sampling . The homogenization procedure ofC consists of loading the atomistic system with six linearly independent , after each of which the resulting is sampled, and a subset of the stiffness constants can be estimated using (4). The deformation is applied positively and negatively (extension and compression) along the strain orientation vector, and the finalC is obtained by averaging over both estimations. Note that the amplitude of the strain perturbation applied during each of these six states defines the range of validity ofC, with larger amplitudes giving a wider range of validity but less accurate estimates ofC. Overall, the homogenization ofC comprises the equivalent of twelve atomistic model simulations (deformation and sampling). This procedure is summarized in Figure 12.
Stochastic, data-driven methods also have the potential to reduce the number of atomistic simulations required. The HMM approach produces a significant amount of data, particularly as a result of the numerous atomistic models, but most of these simulations overlap (approximately) in the parameter space (or feature space in a machine learning context). Simple linear regression, or more sophisticated Gaussian process regression, [106] have been shown to produce excellent results in estimating the output of atomistic models based on the output of previous simulations in the framework of HMM applied to time-independent materials. [88,89] However, such approaches remain limited in the case of inelastic mechanics. Before such regression methods could be applicable to estimate , order-reduction methods would need to be applied. Similarly, the (N 3 ) scaling of the basic algorithm means that large training sets (resulting in large N) quickly render a GPR surrogate uneconomical with respect to simply running the MD micromodel, particularly in cases where each run lasts of the order of minutes. Using multiple GPR models to cover different domains of the same training space could be a way forward, and has indeed been helpful in cases confined to the elastic regime. [88] However, the much larger dimensionality introduced by history-dependent mechanics may yet impede tractable solutions in this context.

Perspectives
Throughout this report, we have focussed on the scale-bridging necessary for capturing the multiscale nature of these nanocomposite materials, and the development of individual tools and methods for particular materials. However, we now put this work into the context of a more general problem: how can the results of such materials modeling be trusted enough for use in actionable decision-making?

Lack of Reproducibility
In recent years, there has been an unprecedented focus on questions of reproducibility [107] and the perceived lack thereof across multiple fields of scientific research. This is an area within which computational science can and should excel, for example, via portable workflow descriptions, logging and archiving of results with comprehensive meta-data. However, reproducibility is merely one part of the wider issue of the certification of simulations for use in decision-making processes in an industrial or clinical context. Meaningful trust in the results equally necessitates validation of the simulation code against experimental data, verification of the correctness of the numerical solution, sensitivity analysis to determine the most influential input parameters, and uncertainty quantification for the output quantities of interest. The three concepts of validation, verification, and uncertainty quantification can be referred to under the umbrella acronym VVUQ. A number of libraries and tools have been developed to aid in the area of sensitivity analysis and uncertainty quantification. [108][109][110][111] We can see with relative ease the importance such considerations should have in the area of computational materials research. The sensitivity of MD simulations to, say, choice of forcefield, is important to the level of trust we may place in the results. Even small changes in dispersion energies can result in completely different behavior on the macroscale such as wetting properties and dispersion/aggregation, for example. The situation is further complicated by the difficulties in characterizing 2D materials: the flake size and shape. The density and type of defects are often unclear, or of a wide distribution. When 2D materials interact with biomolecules, the additional complexity introduced by the numerous different biochemical interactions that can occur makes macroscopic behavior even more challenging to predict from chemical detail. In turn, detecting trends relating input parameters to macroscopic properties is largely done by hand-waving. Tuning defects, or chemically controlling the surface could, for example, increase the dispersion by increas-ing the interfacial interaction or enhancing the interfacial stress transfer by chemically bonding polymer to the surface, or thermal conductivity for heat management applications. [112] However, such modifications can come at a price to the performance; for example, the properties of reduced graphene oxide are much reduced compared to pristine graphene. Creating new functional multicomponent 2D materials, therefore, can be viewed as a complicated multi-objective optimization problem. Automated testing tools to determine the accuracy and reliability of forcefields would further aid in full validation of a multiscale application.
The multiscale nature of many materials applications [86] further complicates this goal. Validation and verification of each submodel in the application does not imply validity of the full, coupled system of submodels. Similarly, naive non-intrusive sampling methods may not be an efficient route to quantifying uncertainties in multiscale applications, particularly with respect to HPC codes with wildly varying resource requirements, and may additionally miss pathological behavior introduced by complex (sometimes conditional) coupling of submodels. Semiintrusive Monte Carlo methods have been explored as a more efficient method for applications characterized by a large asymmetry in the computational expense of the submodels. [113] While there is no doubt that VVUQ methods are being increasingly used in computational science, particularly in the engineering disciplines, the uptake has nevertheless been patchy, and systematic, rigorous certification of simulations (such as in the published literature) cannot be said to be the norm in every field. This may reflect, to some extent, the lack of uptake in available tools, and perhaps also that current tools do not yet cover significant portions of use cases. In particular, we find that VVUQ on complex, multiscale workflows with diverse HPC requirements has not been fully addressed by existing tools. [114] This is due to the need to handle complex execution patterns across multiple HPC resources, rigorous handling of job failure, and efficient communication of high dimensional distributions, to name a few factors. In short, we do not simply require tools that provide implementations of existing UQ algorithms but instead provide a suite of building blocks that can be integrated in such complex workflows. Such a suite would result in a library from which efficient and tailored algorithms for a specific multicomponent application may be rapidly prototyped and tested, and that allow the granularity of execution control necessary for efficient execution across multiple HPC resources.
To this end, we are developing software tools to provide solutions to these problems, as part of the EU funded VECMA project (Verified Exascale Computing for Multiscale Applications, https: //www.vecma.eu/ na d https://www.vecma-toolkit.eu/). [114] The development of standard communication formats for the communication between solvers/scales of very high dimensional, multimodal distributions (and their compression), the very large data sets involved, and design of cheap surrogate models to replace expensive submodels are additional matters of concern.

Efficient Use of Emerging High-Performance Computing Resources
The advent of exascale computing brings with it a number of challenges for existing codes and workflows, with the main focus on www.advancedsciencenews.com www.advtheorysimul.com how we might scale these to most efficiently exploit the extreme parallelism of exascale platforms. Traditional approaches, such as strong scaling (increasing the node count with a fixed problem size) soon level out due to rapidly accruing communication costs, whereas weak scaling (problem size increases in proportion to available compute nodes) expands the spatial scales reachable, but not the correspondingly larger time scales necessary to resolve a given physical process. Use of accelerators (e.g., GPUs) can increase the time scales accessible by, for example, MD, but the scaling (at present) is nevertheless limited by the inefficiency of using multiple GPUs per MD job-an inefficiency we expect will eventually be addressed.
Performance modeling and predictions of N-body codes indicate severe limitations in the number of iterations (time steps) feasible on an exascale machine, defined here as 1 billion cores, within a reasonable period of computing time (for example, 1.5 days). [6] For a naive (N 2 ) MD algorithm, exascale is unlikely to benefit MD approaches in the aim of simulating larger systems for longer physical times. Methods with highly local interactions, such as lattice-Boltzmann (fluid dynamics), will scale more efficiently to very large core counts, although the output of a single run is now recognized to be of questionable scientific value, and a more efficient use of the resource should be sought.
In conclusion, there are clear limits to the time and length scales attainable for a given MD simulation that appear long before the number of processors is exhausted. It is clear that large, single jobs are not likely to be an efficient use of exascale resources for the computational techniques currently employed in materials research, particularly at the atomic or molecular scales.
However, when taking into account the absolute necessity of sensitivity analysis and uncertainty quantification for such work, a more practical approach emerges. The sampling of results from many individual runs, known as ensemble computing, is a characteristic component of many UQ methods. As there is no interreplica communication in these approaches, sampling methods may scale in a simple, embarrassingly parallel manner to the full use of one or more computing resources. As such, SA and UQ approaches that employ replica computing approaches lend themselves well to efficient exploitation of exascale computers. Sampling algorithms will need to be employed which take advantage of replica computings natural resistance to node failures, as compared to monolithic approaches.
This would apply particularly well to the HMM application described in Section 4, which already homogenizes local stress tensors via a bootstrapping of measurements from multiple replica MD simulations.

Validation of Multiscale Models via Specifically Designed Experiments
Certification of the simulation results also necessitates that the models be validated against experimental data. In the case of multiscale models, designing specific experiments to this end can often be non-trivial. Experimental evidence may be absent or not sufficient to validate each submodel separately, or sometimes even the fully coupled application (for example, in the case of simulating the operation of a new fusion reactor). In our area of 2D materials simulation, we wish to validate submodels ranging from the quantum regime (in which DFT and quantum Monte Carlo approaches reign) via the molecular regime (AA-MD and CG-MD) up to the continuum scale (FEM, etc.).
For 2D systems, at the quantum/atomistic level, the adsorption characteristics can often be determined by weak dispersion forces which, as we have seen in Section 2, are poorly described by theoretical methods, resulting in difficulties in calculating accurate adsorption energies. Even sophisticated quantum methods such as DFT suffer from various well-known shortcomings (notably self-interaction errors and problems with dispersion bonded systems), for which there has been work in ameliorating and understanding for water. [115] With respect to validation at longer length scales (generally relating to continuum, FEM solvers), a number of good candidate techniques exist. Full field strain measurements use digital image correlation to time resolve the local material displacement field [116] through speckling or X-ray computed tomography [117] which could be used to assess and compare with general deformation and cracking characteristics in simulated materials trajectories. This approach also allows estimation of local elastic properties, which could allow separate validation of the finite element submodel with respect to the molecular dynamics submodel. A standard set of experiments similar to the CARPIUC benchmark for cracking in concrete [118,119] would also enable systematic validation of our full multiscale application with respect to fracture in cured epoxy nano-composites. Close collaboration of computational modellers with experimentalists should inevitably lead to further, more specific measurements.

Aims and Perspectives
The exceptional mechanical, thermal, optical, and electrical properties of graphene and other 2D nanomaterials, along with the potential for bespoke chemical modifications to the surface, offer unique opportunities for engineering novel interactions that enhance existing materials, and also between living and synthetic biological materials. Without accurate predictive modeling tools for evaluating early-stage designs it will be difficult to exploit this potential.
We aim to ensure that multiscale modeling becomes a routine predictive tool within functional 2D materials research and development, integrated into both academic research and industrial design-processes. We intend that our predictions be certified through code verification and experimental validation procedures with academic and industry partners, based on careful uncertainty quantification (VVUQ). Such work will reduce the risk in developing new technologies and thereby encourage innovation.
One of the critical challenges faced in this area is how to efficiently manage the great computational expense of multiscale materials modeling in this context. Reduced cost surrogate models, in particular those exploiting the ascendance of machine learning techniques such as Gaussian Process Regression, will be key to this, although the high dimensionality of representations of history-dependent material behavior make this a nontrivial endeavour. With respect to uncertainty quantification, we www.advancedsciencenews.com www.advtheorysimul.com expect that the increasing availability of computing power and the advent of exascale computing will facilitate the application of both sampling and surrogate UQ approaches.
Through ongoing work on all these fronts, we seek to bring multiscale materials modeling to a more reliable position in the design of 2D nanocomposite materials.