GetDDM: an Open Framework for Testing Optimized Schwarz Methods for Time-Harmonic Wave Problems

We present an open finite element framework, called GetDDM, for testing optimized Schwarz domain decomposition techniques for time-harmonic wave problems. After a review of Schwarz domain decomposition methods and associated transmission conditions, we discuss the implementation, based on the open source software GetDP and Gmsh. The solver, along with ready-to-use examples for Helmholtz and Maxwell's equations, is freely available online for further testing.


Introduction
We present an open-source framework for testing Schwarz-type domain decomposition methods for time-harmonic wave problems. Such problems are known to be computationally challenging, especially in the high-frequency regime. Among the various approaches that can be used to solve them, the Finite Element Method (FEM) with an Absorbing Boundary Condition (ABC) or a Perfectly Matched Layer (PML) is widely used for its ability to handle complex geometrical configurations and materials with non-homogeneous properties. However, the brute-force application of the FEM in the high-frequency regime leads to very large, complex-valued and possibly indefinite linear systems [45]. Direct sparse solvers do not scale well for such problems, Krylov subspace iterative solvers exhibit slow convergence or diverge, and efficient preconditioning proves difficult [28]. Domain decomposition methods provide an alternative, iterating between subproblems of smaller size that are amenable to sparse direct solvers [55].
Among the different families of domain decomposition techniques, this work focuses on optimized Schwarz methods [29], which are well suited for time-harmonic wave problems [12,13,14,33,1,17,18,19,23,24,49,50,51] and can be used with or without overlap between the subdomains. The convergence rate of these methods strongly depends on the transmission condition enforced on the interfaces between the subdomains. The optimal convergence is obtained by using as transmission condition on each interface the Dirichlet-to-Neumann (DtN) map [48] related to the complement of the subdomain of interest [47,46]. For acoustic waves, this DtN map links the normal derivative and the trace of the acoustic pressure on the interface. For electromagnetic waves, it links the magnetic and the electric surface currents (and is referred to in this case as the Magnetic-to-Electric, or MtE, map) [23]. However, using the DtN map leads to a very expensive numerical procedure in practice, as this operator is nonlocal. A great variety of techniques based on local transmission conditions have therefore been proposed to build practical algorithms, both for the acoustic case [16,12,13,14,33] and the electromagnetic one [1,17,18,19,23,24,49,50,51]. Recently, PMLs have also been used for this same purpose [54,27,57,58].
The aim of this paper is twofold. First, it provides a concise review of the most common transmission operators for optimized Schwarz methods applied to time-harmonic acoustic and electromagnetic wave problems, with the corresponding mathematical background. Second, it introduces the flexible finite element framework GetDDM ("a General environment for the treatment of Domain Decomposition Methods") to test and compare them, based on the open source software GetDP [21,22,35] and Gmsh [39,40]. While GetDDM is written in C++, all the problem-specific data (geometry description, finite element formulation with appropriate transmission condition, domain decomposition algorithm) are directly written in input ASCII text files, using the code's built-in language. This general implementation makes it possible to solve a wide variety of problems with the same software, and hides all the complexities of the finite element implementation from the end-user (in particular the MPI-based parallelization). Moreover, the software is designed to work both on small- and medium-scale problems (on a workstation, a laptop, a tablet or even a mobile phone) and on large-scale problems on high-performance computing clusters, without changing the input files. The complete implementation of all the techniques reviewed in this paper is freely available online on the web site of the ONELAB project [36,37], together with various 2D and 3D sample geometries, for both Helmholtz and Maxwell's equations. While other open source codes provide facilities for domain decomposition methods, either linked to finite element kernels (e.g. FreeFem++ [43] or Feel++ [53] via the HPDDM framework) or from a purely algebraic perspective (e.g. PETSc [3]), GetDDM adopts a complementary point of view. It focuses on Schwarz methods where the transmission conditions play a central role and provides a simple, flexible and ready-to-use software environment where the weak formulations of these transmission conditions can be
automatically transcribed at the discrete level, with a direct link to their symbolic mathematical and physical structure. The authors hope that this approach will help the scientific community to test and compare different optimized Schwarz domain decomposition techniques by focusing on the mathematics of the transmission conditions, while not having to worry (too much) about the implementation.
The paper is organized as follows. The first section describes the acoustic and the electromagnetic scattering problems and the associated optimized Schwarz methods; the main transmission operators and the weak formulations are given, in view of their transcription in GetDDM. The next section describes the main features of GetDDM and provides code samples for the implementation of domain decomposition problems. Finally, the article concludes with some illustrative numerical results and perspectives for further development.

Optimized Schwarz Methods for Time-Harmonic Wave Propagation
The mathematical framework of optimized Schwarz methods is presented in this section, together with a review of the different transmission operators, for both the acoustic and the electromagnetic cases. As the goal is to implement them using a finite element method, the weak formulations of the different problems are also provided.

Mono-Domain Problem
Let us consider an open subset Ω⁻ of R^d, where d = 1, 2, 3 is the dimension, with boundary Γ, such that its complement Ω⁺ = R^d \ Ω⁻ is connected. When illuminated by a time-harmonic incident wave u^inc, the obstacle Ω⁻ generates a complex-valued scattered field u, solution of problem (1), where the time dependence is implicit and of the form e^(−iωt). The operator ∆ = Σ_{i=1}^{d} ∂²_{x_i} is the Laplacian and k = ω/c is the real, strictly positive wavenumber (c = c(x) being the local speed of sound in the medium). In what follows, a·b̄ denotes the inner product between two complex-valued vectors a and b in C³, where z̄ is the complex conjugate of z ∈ C; the associated norm is ||a|| := √(a·ā). Here, a Dirichlet boundary condition on Γ has been set (i.e. we consider a sound-soft obstacle), but other conditions can be studied, such as Neumann, Fourier or even penetrable obstacles. The outgoing condition stands for the Sommerfeld radiation condition (ı being the square root of −1), which ensures, first, that problem (1) is uniquely solvable and, second, that the scattered field u propagates from Ω⁻ towards infinity.
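Under the stated conventions (sound-soft obstacle, time dependence e^(−ıωt)), the scattering problem (1) takes the classical form, which we reconstruct here for reference:

```latex
\begin{cases}
\Delta u + k^2 u = 0 & \text{in } \Omega^+,\\
u = -u^{\mathrm{inc}} & \text{on } \Gamma,\\
\displaystyle\lim_{r \to \infty} r^{\frac{d-1}{2}}
  \left( \partial_r u - \imath k u \right) = 0, & r = \|\mathbf{x}\|.
\end{cases}
```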
To solve problem (1) using the finite element method, the (unbounded) domain Ω⁺ must be truncated, using for example a Perfectly Matched Layer (PML) [8,15] or a fictitious boundary Γ∞ with an Absorbing Boundary Condition (ABC) [6,26] (see e.g. [2] for a review of different methods). Therefore, the aim is now to find the field û approximating u on the bounded domain Ω with boundary Γ∞ ∪ Γ. To simplify, the notations û and u are merged. The unit normal vector n is directed outward from Ω (and thus into Ω⁻ on Γ). For conciseness we choose the simplest local ABC, i.e. the Sommerfeld radiation condition at finite distance (the zeroth-order condition (3)); the extension to more accurate ABCs or PMLs is standard [41].

Domain Decomposition and Transmission Operators
Substructured formulation. The domain Ω is now decomposed into N_dom disjoint subdomains Ω_i (the substructures) without overlap. For every i = 0, . . ., N_dom − 1, the boundary of Ω_i is split into its physical part Γ_i := ∂Ω_i ∩ Γ and the artificial interfaces Σ_ij := ∂Ω_i ∩ ∂Ω_j shared with the other subdomains. To simplify, let D := {0, . . ., N_dom − 1} be the set of indices of the subdomains, and for i ∈ D, let D_i := {j ∈ D such that j ≠ i and Σ_ij ≠ ∅} be the set of indices of the subdomains sharing at least a point with Ω_i (we will call them the connected domains of Ω_i). Finally, for all i ∈ D, the unit normal n_i is directed towards the exterior of Ω_i, and thus into the obstacle Ω⁻ (if Γ_i ≠ ∅).
The additive Schwarz domain decomposition method can be described as follows, at iteration n + 1: first, for all i ∈ D, solve the volume problem (4) in Ω_i; then, for all i ∈ D and j ∈ D_i, update the interface unknowns according to (5). The operator S is an invertible transmission operator, which will be detailed later. The (n+1)-th iteration of the above algorithm can be rewritten in the following more compact form: 1. For all i ∈ D, compute the volume solution u_i^(n+1) of (4), written as u_i^(n+1) = V_i(u^inc, g^n), where g^n = (g^n_ji)_{i∈D, j∈D_i} is the vector collecting all the interface unknowns.
2. For all i ∈ D and j ∈ D_i, update the surface fields g_ji^(n+1) following (5), written as g_ji^(n+1) = T_ji(g^n, u_i^(n+1)). In (4) we have only considered the case of Dirichlet sources; other kinds, such as volume sources, are of course possible and must be treated in the same way throughout the algorithm. We will refer to them as physical sources, as opposed to the artificial sources on the transmission boundaries.
The algorithm described by (4) and (5) can be interpreted as a Jacobi iteration applied to a linear operator equation. Indeed, for every n ∈ N and by linearity, the field u_i^(n+1) can be decomposed as u_i^(n+1) = v_i^(n+1) + w_i^(n+1), where v_i^(n+1) := V_i(u^inc, 0) only depends on the physical sources and w_i^(n+1) := V_i(0, g^n) only depends on the artificial interface sources. The quantity v_i^(n+1) is independent of the iteration number n and can hence be written v_i. Update (5) then becomes g_ji^(n+1) = T_ji(g^n, w_i^(n+1)) + b_ji. Let us introduce the vector b = (b_ji)_{i∈D, j∈D_i}, with b_ji = 2(S v_i)|_{Σ_ij}, and the operator A : g ↦ (T_ji(g, V_i(0, g)))_{i∈D, j∈D_i}. One iteration of the domain decomposition algorithm then corresponds to g^(n+1) = A g^n + b (9), which is one iteration of the Jacobi method applied to the system (I − A) g = b (10), where I is the identity operator. Any iterative linear solver can be applied to (10), for example Krylov subspace methods such as GMRES [52]. When using a Krylov subspace solver, the method is called a substructured preconditioner [30].
It is worth noting that the iteration unknowns in (9), (10) are the surface quantities g and not the volume unknowns u. Obtaining the volume quantities from the surface unknowns corresponds to computing u_i = V_i(u^inc, g) on every subdomain Ω_i. A summary of the Schwarz method with Krylov solver is given in Algorithm 1.
Algorithm 1 Schwarz algorithm with Krylov solver.
1. Compute the right-hand side b: for all i ∈ D and j ∈ D_i, b_ji = T_ji(0, v_i), with v_i = V_i(u^inc, 0). 2. Solve the system (I − A)g = b iteratively using a Krylov subspace solver, where A is defined by (8).
3. At convergence, compute the final solution: ∀i ∈ D, u i = V i (u inc , g).
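As a minimal illustration of Algorithm 1, the following self-contained sketch (our toy, not GetDDM code) applies the substructured approach to a 1D Helmholtz problem with P1 finite elements, two non-overlapping subdomains and a zeroth-order transmission operator S = −ık; all variable names and the discretization are ours, chosen only for illustration.

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

# Toy 1D Helmholtz: -u'' - k^2 u = 1 on (0,1), u(0) = u(1) = 0,
# split at node m into two non-overlapping subdomains coupled by
# zeroth-order (Despres-type) transmission conditions S = -ik.
k, N = 10.0, 200          # wavenumber (k != n*pi) and element count
h, m = 1.0 / N, N // 2    # mesh size and interface node index

def subdomain_matrix(n_el, robin_nodes):
    """P1 matrix K - k^2 M, plus the -ik transmission term."""
    K = sp.lil_matrix((n_el + 1, n_el + 1), dtype=complex)
    M = sp.lil_matrix((n_el + 1, n_el + 1), dtype=complex)
    for e in range(n_el):
        K[e:e+2, e:e+2] += np.array([[1, -1], [-1, 1]]) / h
        M[e:e+2, e:e+2] += np.array([[2, 1], [1, 2]]) * h / 6
    A = K - k**2 * M
    for p in robin_nodes:
        A[p, p] += -1j * k
    return A.tocsc()

def load(n_el):
    """Consistent P1 load vector for the source f = 1."""
    b = np.full(n_el + 1, h, dtype=complex)
    b[0] = b[-1] = h / 2
    return b

def dirichlet(A, b, nodes):
    A = A.tolil()
    for p in nodes:
        A[p, :] = 0.0; A[p, p] = 1.0; b[p] = 0.0
    return A.tocsc(), b

A1 = subdomain_matrix(m, [m])       # Omega_1: nodes 0..m
A2 = subdomain_matrix(N - m, [0])   # Omega_2: nodes m..N

def solve_subdomains(g12, g21, with_source):
    """Operators V_i: subdomain solves for given interface data g."""
    b1 = load(m) if with_source else np.zeros(m + 1, complex)
    b2 = load(N - m) if with_source else np.zeros(N - m + 1, complex)
    b1[m] += g12; b2[0] += g21
    M1, b1 = dirichlet(A1, b1, [0]); M2, b2 = dirichlet(A2, b2, [N - m])
    return spla.spsolve(M1, b1), spla.spsolve(M2, b2)

def update(u1, u2, g12, g21):
    """Operators T_ji: g_ij <- -g_ji - 2ik u_j|_Sigma."""
    return np.array([-g21 - 2j * k * u2[0], -g12 - 2j * k * u1[m]])

u1, u2 = solve_subdomains(0, 0, True)
b = update(u1, u2, 0, 0)            # step 1: right-hand side b
def I_minus_A(g):                   # matrix-free operator I - A
    v1, v2 = solve_subdomains(g[0], g[1], False)
    return g - update(v1, v2, g[0], g[1])
op = spla.LinearOperator((2, 2), matvec=I_minus_A, dtype=complex)

g, info = spla.gmres(op, b)         # step 2: Krylov solve of (I-A)g = b
u1, u2 = solve_subdomains(g[0], g[1], True)   # step 3: volume solution

# At convergence the DDM solution matches the mono-domain discrete solve
Ag, bg = dirichlet(subdomain_matrix(N, []), load(N), [0, N])
u_ref = spla.spsolve(Ag, bg)
err = max(np.abs(u1 - u_ref[:m+1]).max(), np.abs(u2 - u_ref[m:]).max())
```

At convergence the two subdomain traces coincide and the substructured solution reproduces the mono-domain discrete solution, which the last lines check.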
Transmission operator. The convergence rate of the iterative solver is strongly linked to the choice of the transmission operator S [13]. The optimal transmission operator would be the Dirichlet-to-Neumann (DtN) map for the complement of each subdomain [47,46], but this operator is non-local, which makes it very expensive to use computationally. Different local approximations have hence been proposed, based on polynomial or rational approximations of the total symbol of the surface free-space DtN, or on a volume representation through Perfectly Matched Layers. Among those approximations, four are detailed below and are implemented in GetDDM for a generic transmission boundary Σ: • Evanescent Modes Damping Algorithm [12,14]: where χ is a real constant. This zeroth-order polynomial approximation is a generalization of the Després condition [17], for which χ = 0. In what follows, we will denote this family of impedance transmission conditions as IBC(χ).
• Optimized second-order transmission condition [33]: where ∆_Σ is the Laplace-Beltrami operator on Σ, and a and b are two complex numbers obtained by solving a min-max optimization problem on the rate of convergence. This condition corresponds to a second-order polynomial approximation of the DtN symbol. In what follows, we will denote this family of generalized impedance transmission conditions as GIBC(a, b). A zeroth-order optimized condition can be constructed in a similar way.
• Padé-localized square-root transmission condition [13]: with ε = 0.39 k^(1/3) H^(2/3), H being the local mean curvature of the interface. The coefficients C_0, A_ℓ and B_ℓ are defined in terms of a rotation angle α in the complex plane (usually taken as π/4) and of R_{N_p}, the standard real-valued Padé approximant of order N_p. This transmission condition corresponds to a rational approximation of the DtN symbol. In what follows, we will denote this family of generalized impedance transmission conditions as GIBC(N_p, α, ε).
• PML transmission condition [54,27,57,58]: The operator S_PML(σ) is constructed by appending a layer Ω_PML to the transmission interface, in which a PML transformation with absorption profile σ is applied. For example, in Cartesian coordinates, a profile σ(x_PML) growing across the layer can be used, where δ is the thickness of the PML layer and x_PML is the local coordinate inside the PML [9,44].
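The rotated-branch Padé coefficients entering GIBC(N_p, α, ε) can be computed from the standard real Padé coefficients of the square root. The sketch below implements what we believe are the usual formulas (our reconstruction; function and variable names are ours), and checks them against the exact square root:

```python
import numpy as np

def pade_sqrt_coeffs(Np, alpha):
    """Coefficients C0, A_l, B_l such that
    sqrt(1 + z) ~ C0 + sum_l A_l z / (1 + B_l z),
    obtained by rotating the branch cut of the standard real-valued
    Pade approximant of order Np by the angle alpha."""
    l = np.arange(1, Np + 1)
    a = 2.0 / (2 * Np + 1) * np.sin(l * np.pi / (2 * Np + 1))**2
    b = np.cos(l * np.pi / (2 * Np + 1))**2
    e = np.exp(-1j * alpha)
    C0 = np.exp(1j * alpha / 2) * (1 + np.sum(a * (e - 1) / (1 + b * (e - 1))))
    A = np.exp(-1j * alpha / 2) * a / (1 + b * (e - 1))**2
    B = e * b / (1 + b * (e - 1))
    return C0, A, B

# Accuracy check at a moderate complex argument
C0, A, B = pade_sqrt_coeffs(4, np.pi / 4)
z = 0.5 + 0.2j
approx = C0 + np.sum(A * z / (1 + B * z))
exact = np.sqrt(1 + z)
```

With α = 0 the rotation disappears and the classical real-valued Padé approximant is recovered (C0 = 1).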
All these methods are referred to as optimized Schwarz domain decomposition methods. Note that GIBC(N_p, α, ε) and PML(σ) have in common that they introduce additional unknowns, whereas the other two transmission conditions do not. Also, the first three transmission conditions can be formulated explicitly through sparse surface equations (see e.g. the weak formulations (18)-(23) below), while a sparse formulation of the PML transmission condition requires a volume representation (see e.g. (24)-(25)), a surface representation being dense [56].
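To illustrate the min-max optimization mentioned above for the optimized conditions, the following toy sketch (ours, with one particular sign convention for the propagative branch of the DtN symbol λ(ξ)) numerically minimizes the worst-case convergence factor ρ(ξ, p) = |(λ(ξ) − p)/(λ(ξ) + p)| of a zeroth-order impedance S = p over a sampled range of frequencies ξ:

```python
import numpy as np

# Free-space DtN symbol for Helmholtz at wavenumber k: propagative
# modes (|xi| < k) give an imaginary lambda, evanescent ones a real
# lambda; the resonance xi = k is excluded from the sampling.
k = 10.0
xi = np.concatenate([np.linspace(0.0, 0.9 * k, 200),      # propagative
                     np.linspace(1.1 * k, 4.0 * k, 200)])  # evanescent
lam = np.where(xi < k, 1j * np.sqrt(np.maximum(k**2 - xi**2, 0.0)),
               np.sqrt(np.maximum(xi**2 - k**2, 0.0)))

def worst_rate(p):
    """Worst convergence factor over the sampled modes for S = p."""
    return np.abs((lam - p) / (lam + p)).max()

# The un-optimized impedance p = ik does not damp evanescent modes:
rate_despres = worst_rate(1j * k)          # equals 1 for xi > k
# Brute-force min-max search over a complex impedance p = a (1 + i):
rate_opt, a_opt = min((worst_rate(a * (1 + 1j)), a)
                      for a in np.linspace(1.0, 40.0, 400))
```

The optimized complex impedance damps both propagative and evanescent modes, whereas the purely imaginary impedance leaves the evanescent modes undamped; this is the mechanism the optimized conditions exploit.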

Weak Formulations
The finite element method is based on the weak formulations of the partial differential equations. Two different kinds of PDEs are considered when using optimized Schwarz methods: a volume system (here the Helmholtz equation), represented by the operators V_i, and a surface system on the transmission boundaries, represented by T_ji. For the sake of clarity, the weak formulations are first given for a generic transmission operator S. For conciseness, we develop the case where there is no contribution on ∂Σ_ij through integration by parts. Nevertheless, in some situations (e.g. when ∂Σ_ij ∩ Γ∞ ≠ ∅), care should be taken to include these terms in the weak formulations.

Generic weak formulations.
Without loss of generality, only the case of a particular subdomain Ω_i, for i ∈ D, with no incident wave (homogeneous Dirichlet boundary condition) is detailed. We consider the general setting where PML layers Ω_i^PML = ∪_{j∈D_i} Ω_ij^PML are potentially appended to the artificial interfaces Σ_ij, and define the extended subdomain Ω*_i = Ω_i ∪ Ω_i^PML. In what follows, the trial and test functions belong to a space of H¹ type on Ω*_i, which slightly differs from its usual definition (the Dirichlet condition is here set only on part of ∂Ω*_i).
• The volume PDE u_i^(n+1) = V_i(0, g^n) has the following weak formulation: • The surface PDE g_ji^(n+1) = T_ji(g^n, u_i^(n+1)) has the following one: On the transmission boundaries. Depending on the choice of the transmission operator S, the transmission terms involving S and the test functions on Σ_ij expand as follows: • IBC(χ): • GIBC(a, b): • GIBC(N_p, α, ε): where, for every ℓ = 1, . . ., N_p, the function ϕ_ℓ is obtained through the solution of an auxiliary surface problem. • PML(σ): where the tensor D encodes the PML stretching built from σ; that is, we consider a 1D PML with an absorption function that grows only in the direction normal to the interface. In the last expression the domain of definition of the test functions g_ji on Σ_ij is extended to the neighboring PML layer Ω_ij^PML, effectively resulting at the discrete level in the integration of the functions associated with the nodes of the interface in the layer of volume elements connected to the interface.
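The damping mechanism underlying the PML transmission condition can be sketched in 1D. The snippet below (ours, assuming the unbounded profile σ(x) = 1/(δ − x) of [9,44] and the stretching convention x → x + (ı/k)∫₀ˣ σ(s) ds) shows how an outgoing plane wave decays inside the layer:

```python
import numpy as np

k, delta = 10.0, 0.25        # wavenumber and PML thickness (arbitrary)

def sigma(x):
    """Unbounded absorption profile, zero damping rate at x = 0."""
    return 1.0 / (delta - x)

def stretch_integral(x):
    """int_0^x sigma(s) ds = -log(1 - x/delta), closed form."""
    return -np.log(1.0 - x / delta)

# Under the stretch x -> x + (i/k) * int sigma, exp(ikx) becomes
# exp(ikx) * exp(-int sigma): the wave keeps its phase speed but its
# amplitude decays like 1 - x/delta, vanishing at the outer boundary.
x = 0.9 * delta
amplitude = np.exp(1j * k * x - stretch_integral(x))
```

At 90% of the layer depth the amplitude is reduced to 10% of its interface value, and it tends to zero as x → δ, which is why no boundary condition at the outer end of the layer significantly pollutes the solution.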

Mono-Domain Problem
The case of an electromagnetic wave is now considered, for an obstacle Ω⁻ with a smooth boundary Γ in a three-dimensional medium. When illuminated by an incident electric field E^inc, a perfectly conducting body Ω⁻ generates a scattered field E, solution of the exterior electromagnetic scattering problem (26), where k := 2π/λ is again the wavenumber and λ the wavelength, n is the outward unit normal to Ω⁺ (thus, inward to the obstacle) and γ_T is the tangential component trace operator γ_T a := n × (a × n). The curl operator is defined by curl a := ∇ × a, for a complex-valued vector field a ∈ C³, and the notation a × b designates the cross product between two complex-valued vectors a and b. The last equation of system (26), the so-called Silver-Müller radiation condition at infinity, provides the uniqueness of the solution to the scattering boundary-value problem (26).
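For reference, and modulo sign conventions, problem (26) for a perfect electric conductor takes the classical form, which we reconstruct here:

```latex
\begin{cases}
\mathbf{curl}\,\mathbf{curl}\,\mathbf{E} - k^2 \mathbf{E} = \mathbf{0}
  & \text{in } \Omega^+,\\
\gamma_T \mathbf{E} = -\gamma_T \mathbf{E}^{\mathrm{inc}}
  & \text{on } \Gamma,\\
\displaystyle\lim_{r \to \infty} r \left\|
  \mathbf{curl}\,\mathbf{E} \times \frac{\mathbf{x}}{r}
  - \imath k\, \mathbf{E} \right\| = 0, & r = \|\mathbf{x}\|.
\end{cases}
```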
As for the acoustic case, numerically solving problem (26) with a volume discretization method requires the truncation of the exterior propagation domain by a fictitious boundary Γ∞ surrounding Ω⁻. The problem to be solved is then defined on the bounded domain Ω, with boundaries Γ and Γ∞, with γ_t the tangential trace operator γ_t a := n × a. As above, the unit normal n is directed outward from Ω and, to simplify, the solution of the truncated problem is still designated by E. The operator B is an approximation of the Magnetic-to-Electric (MtE) operator. Several expressions of B can be found in the literature [23]; for conciseness we restrict ourselves to the simplest one, the well-known Silver-Müller ABC at finite distance: B = ık, similar to (3) for acoustics modulo the sign (due to the trace operator definitions).

Domain Decomposition and Transmission operators
Substructured formulation. The optimized Schwarz domain decomposition without overlap is now considered for Maxwell's problem (27). The domain Ω is decomposed as described in paragraph 2.1.2 and the same notations are used (recalling that D = {0, . . ., N_dom − 1} and, for i ∈ D, D_i = {j ∈ D s.t. j ≠ i and Σ_ij ≠ ∅}). The iterative Jacobi algorithm for the computation of the electric fields (E_i^(n+1))_{i∈D} at iteration n + 1 involves, first, the solution of the N_dom problems (28) and, then, the update of the quantities g_ji^(n+1) through (29), where, for i ∈ D, E_i = E|_{Ω_i}, S is an invertible transmission operator through the interfaces Σ_ij, and γ_t^i and γ_T^i are the local tangential trace and tangential component trace operators, with n_i the outward-pointing unit normal to Ω_i. Following the same procedure as in section 2.1.2, we introduce the two families of operators (V_i)_{i∈D} and (T_ji)_{i∈D,j∈D_i}: E_i^(n+1) = V_i(E^inc, g^n) is the solution of problem (28), where g^n = (g^n_ji)_{i∈D,j∈D_i} collects all the unknowns at iteration n, and g_ji^(n+1) = T_ji(g^n, E_i^(n+1)) is given by (29).
By linearity, we decompose the field E_i^(n+1) = F_i^(n+1) + G_i^(n+1), where F_i^(n+1) := V_i(E^inc, 0) only depends on the physical sources and G_i^(n+1) := V_i(0, g^n) only depends on the artificial interface sources. The quantity F_i^(n+1) is independent of the iteration number n and can hence be written F_i. The whole algorithm can then be recast into a linear system (I − A)g = b that can be solved by a Krylov subspace solver.
As in the acoustic case, for a vector g^n, the quantity Ag^n is given, for i ∈ D and j ∈ D_i, by the artificial-source part of the update. The information about the incident wave is contained in the right-hand side: b_ji = T_ji(0, F_i). The domain decomposition algorithm for Maxwell's equations is then exactly the same as the one described in Algorithm 1 for the Helmholtz equation, formally replacing v_i, u^inc, g and u_i by F_i, E^inc, g and E_i, respectively.
Transmission operator. Similarly to the acoustic case, optimal convergence of the domain decomposition algorithm would be achieved by using the (non-local) MtE operator as transmission condition. Local approximations based on polynomial or rational approximations of the total symbol of the surface free-space MtE have been proposed, as well as volume representations through Perfectly Matched Layers. Among those approximations, four are detailed below and are implemented in GetDDM for a generic transmission boundary Σ: • Zeroth-order transmission condition [17]: • Optimized second-order transmission condition [51]: where the scalar surface curl curl_Σ and its dual operator are involved, and where a and b are chosen so that an optimal convergence rate is obtained for the transverse electric (TE) and transverse magnetic (TM) modes; see [51] for the expression of a and b in the half-plane case. A zeroth-order optimized transmission condition using a single second-order operator was proposed in [1].
• PML transmission condition [57,58]: The operator S PML(σ) is constructed by appending a layer Ω PML to the transmission interface, into which a PML transformation with absorption profile σ is applied in the same way as for the acoustic case.

Weak Formulations
Generic weak formulations. Without loss of generality, only the case of a particular subdomain Ω_i, for i ∈ D, with no incident wave (homogeneous Dirichlet boundary condition) is detailed. We consider the same general setting as in the acoustic case, i.e., where PML layers are potentially appended to the artificial interfaces Σ_ij, and define the extended subdomain Ω*_i accordingly. The natural functional setting is the space of complex-valued curl-conforming vector fields on Ω*_i, with H_0(curl, Ω_i) denoting the subspace with vanishing tangential trace on Γ_i. • The volume PDE E_i^(n+1) = V_i(0, g^n) has the following weak formulation: Find E_i^(n+1) ∈ H_0(curl, Ω_i) such that, for every test field in H_0(curl, Ω_i): • The surface PDE g_ji^(n+1) = T_ji(g^n, E_i^(n+1)) has the following one: On the transmission boundaries.
• IBC(0): • GIBC(a, b): where the function r ∈ H(curl, Σ_ij) is obtained through the solution of an auxiliary surface problem. • GIBC(N_p, α, ε): where the auxiliary functions are obtained through the solution of a coupled surface system: find r in H(curl, Σ_ij) and, for ℓ = 1, . . ., N_p, ϕ_ℓ in H(curl, Σ_ij). • PML(σ): where the tensor D is defined as in the acoustic case and the test functions g_ji are again extended to the volume of the PML layers.

Implementation in GetDDM
GetDDM is based on the open source finite element solver GetDP (http://getdp.info) and the open source mesh generator Gmsh (http://gmsh.info). The complete implementation of all the techniques reviewed in this paper is freely available online on the web site of the ONELAB project [36,37], at the following address: http://onelab.info/wiki/GetDDM. Various 2D and 3D test-cases are provided online (see also Section 4) for both Helmholtz and Maxwell's equations, as well as detailed instructions on how to build the software for parallel computer architectures. Pre-compiled, serial versions of the software for Windows, macOS and Linux are also available for development and testing.
By default, GetDDM uses Gmsh [39,40] for geometry description, mesh generation and data visualization. While any other CAD system and mesh generator can also be used, Gmsh provides a tight integration with GetDP through the ONELAB interface, which permits a seamless modification of the various solver parameters. In what follows, as we focus on the behavior of the domain decomposition methods with various optimized transmission conditions, we will consider relatively simple geometrical configurations, where the subdomain partitioning is carried out at the level of the CAD. Also, no special treatment of cross-points is considered. More complex configurations can however be treated; see Section 4.
Once a suitable finite element mesh is generated, GetDDM uses GetDP to assemble and solve the finite element problem in parallel. GetDP (a "General environment for the Treatment of Discrete Problems") [21,22,35] uses mixed elements to discretize de Rham-type complexes in one, two and three dimensions. Its main feature is the closeness between the input data defining discrete problems and the symbolic mathematical expressions of these problems, translated into ten inter-dependent objects. As mentioned in the introduction, while GetDP is written in C++, the problem definition is directly written in input ASCII text files (.pro files), using GetDP's own problem definition language. This makes it possible, for example, to write the weak forms of (28) together with either (32), (33) or (34) directly in the input data files, and to use the natural mixed finite element spaces suitable for discretization [34,23]. In practice, a problem definition written in .pro input files is usually split between the objects defining data particular to a given problem, such as geometry, physical characteristics and boundary conditions (i.e., the Group, Function and Constraint objects), and those defining a resolution method, such as unknowns, equations and related objects (i.e., the Jacobian, Integration, FunctionSpace, Formulation, Resolution and PostProcessing objects). The processing cycle ends with the presentation of the results, using the PostOperation object. This decomposition points out the possibility of building black boxes adapted to the treatment of general classes of problems that share the same resolution methods, which is typically what is done in GetDDM: as will be explained below, the Schwarz.pro and Decomposition.pro files contain the generic domain decomposition algorithms, while Helmholtz.pro and Maxwell.pro contain the physics-specific function spaces and weak formulations, used in all the other particular problem definition files (e.g. waveguide3d.pro or marmousi.pro). Of particular significance is
that the input .pro files are not interpreted at run-time: they are analyzed once before the computation is run; then the computation (finite element assembly, system solution, etc.) is carried out fully automatically in compiled code, the parallel linear algebra being handled by the open-source PETSc solvers [3,4,5].
The remainder of this section covers in detail the implementation in the GetDP language (in .pro files) of the domain decomposition algorithms presented in Section 2, from the weak formulations to the parallel iterative solution of the resulting linear systems. The focus is on the advanced techniques required to implement the domain decomposition algorithms efficiently; for an introduction to GetDP and its more elementary features, the reader is referred to the GetDP reference manual [20] and is encouraged to explore the numerous examples available on the website of the ONELAB project (http://onelab.info). In a similar way, the elementary geometry and meshing features of Gmsh are not described here: the reader is referred to the Gmsh reference manual [38] and the various .geo files provided on the ONELAB site for additional information.

Weak Formulations and Finite Element Discretization
Before describing the weak formulations, we briefly introduce the discrete functional spaces based on the appropriate finite element functions available in GetDDM. For the Helmholtz problem with a Dirichlet boundary condition, we naturally consider the classical linear finite element (P1) space, using the built-in BF_Node basis functions. The boundary conditions are imposed strongly, as constraints on the coefficients in the expansion of the approximate solution in terms of the finite element basis: see Listing 1. Going to second order using hierarchical polynomials would simply consist in adding another BasisFunction in the FunctionSpace; in addition to the nodal degrees of freedom, the finite element expansion then also uses degrees of freedom associated with the edges of the mesh: see Listing 2. Discrete function spaces for surface unknowns are written similarly, the only difference being the Support. For example, the approximation space for the auxiliary functions ϕ_ℓ in H¹(Σ_ij) in (22) is simply constructed by specifying Support Sigma~{idom} in the FunctionSpace.
Although the Maxwell problem is more complicated, its implementation in GetDDM is also quite straightforward. Indeed, the natural approximation space for the vector unknowns is the Whitney edge element space [11], which is directly available through the BF_Edge basis functions (see Listing 3), for both the volume and surface unknowns. More details about the functional framework for the Maxwell problem can be found in [25]. Let us also remark that, for the sake of conciseness, we choose to present the standard weak formulations for the Helmholtz and Maxwell's equations. Other formulations (dual, mixed), associated finite element spaces (Raviart-Thomas, discontinuous, ...) and other equations (elastodynamics, Schrödinger, ...) could be developed as well in the same GetDDM framework.
Following Algorithm 1, for every subdomain, the weak formulation solved at each iteration remains the same, up to the right-hand side. The input file contains only one weak formulation, indexed by the subdomain number idom and encapsulated in a For loop. All the considered transmission conditions are also regrouped in the same file and selected according to the TC_TYPE parser variable. The weak formulation for the volume system (16) is presented in Listing 4 without the transmission boundary condition terms (18)-(24), which are detailed in Listings 5-8 respectively. In the Formulation object, [.,.] inside a Galerkin term denotes an inner product for building linear or bilinear forms, with test functions by convention to the right of the comma. Unknown quantities (in bilinear forms) are marked with the Dof{} operator; d represents the exterior derivative. The $PhysicalSource run-time variable (resp. $ArtificalSource) specifies whether the physical (resp. transmission) boundary condition is set or not. The iSide parser variable is used to indicate the "left" or the "right" transmission boundaries of a given subdomain (see Section 3.2.1), which is necessary for sweeping-type preconditioners (see Section 3.3) [57]. In the absence of preconditioning, the "left"/"right" distinction is not mandatory, and all the quantities and domains can be regrouped in either one. In the same way as for the volume part, the surface equation (17) is also directly transcribed as a Formulation in the input file: the relevant Equation is presented in Listing 9.
Note that, for the GIBC(N_p, α, ε) condition (22), the auxiliary quantities such as r, ϕ_ℓ and ρ have already been computed during the volume resolution (Listing 7) and are here re-used (without Dof{}). The quantities g_in refer to the (incoming) functions g_ji in (16) (see Section 3.2.2). Finally, for conciseness, the parts of the implementation that are standard in GetDP are not detailed here, for example the definition of functions (Function objects) or the construction of integration rules (Jacobian and Integration objects). The implementation of the weak formulations for the Maxwell problem is similar: the volume PDE formulation (35) is presented in Listing 10 (code extracted from the file Maxwell.pro). For the detailed implementation of the transmission conditions, see the Maxwell.pro source file.

Domain Decomposition Solver
The domain decomposition method is naturally suited for parallel computation on distributed memory architectures, using e.g. the Message Passing Interface (MPI). A classical and efficient approach is to use, when possible, as many MPI processes as subdomains. The i-th process is then associated, for example, to Ω_i and is in charge of solving the volume PDE on Ω_i and of computing all the outgoing data (g_ji)_{j∈D_i}. (Each MPI process can of course be multi-threaded in order to benefit from shared memory parallelism; this is handled automatically in GetDDM through the use of appropriate BLAS libraries.) On the other hand, solving these problems requires the knowledge of the incoming quantities (g_ij)_{j∈D_i} and, as the memory is distributed, an exchange of information with each process of rank j ∈ D_i must be achieved, leading to Algorithm 2. The inter-process communications are handled transparently by GetDDM in the IterativeLinearSolver function (see Section 3.2.3) and via the explicit SetCommSelf/SetCommWorld commands. By default, GetDDM uses a broadcast to exchange information between the processes; this can be replaced by a more efficient communication pattern by explicitly specifying the subdomain connections, as explained in Section 3.2.2. To summarize, solving the domain decomposition problem with GetDDM involves these main steps:
1. Create a mesh for each subdomain.
2. Write the weak formulations in the .pro files.
3. Distribute the subdomains and the unknowns to the MPI processes.
4. Optional: specify the topology to improve communications.

Setup the iterative solver.
Points 1 and 2 have been briefly described in the previous section and are unchanged compared to standard (non-parallel) programs, for which many examples can be found online at http://onelab.info. Only the new points 3, 4 and 5 are hence detailed in the following paragraphs. Each point will first be described, then illustrated by an example and a code listing, based on a simple waveguide-like structure divided into slices.
Algorithm 2: Domain decomposition algorithm from the point of view of the local $i$th MPI process (Helmholtz case).

1. Solve the local volume problem with the physical sources.
2. Compute the local contribution to the right hand side $b$ by solving the local surface problems.
3. Exchange information about the (geometric) topology between processes.
4. Enter the parallel iterative Krylov subspace solver for the system $(I - A)g = b$; at each iteration, compute the local contribution $(Ag^n)_{ji}$ through the following sequence of operations:
   • Send quantity $g^n_{ji}$ to and receive $g^n_{ij}$ from connected processes $j \in D_i$.
   • Solve the local volume problem with the artificial sources $g^n_{ij}$.
   • Solve the local surface problems.
5. Send quantity $g_{ji}$ to and receive $g_{ij}$ from connected processes $j \in D_i$.
6. Compute the local contribution to the final solution by solving the volume problem with both physical and artificial sources.
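The overall structure of Algorithm 2 can be sketched as a toy fixed-point iteration, where the application of the iteration operator $A$ is supplied as a callback, just as each "matrix-vector product" in GetDDM amounts to one local volume solve followed by the local surface solves. The function names and the $2 \times 2$ operator below are hypothetical, for illustration only; GetDDM itself applies a Krylov method to $(I - A)g = b$:

```python
# Sketch of the fixed-point (Jacobi) form of the Schwarz iteration,
# g^{n+1} = A g^n + b, whose fixed point solves (I - A) g = b.
# apply_A stands for the "matrix-vector product" of Algorithm 2,
# i.e. data exchange plus local volume and surface solves.

def schwarz_fixed_point(apply_A, b, tol=1e-12, maxiter=1000):
    g = [0.0] * len(b)              # initial guess g^0 = 0
    for _ in range(maxiter):
        Ag = apply_A(g)             # exchange data + local solves
        g_new = [a + s for a, s in zip(Ag, b)]
        if max(abs(x - y) for x, y in zip(g_new, g)) < tol:
            return g_new
        g = g_new
    return g

# Hypothetical 2x2 contraction playing the role of A (two subdomains,
# each interface unknown depending only on the neighboring one):
def apply_A(g):
    return [0.5 * g[1], 0.5 * g[0]]

g = schwarz_fixed_point(apply_A, [1.0, 1.0])
```

Here the fixed point is $g = (2, 2)$, since $(I - A)g = b$ reads $g_0 - 0.5\,g_1 = 1$ and $g_1 - 0.5\,g_0 = 1$. In GetDDM the fixed-point loop is replaced by a Krylov solver (Section 3.2.3), which converges much faster on realistic problems.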

Distributing the Subdomains and the Unknowns
In GetDDM the unknown $g_{ji}$ is called a Field, i.e. a function that can be interpolated using a finite element basis on its domain of definition (and which is set to zero elsewhere).
Distributing the subdomains to the different MPI processes is straightforward. Special consideration is however needed for the Fields, as they must be indexed using a global numbering scheme. This numbering is chosen by the user: it amounts to choosing a unique integer identifier for every pair $(j, i)$. Once the numbering is decided upon, two local (per MPI process) lists must be constructed, containing respectively the indices of the subdomain(s) and the (global) indices of the Fields that the current process is in charge of. In this paper, these two lists are named respectively ListOfSubdomains and ListOfFields. A generic implementation for simple layered-like decompositions is provided in the Decomposition.pro file. Let us clarify this point with a simple example on a one-directional waveguide, divided into $N_{\rm dom}$ subdomains, from "left" ($\Omega_0$) to "right" ($\Omega_{N_{\rm dom}-1}$), each slice being a subdomain:

$\Omega_0$ | $\Omega_1$ | $\Omega_2$ | $\cdots$ | $\Omega_{N_{\rm dom}-1}$

To simplify, the number of MPI processes is set to $N_{\rm dom}$ and the domain $\Omega_i$ is handled by the MPI process of rank $i$, which is done in practice by an If statement on the MPI rank. Now consider the unknowns $g_{ji}$: every subdomain has two neighboring subdomains, except the first and the last ones ($\Omega_0$ and $\Omega_{N_{\rm dom}-1}$), which have only one adjacent subdomain. Every process will hence be in charge of either two Fields (the "interior" subdomains) or one Field (the first and last ones).
A simple choice for the global numbering of these Fields is to go from $0$ to $2N_{\rm dom} - 3$ with a step of $1$, as follows:

Pair $(j, i)$: $(1,0)$, $(0,1)$, $(2,1)$, $(1,2)$, $(3,2)$, $\ldots$, $(N_{\rm dom}-2, N_{\rm dom}-1)$
Global index: $0$, $1$, $2$, $3$, $4$, $\ldots$, $2N_{\rm dom}-3$

The contents of the local lists ListOfSubdomains and ListOfFields are then (the brackets $[\cdot]$ recall the list nature of the quantities): ListOfSubdomains $= [i]$ on process $i$; ListOfFields $= [0]$ on process $0$, $[2i-1, 2i]$ on the interior processes $0 < i < N_{\rm dom}-1$, and $[2N_{\rm dom}-3]$ on process $N_{\rm dom}-1$. A generalized input .pro file for this example is presented in Listing 11 (taken from the file Decomposition.pro). Let us point out the following remarks about the above implementation:

• The operation List += a appends a at the end of the list List; #List returns the size of the list List.
• The sizes of ListOfSubdomains and ListOfFields can differ from one MPI process to another.
• Listing 11 is not restricted to $N_{\rm dom}$ MPI processes. For example, with 4 subdomains and 3 MPI processes, processes 1 and 2 will be in charge of subdomains 1 and 2, while the process of rank 0 will be in charge of subdomains 0 and 3. In that case, for process 0, ListOfSubdomains = [0, 3] and ListOfFields = [0, 5]. Sequential execution (1 MPI process) is therefore also automatically handled.
• In general, and as in the example above, a Field represents one and only one unknown $g_{ji}$. There is however no such restriction, and a Field can actually represent two or more unknowns. This can lead to a simpler algorithmic implementation, but can degrade parallel performance as it increases the amount of communication (e.g. the $i$th process would send a Field containing both $g_{ji}$ and $g_{ki}$ to process $j$ instead of only $g_{ji}$, for $j, k \in D_i$, $j \neq k$).
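The construction of the two lists for the one-directional waveguide can be mimicked in a few lines of Python (a hypothetical sketch; variable names follow the text, and the Field numbering is the one chosen above):

```python
# Distribution of subdomains and (globally numbered) Fields to MPI
# processes for the 1D layered decomposition. Field (idom-1, idom)
# gets the global index 2*idom - 1 and Field (idom+1, idom) gets
# 2*idom, as in the text.

def distribute(N_DOM, MPI_Size, MPI_Rank):
    ListOfSubdomains, ListOfFields = [], []
    for idom in range(N_DOM):
        if idom % MPI_Size == MPI_Rank:        # round-robin assignment
            ListOfSubdomains.append(idom)
            if idom > 0:
                ListOfFields.append(2 * idom - 1)  # g_{idom-1, idom}
            if idom < N_DOM - 1:
                ListOfFields.append(2 * idom)      # g_{idom+1, idom}
    return ListOfSubdomains, ListOfFields
```

For 4 subdomains and 3 MPI processes, rank 0 obtains ListOfSubdomains = [0, 3] and ListOfFields = [0, 5], matching the example above; with as many processes as subdomains, each interior rank owns exactly two Fields.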

Enhancing the Communication Process
By default the communication of the Fields between the MPI processes is global and realized through a broadcast. This is clearly inefficient for large scale problems, and can be remedied by explicitly specifying the connections between the Fields, in which case efficient asynchronous MPI_Isend and MPI_Irecv communications are used. Two Fields are said to be "connected" when one is needed to compute the other. For example, the Fields storing $g_{ij}$ and $g_{ji}$ are connected (through equation (5)). For the simple one-directional waveguide, the connectivity of the Fields is as follows: Field $0$ is connected to Field $1$, Field $2$ to Field $3$, and so on, each even-numbered Field $2k$ being connected to the odd-numbered Field $2k+1$. In practice, the indices of these connected Fields are contained in a third list, named here ListOfConnectedFields. As suggested above, a Field can however store more than one unknown $g_{ji}$. In the waveguide example, the two unknowns $g_{01}$ and $g_{21}$ could be stored in a single Field numbered 1, without changing the other Fields' indices. In that case, Field 1 would be connected to the two Fields 0 and 3. ListOfConnectedFields is actually a concatenation of sublists, each one being preceded by its size.
This point is clarified by the following example, which completes the one from Section 3.2.1. The two lists ListOfSubdomains and ListOfFields remain unchanged, and a new list, named ListOfConnectedFields, is added, composed, for each Field handled by the process, of the number of connected Fields followed by their indices. A general implementation for layered decompositions is provided in Listing 12, taken from the file Decomposition.pro. This code is again written for an arbitrary number of MPI processes. Note that, to be used in the weak formulations, a Field must be accessed through the command ComplexScalarField or ComplexVectorField, depending on its nature. In Listing 12, to simplify, the connected Fields are stored in the quantities g_in (see also Listing 4).
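Continuing the sketch from Section 3.2.1, the size-prefixed ListOfConnectedFields can be assembled as follows (hypothetical Python; with the numbering above, the Field of even index $2k$ storing $g_{ji}$ is paired with the Field of index $2k+1$ storing $g_{ij}$):

```python
# Build the concatenation of size-prefixed sublists of connected
# Fields: for each Field owned by this process, append the number of
# connected Fields followed by their global indices.

def connected_fields(ListOfFields):
    ListOfConnectedFields = []
    for f in ListOfFields:
        partner = f + 1 if f % 2 == 0 else f - 1  # Field storing g_{ij}
        ListOfConnectedFields += [1, partner]     # sublist size, index
    return ListOfConnectedFields
```

On the interior process of rank 1 (Fields [1, 2]) this yields [1, 0, 1, 3]: Field 1 has one connected Field, namely 0, and Field 2 has one connected Field, namely 3.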

Handling the Iterative Solver
The iterative algorithm uses the built-in GetDDM function IterativeLinearSolver, which takes as argument the operations that implement the matrix-vector product. (Internally, IterativeLinearSolver is based on PETSc, which allows to transparently interface a large collection of parallel iterative solvers.) Listing 13 presents the Resolution object from the generic Schwarz.pro example file, using the lists described above. The implementation uses a series of Macros (called with Call), defined in the SchwarzMacros.pro file. The macros EnablePhysicalSources and DisablePhysicalSources modify the run-time variable $PhysicalSource introduced in Section 3.1 (and update the values of the physical constraints accordingly); the macros EnableArtificialSources and DisableArtificialSources modify the run-time variable $ArtificialSource. The three other macros (SolveVolumePDE, SolveSurfacePDE and UpdateSurfaceFields) solve the volume problems, solve the surface problems and update the Fields, respectively. They are presented in Listings 14, 15 and 16.
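The role of IterativeLinearSolver can be illustrated with a small matrix-free GMRES in Python (a self-contained sketch, not GetDDM's PETSc-based implementation): the operator is passed as a callback, exactly like the sequence of macros that implements the matrix-vector product in Listing 13.

```python
import math

# Minimal matrix-free GMRES: the operator is a callback apply_A(v),
# mirroring how IterativeLinearSolver receives the operations that
# implement the matrix-vector product.

def gmres(apply_A, b, tol=1e-10, maxiter=50):
    n = len(b)
    dot = lambda u, v: sum(x * y for x, y in zip(u, v))
    norm = lambda u: math.sqrt(dot(u, u))
    beta = norm(b)                      # x0 = 0, so r0 = b
    if beta == 0.0:
        return [0.0] * n
    V = [[x / beta for x in b]]         # Arnoldi basis
    H, cs, sn = [], [], []              # Hessenberg columns, Givens
    g = [beta]
    m = maxiter
    for j in range(maxiter):
        w = apply_A(V[j])
        h = []
        for i in range(j + 1):          # modified Gram-Schmidt
            hij = dot(w, V[i])
            h.append(hij)
            w = [wk - hij * vk for wk, vk in zip(w, V[i])]
        hnext = norm(w)
        h.append(hnext)
        for i in range(j):              # apply previous rotations
            t = cs[i] * h[i] + sn[i] * h[i + 1]
            h[i + 1] = -sn[i] * h[i] + cs[i] * h[i + 1]
            h[i] = t
        d = math.hypot(h[j], h[j + 1])  # new Givens rotation
        c, s = h[j] / d, h[j + 1] / d
        cs.append(c); sn.append(s)
        h[j], h[j + 1] = d, 0.0
        H.append(h)
        g.append(-s * g[j]); g[j] = c * g[j]
        if hnext > 1e-14:
            V.append([wk / hnext for wk in w])
        if abs(g[j + 1]) < tol or hnext <= 1e-14:
            m = j + 1
            break
    y = [0.0] * m                       # back substitution
    for i in range(m - 1, -1, -1):
        y[i] = (g[i] - sum(H[k][i] * y[k] for k in range(i + 1, m))) / H[i][i]
    return [sum(y[k] * V[k][i] for k in range(m)) for i in range(n)]

# Hypothetical 2x2 example: A = [[4, 1], [1, 3]], b = [1, 2]
x = gmres(lambda v: [4 * v[0] + v[1], v[0] + 3 * v[1]], [1.0, 2.0])
```

In GetDDM, the same pattern is expressed at the level of the Resolution object: at each iteration the Krylov solver (from PETSc) calls back the macros that exchange the Fields and solve the local volume and surface problems.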

Other Features
As seen above, GetDDM provides quite a general environment for the solution of Schwarz-type domain decomposition problems: the flexibility in the specification of discrete function spaces and weak formulations allows to handle a variety of physical problems and geometrical configurations; and the generic linear algebra (through IterativeLinearSolver) and communication mechanism (using Fields) permit to implement different variants of Schwarz methods. In addition to the basic functionality described above, the following features are also available:

Coarse grids and preconditioning. The IterativeLinearSolver can handle both left and right preconditioners: they can be specified in between a second pair of braces after the operations implementing the main matrix-vector product; see Listing 13. Preconditioners are a topic of high interest for domain decomposition methods, as it is well-known that a DDM without preconditioner (one-level DDM) is not scalable with respect to the number of subdomains. Such preconditioners are often referred to as coarse grids, a vocabulary coming historically from mechanics where, indeed, a coarse grid is built with e.g. one point per subdomain, and a global PDE is solved on it in order to transfer low-frequency information over the whole domain. While efficient preconditioners are well-known for Laplace-type PDEs, these classical coarse grids fail for wave-type problems. Designing preconditioners for high-frequency time-harmonic wave problems is an active research field, with recent promising advances in sweeping-type approaches [27,54,57]. GetDDM can be used to test these preconditioners, and several variants of sweeping preconditioners are implemented in the Schwarz.pro file, with various degrees of parallelization [58].
Overlapping decompositions. Using overlaps between the subdomains can significantly improve the convergence of DDMs [32,29], albeit at the additional cost of larger volume subproblems. Such overlapping decompositions are directly handled by GetDDM at the level of the geometrical description and the mesh; the only modification in the formulations is at the level of the surface field updates like (5), where the Neumann data must be evaluated explicitly since it can no longer be eliminated, the interfaces $\Sigma_{ij}$ and $\Sigma_{ji}$ being distinct from each other.
Non-conforming meshes. GetDDM makes it easy to handle non-conforming meshes (Mortar methods [7,10] or the NICEM method [31,42]), as the data between subdomains is exchanged in the form of Fields, which are interpolable. Non-conforming meshes can be useful in a variety of settings, e.g. for multi-physics couplings or adaptive simulations. Another practical interest is linked to parallel mesh generation: relaxing the conformity on the interfaces allows for coarse-grained, embarrassingly parallel mesh generation, where each subdomain is handled by its own CAD representation. The waveguide3d.geo file provides an example of such an approach.
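The exchange of interpolable Fields across non-matching interface meshes can be sketched in one dimension as follows (hypothetical helper names; actual Fields are interpolated with the finite element basis of their domain of definition):

```python
# Transfer nodal values of a piecewise-linear field from a source
# interface mesh to a non-matching destination mesh by interpolation.

def interp1d(xs, ys, x):
    # piecewise-linear interpolation of samples (xs, ys) at point x
    for i in range(len(xs) - 1):
        if xs[i] <= x <= xs[i + 1]:
            t = (x - xs[i]) / (xs[i + 1] - xs[i])
            return (1 - t) * ys[i] + t * ys[i + 1]
    raise ValueError("point outside the source mesh")

def transfer(xs_src, ys_src, xs_dst):
    return [interp1d(xs_src, ys_src, x) for x in xs_dst]

# Field y = 2x sampled on a coarse source mesh, evaluated on a finer,
# non-matching destination mesh:
vals = transfer([0.0, 0.5, 1.0], [0.0, 1.0, 2.0], [0.0, 0.25, 0.75, 1.0])
```

Because the exchanged quantity is a function rather than a vector of matched nodal values, the two interface discretizations never need to agree, which is what enables the embarrassingly parallel meshing workflow mentioned above.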
Other formulations and physical problems. In addition to the standard formulations presented in this paper for the Helmholtz and Maxwell problems, various other formulations can be readily implemented in the .pro files: mixed formulations (e.g. pressure-velocity formulations for Helmholtz), dual formulations (e.g. in terms of the magnetic field for Maxwell), etc. Time-dependent and non-linear problems can also be investigated using standard features of the underlying software, as can applications to other physical problems (e.g. elastodynamics or Schrödinger). Finally, several coupled, multi-physics problems (e.g. electro-mechanical) have been tested as well.
3. Open the models/ddm_waves/main.pro file with the File>Open menu.

4. Click on Run.
Different test-cases can be chosen by selecting the appropriate Problem in the menu to the left of the graphics window. Top-level parameters (type of transmission condition, number of subdomains, frequency, etc.) can be changed interactively in the menu as well. This interactive use of GetDDM is useful for testing, demonstration and visualization purposes. For actual, parallel computations, you will need to recompile the GetDDM source code for your particular computer architecture and MPI implementation. Detailed installation instructions are available on the website of the GetDDM project, at the address: http://onelab.info/wiki/GetDDM. Once compiled, GetDDM is then called from the command line, using (depending on your MPI setup) commands similar to the following (here for the waveguide3d test-case, on 100 CPUs):

mpirun -np 100 gmsh -setnumber N_DOM 100 waveguide3d.geo
mpirun -np 100 getdp -setnumber N_DOM 100 waveguide3d.pro -solve DDM

Sample scripts for running large scale computations on computer clusters using the SLURM or PBS job schedulers are also provided in the models/ddm_waves directory. In all cases (interactive, serial or parallel), the input files and the workflow are the same: a geometry (file.geo) is constructed, meshed and partitioned into subdomains using Gmsh; then the problem definition (file.pro) is analyzed and processed by GetDP. The geometry and problem definition files usually include a data file (file_data.geo) containing parameters common to Gmsh and GetDP. Figure 2 lists some of the complete examples available online, together with references to published work where additional information about the test-cases can be found: among others, problems with concentric or radial subdomains [13,25], guided acoustic or electromagnetic waves in rectangular waveguides [57], guided waves in the COBRA benchmark defined by the JINA98 workgroup [58], and acoustic waves in the underground Marmousi model [54]. Some of these examples can be solved in both the Helmholtz and Maxwell settings (e.g. waveguide3d or cylinder_concentric); others are only designed for the Helmholtz case (e.g. marmousi). With the default set of parameters, all these test-cases will run on standard laptop or desktop computers. But they can all also be scaled up to test the algorithms on massively parallel computers. For example, a Helmholtz problem on the marmousi example was run at frequency $\omega/2\pi = 700$ (uniformly discretized with a mesh density of 20 points per smallest wavelength), with 358 subdomains for a total of 4296 CPUs. The size of the full problem exceeded 2.3 billion unknowns. A Maxwell simulation of the waveguide3d model was performed with 3,500 subdomains on 3,500 CPUs, each with 2.6 GB of RAM, for 302,715,000 complex unknowns. For illustration purposes, Figure 3 presents some other cases that have been solved using GetDDM. Published references are provided, which contain further information about the specific test cases, mathematical models and numerical results.

Conclusion
This article introduced a new open source software, called GetDDM, for testing optimized Schwarz domain decomposition methods for time-harmonic wave problems. The software aims to provide a simple, flexible and ready-to-use environment where the weak formulations of the wave propagation problems and the associated transmission conditions for the Schwarz methods are transcribed naturally, with a direct link to their symbolic mathematical representation. The code and a variety of examples for both Helmholtz and Maxwell problems are freely available online for further testing, at the address: http://onelab.info/wiki/GetDDM. Depending on the chosen parameters (frequency, mesh refinement, number of subdomains, etc.) the same files can run with a few thousand unknowns on laptop computers or even tablets, or with billions of unknowns on massively parallel computer clusters.
In addition to the application of GetDDM to other physical problems like elastodynamics or the Schrödinger equation, ongoing work includes the development of a better interface to automatic mesh partitioning tools, as well as the investigation of techniques to handle cross-points and cross-edges in a scalable manner. These new developments will undoubtedly result in interesting new applications, and further online examples on the GetDDM web site.

Listing 11:
Distribution of the subdomains and the Fields to the different MPI processes; the number of MPI processes can be different from the number of subdomains. (Code extracted from the file Decomposition.pro.)

ListOfSubdomains = {}; // the subdomains that I'm in charge of
ListOfFields = {}; // my fields
For idom In {0: N_DOM -1}
  If ( idom % MPI_Size == MPI_Rank )
    ListOfSubdomains += idom;
    If (idom > 0) ListOfFields += 2*idom - 1; EndIf // field (idom-1, idom)
    If (idom < N_DOM - 1) ListOfFields += 2*idom; EndIf // field (idom+1, idom)
  EndIf
EndFor

Listing 16:
Macro to update the surface fields.

Figure 1: Graphical user interface of GetDDM. (Displayed test-case is cobra, with PML transmission conditions.)