Alternating direction method of multiplier for the unilateral contact problem with an automatic penalty parameter selection

We propose an alternating direction method of multiplier (ADMM) for the uni-lateral (frictionless) contact problem with an optimal parameter selection. We ﬁrst introduce an auxiliary unknown to seprate the linear elasticity subproblem from the unilateral contact condition. Then an alternating direction is applied to the corresponding augmented Lagrangian. By eliminating the primal and auxiliary unknowns, at the discrete level, we derive a pure dual algorithm, starting point for the convergence analysis and the optimal parameter approximation. Numerical experiments are proposed to illustrate the eﬃciency of the proposed (optimal) penalty parameter selection method.


Introduction
Unilateral contact (frictionless) problem is a very common problem in engineering and poses serious challenges.It differs from the classical linear elasticity only by the presence of a linear constraint (the non-penetration condition).Various numerical methods have been developed, see, e.g., [8,6,7,9,13,14] and references therein.The method proposed in [9] is an alternating direction method of multiplier (ADMM) and it is related to augmented Lagrangian operator splitting methods [2,3].The main idea is to separate the differentiable part of the problem (i.e.linear elasticity) from the nondifferentiable part (i.e. the non-penetration condition) by introducing an auxiliary unknown.Applying an alternating direction method to the corresponding augmented Lagrangian leads to a simple two-step iterative method in which the matrix of the linear elasticity is constant.This property makes ADMM algorithm outperforms, see [9], another optimization based method, the popular active-set semi-smooth Newton method [6,7,13].
Alternating direction method of multiplier is a powerful operator-splitting algorithm for solving structured optimization problems encoutoured in a wide range of areas such as compressed sensing [15], image restoration [12], machine learning [1], etc.The method was initiated by Glowinski et al. [4,2,3], who systematically developed ADMM algorithms (also known as Uzawa block relaxation methods) for solving nonlinear partial differential equations.But ADMM based algorithms have a severe drawback: the choice of the penalty

The model problem and ADMM algorithm
We consider an elastic body occupying in its initial (undeformed) configuration a bounded domain Ω of R d (d = 2, 3) with a boundary Γ = Γ D ∪Γ C .We assume that the elastic body is fixed along Γ D with meas(Γ D ) > 0. Γ C denotes a portion of Γ which is a candidate contact surface between Ω and a rigid foundation.The normalized gap between Γ C and the rigid foundation is denoted by g.In this paper, we consider the small strains hypothesis so that the strain tensor is E(u) = (∇u+∇u t )/2, where u = (u 1 (x), . . ., u d (x)) is the displacement field.Hooke's law is assumed, i.e. the stress tensor is linked to the displacement through the linear relation where C = (C ijkl ) is the (fourth order) elastic moduli tensor, assumed to be symmetric positive definite.Let n be the outward unit normal to Ω on Γ.We consider the normal component of the displacement field and the stress tensor given by The unilateral contact problem consists, for a given volume force f , of finding the displacement field u satisfying (i) the equilibrium equations (ii) and the contact (i.e.non-penetration) conditions Let us introduce the functions space •) be the symmetric, continuous and coercive bilinear form which corresponds to the virtual work in the elastic body We denote by f (•) the linear form of external forces Using the potential energy functional we can formulate the unilateral contact problem as the constrained minimization problem To achieve a solution of (2.4) by an ADMM algorithm, we need additional steps.Following Glowinski and Le Tallec [3], we introduce the set and its characteristic functional (2.5) With (2.5)-(2.6)we associate the augmented Lagrangian functional where r > 0 is the penalty parameter.ADMM algorithm for (2.4) is based on (2.7).The method performs successive minimizations in u and p followed by the multiplier λ update as follows ) (2.10) Minimization (2.8) leads to an equilibrium problem which always has a unique solution even though the original problem allows rigid body motions.Minimization (2.9) is solved explicitly.The corresponding ADMM algoririthm is presented in Algorithm 1 (see [9] for detailed derivation).We iterate until the relative error on u k and p k becomes "sufficiently" small.
Algorithm 1 ADMM algorithm for the unilateral contact problem Initialization.φ 0 and λ 0 are given.

Finite dimensional algorithms
We assume that Ω ⊂ R d is a polyhedral domain and therefore can be exactly triangulated.We consider a (continuous piecewise linear) finite element triangulation T h of Ω consistent with the decomposition of its boundary Γ into Γ D and Γ C .If n is the number of nodes of the triangulation, the dimension of the finite element subsapce V h ⊂ V is dimV h = dn.Let x j (j = 1, . . ., m) be a contact node (i.e.x j lies on Γ C ).The displacement vector and the unit outward normal at x j are denoted by u j and n j , respectively.We introduce the linear mappings N : R dn → R m , such that Nu is the vector of the normal components of u at contact nodes, that is, (Nu) j = u j n j , j = 1, . . ., m.
The finite element discretization leads to the following matrices and vectors • A, (dn) × (dn) stiffness matrix (symmetric positive definite), i.e. from the bilinear form a(u, v); • M normal mass matrices (m × m symmetric positive definite); • f ∈ R dn the (discrete) external forces; • g ∈ R m the (discrete) normalized gap; • p ∈ R m the (discrete) auxiliary unknown; • λ ∈ R m the (discrete) lagrange multiplier.
With the notations above, the augmented Lagrangian (2.7) now reads in the discrete setting where

Discrete ADMM algorithm
The derivation of the discrete version of Algorithm 1 from the augmented Lagrangian function (3.1) is straigthforward, Algorithm 2.
Algorithm 2 Algebraic ADMM algorithm for the unilateral contact problem Initialization k = 0 r > 0, p 0 and λ 0 are given.
Step 1 Compute u k+1 ∈ V h such that Step 2 Compute p k+1 ∈ P h Step 3 Update Lagrange multiplier

ADMM algorithm without the auxiliary unknown
Algorithm 2 can be simplified by eliminating the auxiliary unknown p. Indeed, using the update formula (3.3) we have , the right-hand side of (3.4) becomes e., we have eliminated the auxiliary unknown p from the right-hand side of (3.2).It remains to remove p from the multiplier update formula.From (3.3) we deduce that (3.5) Substituting (3.5) into the Lagrange multiplier update formula, we obtain the new multiplier update formula λ k+1 Gathering the results above, the ADMM algorithm without the auxiliary unknown p is described in Algorithm 3. In Algorithm 3, λ 0 + and λ 0 − must be consistent.Indeed, if, e.g., u 0 = 0, then we can set λ 0 + = r(Nu 0 − g) Algorithm 3 ADMM without auxiliary unknown for the unilateral contact problem Initialization k = 0 r > 0, λ 0 + and λ 0 − are given.
Step 1 Compute u k+1 such that Step 2 Update the Lagrange multipliers

Pure dual version
The pure dual version of the ADMM algorithm for the unilateral contact problem is obtained by eliminating the displacements vector u.From (3.6) we have where we have set Substituting u k+1 into the Lagrange multiplier update formulas, we obtain Algorithm 4.
Algorithm 4 is unpraticable since in A −1 r can be a full matrix of large size.However, the recursive relation (3.7)-(3.8) is important because it gives an easy way to prove the convergence of the ADMM algorithm.Indeed, Algorithm 4 is useful as a theoretical basis for the study of the influence of r.

Convergence
Since min(0, x) = − max(0, −x), the recursive relation (3.7)-(3.8)can be rewritten in the useful form Since for any x ∈ R n x + ≤ x we have so that where we have set The convergence of Algorithm 4 therefore depends on the spectral radius of A. We have the following proposition for the non-zeros eigenvalues of A.

Proof. Let us set
We have to show that the non-zero eigenvalues of A are eigenvalues of r N M and conversely.Let α be an eigenvalue of A and (X 1 , X 2 ) the corresponding eigenvector.We have that is, α(X 1 + X 2 ) = 0.For non-zero eigenvalues we deduce that the eigenvectors are such that X 1 = −X 2 .Substituting in (4.5)-(4.6)we obtain We deduce that α is an eigenvalue of A 1 − A 2 .From the relations above, it is obvious that non-zero eigenvalues of A 1 − A 2 are also eigenvalues of A. 2 To study the eigenvalues of I − 2rNA −1 r N M, we consider the following sequence We follow the same procedure as [3, p. 46] for a quadratic programming problem with equality constraints.From (4.7), we deduce that By introducing the auxiliary unknown η k = A −1 N Mµ k , the sequence (4.9) becomes The convergence of the sequence {η k } will gives us informations about the convergence of the sequence {µ k } and then as polynomial.Then the eigenvalues of B are polynomial functions of the eigenvalues of A −1 N MN and thus λ i (B) = p(λ i (A −1 N MN), that is The eigenvalues (and eigenvectors) of A −1 N MN are solution to the generalized symmetric eigenvalue problem N MNw = λAw Since A is symmetric positive definite and N MN is symmetric positive semi-definite, the eigenvalues of A −1 N MN are non-negative, and the eigenvectors corresponding to two distinct eigenvalues are A-otthogonal.It follows that the eigenvalues of B are such that 0 ≤ λ i (B) ≤ 1, ∀i. (4.12) Note that A and M are square and non singular matrices.Then if 0 is an eigenvalue of A −1 N MN , then the corresponding eigen-subspace is ker(N).Thus R(A −1 N M) is spanned by the eigenvectors of A −1 N MN associated with strictly positive eigenvalues (using the property R(N) = (ker(N)) ⊥ ).Since {η k } ⊂ R(A −1 N M), we can write (4.10) in a basis of R(A −1 N M) formed with eigenvectors w i of A −1 N MN associated with strictly positive eigenvalues λ i .We get We deduce the following convergence theorem.Corollary 4.3 (optimal step-size and convergence rate) The optimal penalty parameter for r is where λ m and λ M are, respectively, the smallest nonzero eigenvalue and the largest eigenvalue of A −1 N MN.With the choice (4.13), the convergence of Algorithm 4 is linear with an asymptotic constant θ satisfying We obtain the following corollary.
Corollary 4.4 If N MN is nonsingular, then the eigenvalues of A −1 N MN are strictly positive and Algorithm 4 with the optimal choice (4.13) converges linearly with an asymptotic constant θ satisfying

Penalty parameter approximation
We now study how to compute λ m and λ M , the smallest nonzero eigenvalue and the largest eigenvalue of A −1 N MN.For this purpose, it is convenient to arrange the indices in such a way that the contact indices C and the interior indices I occur in a consecutive order.This leads to the block matrix representation of the stiffness matrix as where O II , O IC and O CI are matrices of zeros; and M CC is the contact boundary (diagonal) mass matrix.Consequently, if we set then the nonzero eigenvalues of A −1 N MN are eigenvalues of ÃCC M CC .To compute ÃCC , we can apply block-Gaussian elimination to A with A II as block-pivot It follows that ÃCC is the Schur complement of A II in A.
We now detail the practical steps for the approximation of the penalty parameter.Since computing A −1 II is unpracticable even for medium size problem, we compute ÃCC by solving equivalent linear systems.We obtain Algorithm 5.
Algorithm 5 Algorithm for computing the approximate penalty parameter r * Step 1. Solve for X the system A II X = A IC (5.1) Step 2. Compute the schur complement of Step 4. Compute λ m and λ M the exterme eigenvalues of X and the optimal penalty parameter approximation (4.13).
Solving (5.1) or (5.2) is equivalent to solving several linear systems that have the same matrix but different right-hand sides.Since the different right-hand sides are available, all the systems of equations can be solved at the same time using a single Gaussian elimination (e.g. using MATLAB \ operator : X = S\M CC for (5.2)).

Numerical experiments
We have implemented the algorithms described in the previous sections in MATLAB, using piece-wise linear finite element, vectorized assembling functions and the mesh generator provided in [10,11], on a computer equipped running Linux (Ubuntu 16.04) with 3.00GHz clock frequency and 32GB RAM.The ADMM solver used in the experiments is based on Algorithm 1.The test problems used are designed in order to illustrate the behavior of the algorithms more than to model contact actual phenomena.Since our approximation procedure combines matrix inversion and computation of extreme eigenvalues, we restrict our penalty parameter estimation to small size problems.
We use the following notations • r * approximate optimal penalty parameter obtained by (4.13).
• r standard numerical optimal penalty parameter obtained using the uniform sampling of interval (.1, αE), where α > 1 and E is the Young modulus.
• m c the size of matrix ÃCC M CC used for computing r * .To reduce the computational cost m c ≤ 100 in the numerical experiments.
• Iter.The number of iterations required for convergence in Algorithm 2.
• CPU Time.CPUT time in Seconds.

Example 1: 2D problem
We consider a rectangular elastic body Ω = (0, 2) × (0, 1) with the boundary partition  We first consider the meshes of size h=1/8 (Figure 1), 1/16 and 1/32.We compare the approximate penalty parameter r * obtained with (4.13) with the numerical penalty parameter r obtained by sampling the interval (0.1, 10) * E with 200 points uniformly spaced.The evolution of the number of iterations with respect to the penalty parameter is shown in Figure 2. The normal stress distribution on Γ C is shown in Figure 3   We summarize in Table 1 the (optimal) penalty parameters and the behavior of Al-gorithm 2 in terms of CPU time and number of iterations.Note that the CPU time for r * includes the time spent in both Algorithm 2 and Algorithm 5.The CPU time for r is the computational time of the whole sampling procedure.We can notice that the sampling process gives better results in terms of the number of iterations required by the ADMM solver.But sampling is a time-consuming process.For the largest problem (h = 1/32), obtaining the optimal penalty parameter (4.13) with Algorithm 5 is about 80 times faster than the standard sampling procedure.Table 2 shows that, for r * = 3134.1289and r = 3510 (obtained using a mesh of size 1/32), the performances of Algorithm 2 are almost comparable for large size meshes. .

Example 2: Hertz problem
The Hertz problem is a classical test problem in the numerical simulation of unilateral contact problem.It consists of an infinitely long cylinder resting in a rigid foundation, and subjected to a uniform load along its top of intensity f = −(0, 1600).The radius of the cylinder is R = 8.The material constants for the cylinder are E = 2000 and ν = 0.3.For symmetry reason, only quarter of the cylinder is consider (Figure 4) and we set u 1 = 0 on Γ D = {0} × (0, 8).The contact surface is Γ C = {x ∈ (0, 8) 2 | x 2 1 + x 2 2 = 16}.We first consider meshes with 442 and 1692 with 35 and 69 nodes on Γ C , respectively.Note that the problem allows a rigid body motion in the vertical direction.In ADMM Algorithm 2, the mass terms provided by the normal integral prevent infinite displacements at the initial step.But for Algorithm 5, A II is only positive semidefinite and then non invertible.We simply add a small positive constant on the vertical components of Γ D .As in Example 1, we compute r by sampling (.1, 10) * E with 200 points, uniformly spaced.Figure 5 shows the variation of the number of iterations with respect to the penalty parameter.The deformed configuration and the stress distribution on the contact Γ C are shown in Figure 6 and 7, respectively.We summarize in Table 3 the (optimal) penalty parameters and the behavior of Algorithm 2 in terms of CPU time and number of iterations.For m c = 35, the sampling procedure gives better results in terms of the number of iterations for the ADMM solver but the computational time is much greater.For m c = 69 the number of iterations is almost equivalent for both r and r * but the computational time is again much greater for r.We report in Table 4 the performances of Algorithm 2 using r * = 201.7825and r = 200.We can notice that the performances of Algorithm 2 using both penalty parameters are almost equivalent.

Example 3: 3D problem
We now study the behavior of a 3D rectangular elastic body pressed onto a solid hemisphere, Figure 8 ( [13,9]).The elastic body occupies the domain Ω = (−0.5,0.5)×(−1, 1)× (−0.5, 0.5) and the obstacle is a half-ball with radius r = 0.5 and center (−0.3, 0, −1).The  We first consider a non uniform mesh with 198 nodes, 773 tetrahedrons and 45 nodes on Γ C .The deformed configuration obtained using ADMM Algorithm 2 is shown in Figure 9 and the normal (contact) stress distribution in Figure 10.For r, we sample (0.1 10 * E) with 200 points, uniformly spaced.We summarize in Table 5 the computed penalty parameters r * and r and the behavior of Algorithm 2 in terms of number of iterations and CPU time.Even though r * is 20% larger than r, the number of iterations required by the ADMM solver with both values is almost the same.We notice again that computing the penalty parameter with Algorithm 5 is about 100 times faster.reported in Table 6.As in the 2D case, the number of iterations is virtually independent of the mesh size, for both penalty parameters.We can notice that, for both penalty parameter, the difference of computational time for the largest problem is less than 5%.The performances of Algorithm 2 with r * and r are therefore comparable.

Proposition 4 . 1
The non-zero eigenvalues of A are eigenvalues of I − 2rNA −1 r N M and conversely.

Figure 2 :
Figure 2: Number of iterations versus the penalty parameter for Example 1.

Figure 3 :
Figure 3: Normal Stress distribution on Γ C for Example 1.

Figure 4 :
Figure 4: Mesh sample for the Hertz problem.

Figure 5 :
Figure 5: Number of iterations versus penalty parameter for the hertz problem.

Figure 6 :
Figure 6: Deformed configuration and Von Mises effective stress for the Hertz problem

Figure 8 :
Figure 8: Geometry of the 3D obstacle problem

Figure 9 :
Figure 9: Deformed configuration and Von Mises effective stress for Example 3.

Table 1 :
r obtained by sampling Versus r * obtained by (4.13).

Table 3 :
r obtained by sampling Versus r * obtained by (4.13), for the Hertz problem.

Table 4 :
Performances of Algorithm 2 with r * = 201.7825and r = 200 for the Hertz problem .

Table 5 :
r obtained by sampling Versus r * obtained by (4.13) for the 3D obstacle problem To study the beahivior of Algorithm 2, the initial mesh of 198 nodes and 773 tetrahedrons is successively refined to produce meshes with 1289, 9245 and 69897 nodes and 6144, 49472 and 395776 tetrahedrons, respectively.The performances of Algorithm 2 areFigure 10: Normal stress distribution on Γ C Number of nodes on Ω/Γ C 198/45 1289/153 9245/603 69897/2341

Table 6 :
Performances of Algorithm 2 with r and r * for the 3D obstacle problem.