Reliable Real-Time Solution of Parametrized Partial Differential Equations: Reduced-Basis Output Bound Methods

We present a technique for the rapid and reliable prediction of linear-functional outputs of elliptic (and parabolic) partial differential equations with affine parameter dependence. The essential components are (i) (provably) rapidly convergent global reduced-basis approximations - Galerkin projection onto a space W_N spanned by solutions of the governing partial differential equation at N selected points in parameter space; (ii) a posteriori error estimation - relaxations of the error-residual equation that provide inexpensive yet sharp and rigorous bounds for the error in the outputs of interest; and (iii) off-line/on-line computational procedures - methods which decouple the generation and projection stages of the approximation process. The operation count for the on-line stage - in which, given a new parameter value, we calculate the output of interest and associated error bound - depends only on N (typically very small) and the parametric complexity of the problem; the method is thus ideally suited for the repeated and rapid evaluations required in the context of parameter estimation, design, optimization, and real-time control.


Introduction
The optimization, control, and characterization of an engineering component or system requires the prediction of certain ''quantities of interest,'' or performance metrics, which we shall denote outputs - for example, deflections, maximum stresses, maximum temperatures, heat transfer rates, flowrates, or lifts and drags. These outputs are typically expressed as functionals of field variables associated with a parametrized partial differential equation which describes the physical behavior of the component or system. The parameters, which we shall denote inputs, serve to identify a particular ''configuration'' of the component: these inputs may represent design or decision variables, such as geometry - for example, in optimization studies; control variables, such as actuator power - for example, in real-time applications; or characterization variables, such as physical properties - for example, in inverse problems. We thus arrive at an implicit input-output relationship, evaluation of which demands solution of the underlying partial differential equation.
Our goal is the development of computational methods that permit rapid and reliable evaluation of this partial-differential-equation-induced input-output relationship in the limit of many queries - that is, in the design, optimization, control, and characterization contexts. The ''many queries'' limit has certainly received considerable attention: from ''fast loads'' or multiple right-hand-side notions (e.g., Chan and Wan [1], Farhat et al. [2]) to matrix perturbation theories (e.g., Akgun et al. [3], Yip [4]) to continuation methods (e.g., Allgower and Georg [5], Rheinboldt [6]). Our particular approach is based on the reduced-basis method, first introduced in the late 1970s for nonlinear structural analysis (Almroth et al. [7], Noor and Peters [8]), and subsequently developed more broadly in the 1980s and 1990s (Balmes [9], Barrett and Reddien [10], Fink and Rheinboldt [11], Peterson [12], Porsching [13], Rheinboldt [14]). The reduced-basis method recognizes that the field variable is not, in fact, some arbitrary member of the infinite-dimensional solution space associated with the partial differential equation; rather, it resides, or ''evolves,'' on a much lower-dimensional manifold induced by the parametric dependence.
The reduced-basis approach as earlier articulated is local in parameter space in both practice and theory. To wit, Lagrangian or Taylor approximation spaces for the low-dimensional manifold are typically defined relative to a particular parameter point; and the associated a priori convergence theory relies on asymptotic arguments in sufficiently small neighborhoods (Fink and Rheinboldt [11]). As a result, the computational improvements - relative to conventional (say) finite element approximation - are often quite modest (Porsching [13]). Our work differs from these earlier efforts in several important ways: first, we develop (in some cases, provably) global approximation spaces; second, we introduce rigorous a posteriori error estimators; and third, we exploit off-line/on-line computational decompositions (see Balmes [9] for an earlier application of this strategy within the reduced-basis context). These three ingredients allow us, for the restricted but important class of ''parameter-affine'' problems, to reliably decouple the generation and projection stages of reduced-basis approximation, thereby effecting computational economies of several orders of magnitude.
In this expository review paper we focus on these new ingredients. In Section 2 we introduce an abstract problem formulation and several illustrative instantiations. In Section 3 we describe, for coercive symmetric problems and ''compliant'' outputs, the reduced-basis approximation; and in Section 4 we present the associated a posteriori error estimation procedures. In Section 5 we consider the extension of our approach to noncompliant outputs and nonsymmetric operators; eigenvalue problems; and, more briefly, noncoercive operators, parabolic equations, and nonaffine problems. A description of the system architecture in which these numerical objects reside may be found in Veroy et al. [15].

Problem Statement

Abstract Formulation.
We consider a suitably regular domain Ω ⊂ R^d, d = 1, 2, or 3, and associated function space X ⊂ H^1(Ω), where H^1(Ω) = {v ∈ L^2(Ω), ∇v ∈ (L^2(Ω))^d}, and L^2(Ω) is the space of square-integrable functions over Ω. The inner product and norm associated with X are given by (·,·)_X and ‖·‖_X = (·,·)_X^{1/2}, respectively. We also define a parameter set D ⊂ R^P, a particular point in which will be denoted μ. Note that Ω does not depend on the parameter.
We then introduce a bilinear form a: X × X × D → R and linear forms f: X → R, ℓ: X → R. We shall assume that a is continuous,

a(w,v;μ) ≤ γ(μ) ‖w‖_X ‖v‖_X ≤ γ_0 ‖w‖_X ‖v‖_X, ∀w,v ∈ X, ∀μ ∈ D;

furthermore, in Sections 3 and 4, we assume that a is coercive,

0 < α_0 ≤ α(μ) ≡ inf_{w ∈ X} a(w,w;μ)/‖w‖_X^2, ∀μ ∈ D,

and symmetric, a(w,v;μ) = a(v,w;μ), ∀w,v ∈ X, ∀μ ∈ D. We also require that our linear forms f and ℓ be bounded; in Sections 3 and 4 we additionally assume a ''compliant'' output, ℓ(v) = f(v), ∀v ∈ X.

We shall also make certain assumptions on the parametric dependence of a, f, and ℓ. In particular, we shall suppose that, for some finite (preferably small) integer Q, a may be expressed as

a(w,v;μ) = Σ_{q=1}^{Q} σ^q(μ) a^q(w,v), ∀w,v ∈ X, ∀μ ∈ D,   (2)

for some σ^q: D → R and a^q: X × X → R, q = 1, ..., Q. This ''separability,'' or ''affine,'' assumption on the parameter dependence is crucial to computational efficiency; however, certain relaxations are possible - see Section 5.3.3. For simplicity of exposition, we assume that f and ℓ do not depend on μ; in actual practice, affine dependence is readily admitted. Our abstract problem statement is then: for any μ ∈ D, find s(μ) ∈ R given by

s(μ) = ℓ(u(μ)),   (3)

where u(μ) ∈ X is the solution of

a(u(μ),v;μ) = f(v), ∀v ∈ X.   (4)
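As a small illustration of how the affine assumption (2) is exploited in practice, consider the following Python sketch; the 2×2 matrices and the coefficient functions σ^q are hypothetical stand-ins (not taken from any particular discretization), but they show how A(μ) is assembled from precomputed parameter-independent pieces:

```python
import numpy as np

# Hypothetical affine decomposition with Q = 2 terms:
#   a(w, v; mu) = sigma_1(mu) * a_1(w, v) + sigma_2(mu) * a_2(w, v).
# In practice A1, A2 would come from a finite element discretization;
# here they are small illustrative stand-ins.
A1 = np.array([[2.0, -1.0], [-1.0, 2.0]])   # e.g., stiffness on subdomain 1
A2 = np.array([[1.0, 0.0], [0.0, 1.0]])     # e.g., stiffness on subdomain 2

def sigma(mu):
    """Parameter-dependent coefficients sigma_q(mu) (illustrative choice)."""
    return np.array([1.0, mu])  # sigma_1 = 1, sigma_2 = mu (conductivity ratio)

def assemble(mu):
    """A(mu) = sum_q sigma_q(mu) * A_q -- the affine assembly."""
    s = sigma(mu)
    return s[0] * A1 + s[1] * A2

# Assembling for a new mu costs only Q scaled matrix sums, independent
# of however expensively A1 and A2 were originally computed.
A = assemble(0.5)
```

The same separation is what later permits the off-line/on-line decomposition: the a^q are touched once, off-line; only the Q coefficients σ^q(μ) are evaluated on-line.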
In the language of the Introduction, a is our partial differential equation (in weak form), μ is our parameter, u(μ) is our field variable, and s(μ) is our output. For simplicity of exposition, we may on occasion suppress the explicit dependence on μ.

Particular Instantiations.
We indicate here a few instantiations of the abstract formulation; these will serve to illustrate the methods (for coercive, symmetric problems) of Sections 3 and 4.

A Thermal Fin.
In this example, we consider the two- and three-dimensional thermal fins shown in Fig. 1; these examples may be (interactively) accessed on our web site.¹ The fins consist of a vertical central ''post'' of conductivity k̃^0 and four horizontal ''subfins'' of conductivity k̃^i, i = 1, ..., 4. The fins conduct heat from a prescribed uniform flux source q″ at the root Γ̃_root through the post and large-surface-area subfins to the surrounding flowing air; the latter is characterized by a sink temperature ũ_0 and prescribed heat transfer coefficient h̃. The physical model is simple conduction: the temperature field in the fin, ũ, satisfies

Σ_{i=0}^{4} k̃^i ∫_{Ω̃^i} ∇ũ · ∇ṽ + ∫_{∂Ω̃\Γ̃_root} h̃ (ũ − ũ_0) ṽ = ∫_{Γ̃_root} q″ ṽ, ∀ṽ ∈ X̃,   (5)

where Ω̃^i is that part of the domain with conductivity k̃^i, and ∂Ω̃ denotes the boundary of Ω̃. We now (i) nondimensionalize the weak equations (5), and (ii) apply a continuous piecewise-affine transformation from Ω̃ to a fixed (μ-independent) reference domain Ω (Maday et al. [16]). The abstract problem statement (4) is then recovered for μ = {k^1, k^2, k^3, k^4, Bi, L, t}, D = [0.1,10.0]^4 × [0.01,1.0] × [2.0,3.0] × [0.1,0.5], and P = 7; here k^1, ..., k^4 are the thermal conductivities of the ''subfins'' (see Fig. 1) relative to the thermal conductivity of the fin base; Bi is a nondimensional form of the heat transfer coefficient; and L, t are the length and thickness of each of the ''subfins'' relative to the length of the fin root Γ_root. It is readily verified that a is continuous, coercive, and symmetric; and that the ''affine'' assumption (2) obtains for Q = 16 (two-dimensional case) and Q = 25 (three-dimensional case).

¹FIN2D: http://augustine.mit.edu/fin2d/fin2d.pdf and FIN3D: http://augustine.mit.edu/fin3d_1/fin3d_1.pdf

Fig. 1 Two- and three-dimensional thermal fins

Note that the geometric variations are reflected, via the mapping, in the σ^q(μ). For our output of interest, s(μ), we consider the average temperature of the root of the fin nondimensionalized relative to q″, k̃^0, and the length of the fin root. This output may be expressed as

s(μ) = ℓ(u(μ)), with ℓ(v) = ∫_{Γ_root} v.

It is readily shown that this output functional is bounded and also ''compliant'': ℓ(v) = f(v), ∀v ∈ X.

A Truss Structure.
We consider a prismatic microtruss structure (Evans et al. [17], Wicks and Hutchinson [18]) shown in Fig. 2; this example may be (interactively) accessed on our web site.² The truss consists of a frame (upper and lower faces, in dark gray) and a core (trusses and middle sheet, in light gray). The structure transmits a force per unit depth F̃ uniformly distributed over the tip of the middle sheet Γ̃_3 through the truss system to the fixed left wall Γ̃_0. The physical model is simple plane-strain (two-dimensional) linear elasticity: the displacement field ũ_i, i = 1, 2, satisfies

∫_{Ω̃} (∂ṽ_i/∂x̃_j) Ẽ_{ijkl} (∂ũ_k/∂x̃_l) = −F̃ ∫_{Γ̃_3} ṽ_2, ∀ṽ ∈ X̃,   (6)

where Ω̃ is the truss domain, Ẽ_{ijkl} is the elasticity tensor, and X̃ refers to the set of functions in H^1(Ω̃) which vanish on Γ̃_0. We assume summation over repeated indices. We now (i) nondimensionalize the weak equations (6), and (ii) apply a continuous piecewise-affine transformation from Ω̃ to a fixed (μ-independent) reference domain Ω. The abstract problem statement (4) is then recovered for μ = {t_f, t_t, H, θ}, D = [0.08,1.0] × [0.2,2.0] × [4.0,10.0] × [30.0°,60.0°], and P = 4. Here t_f and t_t are the thicknesses of the frame and trusses (normalized relative to t_c), respectively; H is the total height of the microtruss (normalized relative to t_c); and θ is the angle between the trusses and the faces. The Poisson's ratio, ν = 0.3, and the frame and core Young's moduli, Ẽ_f = 75 GPa and Ẽ_c = 200 GPa, respectively, are held fixed. It is readily verified that a is continuous, coercive, and symmetric; and that the ''affine'' assumption (2) obtains for Q = 44.
Our outputs of interest are (i) the average downward deflection (compliance) at the core tip, Γ_3, nondimensionalized by F̃/Ẽ_f; and (ii) the average normal stress across the critical (yield) section denoted Γ_1^s in Fig. 2. These compliant and noncompliant outputs can be expressed as s_1(μ) = ℓ_1(u(μ)) and s_2(μ) = ℓ_2(u(μ)), respectively, where ℓ_1(v) = −∫_{Γ_3} v_2, and where ℓ_2 is expressed as a domain integral over the region Ω^s adjacent to the yield section by means of a ''lifting'' function χ; both are bounded linear functionals. Here χ_i is any suitably smooth function in H^1(Ω^s) such that χ_i n_i = 1 on Γ_1^s and χ_i n_i = 0 on Γ_2^s, where n is the unit normal. Note that s_1(μ) is a compliant output, whereas s_2(μ) is ''noncompliant.''

Reduced-Basis Approach
We recall that in this section, as well as in Section 4, we assume that a is continuous, coercive, symmetric, and affine in μ - see (2); and that ℓ(v) = f(v), ∀v ∈ X, which we denote ''compliance.''

Reduced-Basis Approximation.
We first introduce a sample in parameter space, S_N = {μ_1, ..., μ_N}, where μ_i ∈ D, i = 1, ..., N; see Section 3.2.2 for a brief discussion of point distribution. We then define our Lagrangian (Porsching [13]) reduced-basis approximation space as W_N = span{ζ_n ≡ u(μ_n), n = 1, ..., N}, where u(μ_n) ∈ X is the solution to (4) for μ = μ_n. In actual practice, u(μ_n) is replaced by an appropriate finite element approximation on a suitably fine truth mesh; we shall discuss the associated computational implications in Section 3.3. Our reduced-basis approximation is then: for any μ ∈ D, find s_N(μ) = ℓ(u_N(μ)), where u_N(μ) ∈ W_N is the Galerkin projection,

a(u_N(μ), v; μ) = f(v), ∀v ∈ W_N.   (7)

Non-Galerkin projections are briefly described in Section 5.3.1.

A Priori Convergence Theory.
3.2.1 Optimality. We consider here the convergence rate of u_N(μ) → u(μ) and s_N(μ) → s(μ) as N → ∞. To begin, it is standard to demonstrate optimality of u_N(μ) in the sense that

‖u(μ) − u_N(μ)‖_X ≤ √(γ(μ)/α(μ)) inf_{w_N ∈ W_N} ‖u(μ) − w_N‖_X.   (8)

(We note that, in the coercive case, stability of our (''conforming'') discrete approximation is not an issue; the noncoercive case is decidedly more delicate (see Section 5.3.1).) Furthermore, for our compliance output,

s(μ) − s_N(μ) = a(e(μ), e(μ); μ),   (9)

where e(μ) ≡ u(μ) − u_N(μ), from symmetry and Galerkin orthogonality. It follows that s(μ) − s_N(μ) converges as the square of the error in the best approximation and, from coercivity, that s_N(μ) is a lower bound for s(μ).

Best Approximation.
It now remains to bound the dependence of the error in the best approximation as a function of N. At present, the theory is restricted to the case in which P = 1, D = [0, μ_max], and

a(w,v;μ) = a_0(w,v) + μ a_1(w,v),   (10)

where a_0 is continuous, coercive, and symmetric, and a_1 is continuous, positive semi-definite (a_1(w,w) ≥ 0, ∀w ∈ X), and symmetric. This model problem (10) is rather broadly relevant, for example to variable orthotropic conductivity, variable rectilinear geometry, variable piecewise-constant conductivity, and variable Robin boundary conditions. We now suppose that the μ_n, n = 1, ..., N, are logarithmically distributed in the sense that

ln(λ̄ μ_n + 1) = ((n − 1)/(N − 1)) ln(λ̄ μ_max + 1), n = 1, ..., N,   (11)

where λ̄ is an upper bound for the maximum eigenvalue of a_1 relative to a_0. (Note λ̄ is perforce bounded thanks to our assumption of continuity and coercivity; the possibility of a continuous spectrum does not, in practice, pose any problems.) We can then prove (Maday et al. [19]) that, for N > N_crit ≡ e ln(λ̄ μ_max + 1),

‖u(μ) − u_N(μ)‖_X ≤ C e^{−(N−1)/(N_crit−1)}, ∀μ ∈ D,   (12)

where C is a constant independent of N and μ. We observe exponential convergence, uniformly (globally) for all μ in D, with only very weak (logarithmic) dependence on the range of the parameter (μ_max). (Note the constants in (12) are for the particular case in which (·,·)_X = a_0(·,·).) The proof exploits a parameter-space (nonpolynomial) interpolant as a surrogate for the Galerkin approximation. As a result, the bound is not always ''sharp:'' we observe many cases in which the Galerkin projection is considerably better than the associated interpolant; optimality (8) chooses to ''illuminate'' only certain points μ_n, automatically selecting a best ''subapproximation'' among all (combinatorially many) possibilities. We thus see why reduced-basis state-space approximation of s(μ) via u(μ) is preferred to simple parameter-space interpolation of s(μ) (''connecting the dots'') via (μ_n, s(μ_n)) pairs.
We note, however, that the logarithmic point distribution (11) implicated by our interpolant-based arguments is not simply an artifact of the proof: in numerous numerical tests, the logarithmic distribution performs considerably (and in many cases, provably) better than other more obvious candidates, in particular for large ranges of the parameter. Fortunately, the convergence rate is not too sensitive to point selection: the theory only requires a log ''on the average'' distribution (Maday et al. [19]); and, in practice, λ̄ need not be a sharp upper bound.
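To make the logarithmic distribution concrete, the following Python sketch generates points equispaced in ln(λ̄μ + 1), one plausible realization of a ''log on the average'' distribution; the values of λ̄ and μ_max below are illustrative inputs, not taken from the paper:

```python
import numpy as np

def log_points(mu_max, N, lam_bar=1.0):
    """Sample points mu_1..mu_N on [0, mu_max], equispaced in
    ln(lam_bar * mu + 1): a 'logarithmic' distribution that clusters
    points near mu = 0 and spreads them out toward mu_max.  lam_bar
    stands in for the eigenvalue upper bound; it need not be sharp."""
    n = np.arange(N)
    return (np.exp(n / (N - 1) * np.log(lam_bar * mu_max + 1.0)) - 1.0) / lam_bar

pts = log_points(mu_max=100.0, N=5)
# pts runs from 0 to mu_max, with successively larger gaps
```

The first and last points land exactly at the ends of D = [0, μ_max], and the spacing between consecutive points grows monotonically, so small parameter values, where the solution varies most rapidly in this model problem, are sampled most densely.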
The result (12) is certainly tied to the particular form (10) and associated regularity of u(μ). However, we do observe similar exponential behavior for more general operators; and, most importantly, the exponential convergence rate degrades only very slowly with increasing parameter dimension, P. We present in Table 1 the error |s(μ) − s_N(μ)|/s(μ) as a function of N, at a particular representative point μ in D, for the two-dimensional thermal fin problem of Section 2.2.1; we present similar data in Table 2 for the truss problem of Section 2.2.2. In both cases, since tensor-product grids are prohibitively profligate as P increases, the μ_n are chosen ''log-randomly'' over D: we sample from a multivariate uniform probability density on log(μ). We observe that, for both the thermal fin (P = 7) and truss (P = 4) problems, the error is remarkably small even for very small N; and that, in both cases, very rapid convergence obtains as N → ∞. We do not yet have any theory for P > 1. But certainly the Galerkin optimality plays a central role, automatically selecting ''appropriate'' scattered-data subsets of S_N and associated ''good'' weights so as to mitigate the curse of dimensionality as P increases; and the log-random point distribution is also important - for example, for the truss problem of Table 2, a non-logarithmic uniform random point distribution for S_N yields errors which are larger by factors of 20 and 10 for N = 30 and 80, respectively.

Computational Procedure.
The theoretical and empirical results of Sections 3.1 and 3.2 suggest that N may, indeed, be chosen very small. We now develop off-line/on-line computational procedures that exploit this dimension reduction.
We first express u_N(μ) as

u_N(μ) = Σ_{j=1}^{N} u_{N j}(μ) ζ_j,   (13)

where u_N(μ) ∈ R^N denotes the vector of coefficients u_{N j}(μ); we then choose for test functions v = ζ_i, i = 1, ..., N. Inserting these representations into (7) yields the desired algebraic equations for u_N(μ) ∈ R^N,

A_N(μ) u_N(μ) = F_N,   (14)

in terms of which the output can then be evaluated as s_N(μ) = F_N^T u_N(μ); here A_N(μ) ∈ R^{N×N} is the matrix with entries A_{N i,j}(μ) = a(ζ_j, ζ_i; μ), and F_N ∈ R^N is the vector with entries F_{N i} = f(ζ_i). We now invoke (2) to write A_{N i,j}(μ) = Σ_{q=1}^{Q} σ^q(μ) a^q(ζ_j, ζ_i), or

A_N(μ) = Σ_{q=1}^{Q} σ^q(μ) A_N^q,   (15)

where the μ-independent matrices A_N^q ∈ R^{N×N} are given by A_{N i,j}^q = a^q(ζ_j, ζ_i). The off-line/on-line decomposition is now clear. In the off-line stage, we compute the u(μ_n) and form the A_N^q and F_N: this requires N (expensive) ''a'' finite element solutions and O(QN^2) finite-element-vector inner products. In the on-line stage, for any given new μ, we first form A_N(μ) from (15), then solve (14) for u_N(μ), and finally evaluate s_N(μ) = F_N^T u_N(μ): this requires O(QN^2) + O((2/3)N^3) operations and O(QN^2) storage.
Thus, as required, the incremental, or marginal, cost to evaluate s_N(μ) for any given new μ - as proposed in a design, optimization, or inverse-problem context - is very small: first, because N is very small, typically O(10) - thanks to the good convergence properties of W_N; and second, because (14) can be very rapidly assembled and inverted - thanks to the off-line/on-line decomposition (see Balmes [9] for an earlier application of this strategy within the reduced-basis context). For the problems discussed in this paper, the resulting computational savings relative to standard (well-designed) finite-element approaches are significant, at least O(10), typically O(100), and often O(1000) or more.

Table 1 Error, error bound (Method I), and effectivity as a function of N, at a particular representative point μ ∈ D, for the two-dimensional thermal fin problem (compliant output)
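The whole off-line/on-line pipeline can be sketched in a few lines of Python. Everything below is a stand-in - a random symmetric positive-definite system plays the role of the truth finite element discretization, Q = 2, and σ is an illustrative coefficient choice - but the structure (N truth solves and O(QN²) projections off-line; an N×N assembly and solve on-line) mirrors (13)-(15):

```python
import numpy as np

rng = np.random.default_rng(0)
Nt, N, Q = 50, 4, 2              # truth dimension, RB dimension, affine terms

# Stand-ins for the truth discretization (hypothetical, for illustration):
M = rng.standard_normal((Nt, Nt))
Aq = [M @ M.T / Nt + np.eye(Nt),          # SPD, parameter-independent
      np.diag(np.linspace(1.0, 10.0, Nt))]
F = rng.standard_normal(Nt)
sigma = lambda mu: np.array([1.0, mu])    # illustrative sigma^q(mu)

def truth_solve(mu):
    """Expensive 'truth' solve: A(mu) u = F at full dimension Nt."""
    A = sum(s * Aq_ for s, Aq_ in zip(sigma(mu), Aq))
    return np.linalg.solve(A, F)

# Off-line: N truth solves at the sample points, then projection.
samples = [0.1, 1.0, 5.0, 10.0]
Z = np.column_stack([truth_solve(mu) for mu in samples])  # basis zeta_n
AqN = [Z.T @ Aq_ @ Z for Aq_ in Aq]       # O(Q N^2) inner products
FN = Z.T @ F

# On-line: for each new mu, assemble and solve only an N x N system.
def s_N(mu):
    AN = sum(s * AqN_ for s, AqN_ in zip(sigma(mu), AqN))
    uN = np.linalg.solve(AN, FN)
    return FN @ uN                        # compliant output F_N^T u_N

def s_truth(mu):
    return F @ truth_solve(mu)
```

At a sample point the reduced-basis output reproduces the truth output (Galerkin reproduction); at a new parameter value it returns a lower-bound approximation whose cost is independent of the truth dimension Nt.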

A Posteriori Error Estimation: Output Bounds
From Section 3 we know that, in theory, we can obtain s_N(μ) very inexpensively: the on-line computational effort scales as O((2/3)N^3) + O(QN^2); and N can, in theory, be chosen quite small. However, in practice, we do not know how small N can be chosen: this will depend on the desired accuracy, the selected output(s) of interest, and the particular problem in question; in some cases N = 5 may suffice, while in other cases, N = 100 may still be insufficient. In the face of this uncertainty, either too many or too few basis functions will be retained: the former results in computational inefficiency; the latter in unacceptable uncertainty - particularly egregious in the decision contexts in which reduced-basis methods typically serve. We thus need a posteriori error estimators for s_N. Surprisingly, a posteriori error estimation has received relatively little attention within the reduced-basis framework (Noor and Peters [8]), even though reduced-basis methods are particularly in need of accuracy assessment: the spaces are ad hoc and pre-asymptotic, thus admitting relatively little intuition, ''rules of thumb,'' or standard approximation notions.
Recall that, in this section, we continue to assume that a is coercive and symmetric, and that ℓ is ''compliant.''

Method I.
The approach described in this section is a particular instance of a general ''variational'' framework for a posteriori error estimation of outputs of interest. However, the reduced-basis instantiation described here differs significantly from earlier applications to finite element discretization error (Maday et al. [20], Machiels et al. [21]) and iterative solution error (Patera and Rønquist [22]), both in the choice of (energy) relaxation and in the associated computational artifice.

Formulation.
We assume that we are given a positive function g(μ): D → R_+ and a continuous, coercive, symmetric (μ-independent) bilinear form â: X × X → R, such that

α̂_0 ‖v‖_X^2 ≤ g(μ) â(v,v) ≤ a(v,v;μ), ∀v ∈ X, ∀μ ∈ D,   (16)

for some positive real constant α̂_0. We then find ê(μ) ∈ X such that

g(μ) â(ê(μ), v) = R(v; u_N(μ); μ), ∀v ∈ X,   (17)

where, for a given w ∈ X, R(v; w; μ) = ℓ(v) − a(w,v;μ) is the weak form of the residual. Our lower and upper output estimators are then evaluated as

s_N^−(μ) ≡ s_N(μ), s_N^+(μ) ≡ s_N(μ) + Δ_N(μ),   (18)

where

Δ_N(μ) ≡ g(μ) â(ê(μ), ê(μ))   (19)

is the estimator gap.

Properties.
We shall prove in this section that s_N^−(μ) ≤ s(μ) ≤ s_N^+(μ), and hence that |s(μ) − s_N(μ)| = s(μ) − s_N(μ) ≤ Δ_N(μ). Our lower and upper output estimators are thus lower and upper output bounds; and our output estimator gap is thus an output bound gap - a rigorous bound for the error in the output of interest. It is also critical that Δ_N(μ) be a relatively sharp bound for the true error: a poor (overly large) bound will encourage us to refine an approximation which is, in fact, already adequate - with a corresponding (unnecessary) increase in off-line and on-line computational effort. We shall prove in this section that Δ_N(μ) ≤ (γ_0/α̂_0)(s(μ) − s_N(μ)), where γ_0 and α̂_0 are the N-independent a-continuity and g(μ)â-coercivity constants defined earlier. Our two results of this section can thus be summarized as

1 ≤ η_N(μ) ≤ Const, ∀N,   (20)

where η_N(μ) ≡ Δ_N(μ)/(s(μ) − s_N(μ)) is the effectivity, and Const is a constant independent of N. We shall denote the left (bounding property) and right (sharpness property) inequalities of (20) as the lower effectivity and upper effectivity inequalities, respectively.

We first prove the lower effectivity inequality (bounding property): s_N^−(μ) ≤ s(μ) ≤ s_N^+(μ), ∀μ ∈ D, for s_N^−(μ) and s_N^+(μ) defined in (18). The lower bound property follows directly from Section 3.2.1. To prove the upper bound property, we first observe that R(v; u_N(μ); μ) = a(e(μ), v; μ), ∀v ∈ X, where e(μ) ≡ u(μ) − u_N(μ); the ''modified error'' ê(μ) of (17) thus satisfies g(μ) â(ê(μ), v) = a(e(μ), v; μ), ∀v ∈ X. It follows that

g(μ) â(ê(μ), ê(μ)) ≥ a(e(μ), e(μ); μ),   (22)

since g(μ) â(ê(μ) − e(μ), ê(μ) − e(μ)) ≥ 0 and a(e(μ), e(μ); μ) − g(μ) â(e(μ), e(μ)) ≥ 0 from (16). Invoking (9) and (22), we then obtain s(μ) − s_N(μ) = a(e(μ), e(μ); μ) ≤ g(μ) â(ê(μ), ê(μ)); and thus s(μ) ≤ s_N(μ) + g(μ) â(ê(μ), ê(μ)) ≡ s_N^+(μ), as desired.
The off-line/on-line decomposition should now be clear. In the off-line stage we compute ẑ_0 and ẑ_j^q, j = 1, ..., N, q = 1, ..., Q, and then form c_0, Λ_j^q, and Γ_{jj'}^{qq'}: this requires QN + 1 (expensive) ''â'' finite element solutions, and O(Q^2 N^2) finite-element-vector inner products. In the on-line stage, for any given new μ, we evaluate s_N^+ as expressed in (25): this requires O(Q^2 N^2) operations and O(Q^2 N^2) storage (for c_0, Λ_j^q, and Γ_{jj'}^{qq'}). As for the computation of s_N(μ), the marginal cost for the computation of s_N^±(μ) for any given new μ is quite small - in particular, it is independent of the dimension of the truth finite element approximation space X.
There are a variety of ways in which the off-line/on-line decomposition and output error bounds can be exploited. A particularly attractive mode incorporates the error bounds into an on-line adaptive process, in which we successively approximate s_N(μ) on a sequence of approximation spaces W_{N_j'} ⊂ W_N, N_j' = N_0 2^j - for example, W_{N_j'} may contain the N_j' sample points of S_N closest to the new μ of interest - until Δ_{N_j'} is less than a specified error tolerance. This procedure both minimizes the on-line computational effort and reduces conditioning problems - while simultaneously ensuring accuracy and certainty.
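The adaptive on-line strategy can be sketched generically in Python; the callables s and delta below are hypothetical stand-ins for the reduced-basis output and its bound gap at basis size N, and the decay rate in the demonstration is illustrative only:

```python
def adaptive_output(s, delta, N0, N_max, tol):
    """On-line adaptivity (sketch): try N0, 2*N0, 4*N0, ... basis
    functions until the error bound delta(N) meets the tolerance.
    s(N) and delta(N) return the RB output and bound gap for size N."""
    N = N0
    while N < N_max and delta(N) > tol:
        N *= 2
    return s(N), N

# Illustrative stand-ins with an exponentially decaying bound gap:
out, N_used = adaptive_output(s=lambda N: 1.0 - 2.0 ** (-N),
                              delta=lambda N: 2.0 ** (-N),
                              N0=4, N_max=64, tol=1e-3)
```

Because the bound gap is rigorous, the loop stops at the smallest tried N that is certified accurate, rather than at an N fixed conservatively in advance.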
The essential advantage of the approach described in this section is the guarantee of rigorous bounds. There are, however, certain disadvantages. The first set of disadvantages relates to the choice of g(μ) and â. In many cases, simple inspection suffices: for example, in our thermal fin problem of Section 2.2.1, g(μ) = min_{q=1,...,Q} σ^q(μ) and â(w,v) = Σ_{q=1}^{Q} a^q(w,v) yields the very good effectivities summarized in Table 1. In other cases, however, there is no self-evident (or readily computed, Maday et al. [23]) good choice: for example, for the truss problem of Section 2.2.2, the existence of almost-pure rotations renders g(μ) very small relative to γ(μ), with corresponding detriment to η_N(μ). The second set of disadvantages relates to the computational expense - the O(Q) off-line and the O(Q^2) on-line scaling induced by (24) and (25), respectively. Both of these disadvantages are eliminated in the ''Method II'' to be discussed in the next section; however, ''Method II'' only provides asymptotic bounds as N → ∞. The choice thus depends on the relative importance of absolute certainty and computational efficiency.

Method II.
As already indicated, Method I has certain limitations; we discuss here a Method II which addresses these limitations, albeit at the loss of complete certainty.
To begin, we choose M > N, and define our lower and upper output estimators as

s_{N,M}^−(μ) ≡ s_N(μ), s_{N,M}^+(μ) ≡ s_N(μ) + Δ_{N,M}(μ),

where Δ_{N,M}(μ), the estimator bound gap, is given by

Δ_{N,M}(μ) ≡ (1/τ)(s_M(μ) − s_N(μ))

for some τ ∈ (0,1). The effectivity of the approximation is defined as

η_{N,M}(μ) ≡ Δ_{N,M}(μ)/(s(μ) − s_N(μ)).

For our purposes here, we shall consider M = 2N.

Properties.
As for Method I, we would like to prove the effectivity inequality 1 ≤ η_{N,2N}(μ) ≤ Const, ∀N. However, we will only be able to demonstrate an asymptotic form of this inequality. Furthermore, the latter shall require, and we shall make, the hypothesis that

ε_{N,2N}(μ) ≡ (s(μ) − s_{2N}(μ))/(s(μ) − s_N(μ)) → 0 as N → ∞.   (29)

We note that the assumption (29) is certainly plausible: if our a priori bound of (12) in fact reflects asymptotic behavior, then s(μ) − s_N(μ) ~ c_1 e^{−c_2 N}, s(μ) − s_{2N}(μ) ~ c_1 e^{−2c_2 N}, and hence ε_{N,2N}(μ) ~ e^{−c_2 N} → 0, as desired.

We first prove the lower effectivity inequality (bounding property): s_{N,2N}^−(μ) ≤ s(μ) ≤ s_{N,2N}^+(μ), as N → ∞. To demonstrate the lower bound we again appeal to (9) and the coercivity of a; indeed, this result (still) obtains for all N. To demonstrate the upper bound, we write

s_{N,2N}^+(μ) − s(μ) = s_N(μ) + (1/τ)(s_{2N}(μ) − s_N(μ)) − s(μ) = (s(μ) − s_N(μ)) [(1/τ)(1 − ε_{N,2N}(μ)) − 1].   (31)

We now recall that s(μ) − s_N(μ) ≥ 0, and that 0 < τ < 1 - that is, 1/τ > 1; it then follows from (31) and our hypothesis (29) that there exists a finite N* such that s_{N,2N}^+(μ) ≥ s(μ), ∀N ≥ N*. This concludes the proof: we obtain asymptotic bounds.

We now prove the upper effectivity inequality (sharpness property). From the definitions of ε_{N,2N}(μ), Δ_{N,2N}(μ), and η_{N,2N}(μ), we directly obtain

η_{N,2N}(μ) = (1/τ)(1 − ε_{N,2N}(μ)).

It is readily shown that η_{N,2N}(μ) is bounded from above by 1/τ for all N: we know from (9) that ε_{N,2N}(μ) is strictly non-negative.
It can also readily be shown that η_{N,2N}(μ) is non-negative: since W_N ⊂ W_{2N}, it follows from (8) (for (·,·)_X = a(·,·;μ)) and (9) that s(μ) ≥ s_{2N}(μ) ≥ s_N(μ), and hence ε_{N,2N}(μ) ≤ 1. We thus conclude that 0 ≤ η_{N,2N}(μ) ≤ 1/τ for all N. Furthermore, from our hypothesis on ε_{N,2N}(μ), (29), we know that η_{N,2N}(μ) will tend to 1/τ as N increases. The essential approximation enabler is exponential convergence: we obtain bounds even for rather small N and relatively large τ. We thus achieve both ''near'' certainty and good effectivities. We demonstrate this claim in Table 2, in which we present the bound gap and effectivity for our truss example of Section 2.2.2; the results tabulated correspond to the choice τ = 1/2. We clearly obtain bounds for all N; and we observe that η_{N,2N}(μ) does, indeed, rather quickly approach 1/τ.

Computational Procedure.
The off-line/on-line decomposition of Section 3.3 applies directly, now with 2N rather than N basis functions: the on-line stage requires O(Q(2N)^2) + O((2/3)(2N)^3) operations and O(4QN^2) storage. The on-line effort for this Method II predictor/error estimator procedure (based on s_N(μ) and s_{2N}(μ)) will thus require eightfold more operations than the ''predictor-only'' procedure of Section 3.

Method II is in some sense very naïve: we simply replace the true output s(μ) with a finer-approximation surrogate s_{2N}(μ). (There are more obscure ways to describe the method - in terms of a reduced-basis approximation for the error - however, there is little to be gained from these alternative interpretations.) The essential computational enabler is again exponential convergence, which permits us to choose M = 2N - hence controlling the additional computational effort attributable to error estimation - while simultaneously ensuring that ε_{N,2N}(μ) tends rapidly to zero. Exponential convergence also ensures that the cost to compute both s_N(μ) and s_{2N}(μ) is ''negligible.'' In actual practice, since s_{2N}(μ) is available, we can of course take s_{2N}(μ), rather than s_N(μ), as our output prediction; this greatly improves not only accuracy, but also certainty - Δ_{N,2N}(μ) is almost surely a bound for s(μ) − s_{2N}(μ), albeit an exponentially conservative bound as N tends to infinity.
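In code, the Method II estimators amount to a few lines; the numerical values below are hypothetical stand-ins (with an assumed true output s(μ) = 1.0), not values from Table 2:

```python
def method2_bounds(sN, s2N, tau=0.5):
    """Method II (sketch): with 0 < tau < 1, the bound gap is
    Delta = (s_2N - s_N) / tau; the lower and upper estimators are
    s_N and s_N + Delta.  These are bounds only asymptotically in N,
    once the coarse/fine error ratio epsilon has dropped below 1 - tau."""
    delta = (s2N - sN) / tau
    return sN, sN + delta, delta

# Hypothetical RB outputs at sizes N and 2N, for which the true s(mu) = 1.0:
lo, hi, gap = method2_bounds(sN=0.96, s2N=0.99, tau=0.5)
```

Here the cheap surrogate pair (s_N, s_2N) = (0.96, 0.99) yields the interval [0.96, 1.02], which indeed brackets the assumed true value 1.0; no truth solve is ever required on-line.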

Noncompliant Outputs and Nonsymmetric Operators.
In Sections 3 and 4 we formulate the reduced-basis method and associated error estimation procedure for the case of compliant outputs, ℓ(v) = f(v), ∀v ∈ X. We briefly summarize here the formulation and theory for more general linear bounded output functionals; moreover, the assumption of symmetry (but not yet coercivity) is relaxed, permitting treatment of a wider class of problems - a representative example is the convection-diffusion equation, in which the presence of the convective term renders the operator nonsymmetric. We first present the reduced-basis approximation, now involving a dual or adjoint problem; we then formulate the associated a posteriori error estimators; and we conclude with a few illustrative results.
As a preliminary, we first generalize the abstract formulation of Section 2.1. As before, we define the ''primal'' problem as in (4); however, we of course no longer require symmetry. But we also introduce an associated adjoint or ''dual'' problem: for any μ ∈ D, find ψ(μ) ∈ X such that

a(v, ψ(μ); μ) = −ℓ(v), ∀v ∈ X;

recall that ℓ(v) is our output functional.
For any μ ∈ D, our reduced-basis approximation is then obtained by standard Galerkin projection onto W_N (though for highly nonsymmetric operators minimum-residual and Petrov-Galerkin projections are attractive - stabler - alternatives). To wit, for the primal problem, we find u_N(μ) ∈ W_N such that a(u_N(μ), v; μ) = f(v), ∀v ∈ W_N; and for the adjoint problem, we define (though, unless otherwise indicated, do not compute) ψ_N(μ) ∈ W_N such that a(v, ψ_N(μ); μ) = −ℓ(v), ∀v ∈ W_N. The reduced-basis output approximation is then calculated from s_N(μ) = ℓ(u_N(μ)).
Turning now to the a priori theory, it follows from standard arguments that u_N(μ) and ψ_N(μ) are ''optimal'' in the sense that

‖u(μ) − u_N(μ)‖_X ≤ (1 + γ(μ)/α(μ)) inf_{w_N ∈ W_N} ‖u(μ) − w_N‖_X,

and similarly for ‖ψ(μ) − ψ_N(μ)‖_X. The best approximation analysis is then similar to that presented in Section 3.2. As regards our output, we now have

|s(μ) − s_N(μ)| = |a(u(μ) − u_N(μ), ψ(μ) − ψ_N(μ); μ)| ≤ γ(μ) ‖u(μ) − u_N(μ)‖_X ‖ψ(μ) − ψ_N(μ)‖_X

from Galerkin orthogonality, the definition of the primal and the adjoint problems, and the Cauchy-Schwarz inequality. We now understand why we include the ψ(μ_n) in W_N: to ensure that ‖ψ(μ) − ψ_N(μ)‖_X is small. We thus recover the ''square'' effect in the convergence rate of the output, albeit (and unlike the symmetric case) at the expense of some additional computational effort - the inclusion of the ψ(μ_n) in W_N; typically, even for the very rapidly convergent reduced-basis approximation, the ''fixed error-minimum cost'' criterion favors the adjoint enrichment.

For simplicity of exposition (and to a certain extent, implementation), we present here the ''integrated'' primal-dual approximation space. However, there are significant computational and conditioning advantages associated with a ''nonintegrated'' approach, in which we introduce separate primal (u(μ_n)) and dual (ψ(μ_n)) approximation spaces for u(μ) and ψ(μ), respectively. Note in the ''nonintegrated'' case we are obliged to compute ψ_N(μ), since to preserve the output error ''square effect'' we must modify our predictor with a residual correction, f(ψ_N(μ)) − a(u_N(μ), ψ_N(μ); μ) (Maday et al. [23]). Both the ''integrated'' and ''nonintegrated'' approaches admit an off-line/on-line decomposition similar to that described in Section 3.3 for the compliant, symmetric problem; as before, the on-line complexity and storage are independent of the dimension of the very fine (''truth'') finite element approximation.
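The effect of the residual correction can be seen in a small Python sketch. The 2×2 nonsymmetric system and the injected approximation errors are hypothetical, chosen only to illustrate that the corrected predictor ℓ(u_N) − [f(ψ_N) − a(u_N, ψ_N; μ)] is more accurate than ℓ(u_N) alone:

```python
import numpy as np

# Hypothetical discrete stand-ins: primal A u = f, dual A^T psi = -l,
# output s = l^T u (A is nonsymmetric, so the dual must be solved separately).
A = np.array([[4.0, 1.0], [0.0, 3.0]])
f = np.array([1.0, 2.0])
l = np.array([0.5, 1.0])

u = np.linalg.solve(A, f)
psi = np.linalg.solve(A.T, -l)
s = l @ u                                  # true output

# Crude 'reduced-basis' approximations: exact solutions plus injected errors.
uN = u + np.array([0.05, -0.02])
psiN = psi + np.array([0.01, 0.02])

plain = l @ uN                             # uncorrected predictor l(u_N)
# Residual correction: subtract f(psi_N) - a(u_N, psi_N).
corrected = plain - (f @ psiN - psiN @ (A @ uN))
```

The corrected predictor's error is a(u − u_N, ψ − ψ_N; μ), quadratic in the two approximation errors - the ''square effect'' - whereas the plain predictor's error is linear in u − u_N.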

Method I A Posteriori Error Estimators.
We extend here the method developed in Section 4.1.2 to the more general case of noncompliant and nonsymmetric problems. We begin with the formulation.
Finally, turning to computational issues, we note that the off-line/on-line decomposition described in Section 4.1 for compliant, symmetric problems directly extends to the noncompliant, nonsymmetric case, except that we must compute the norm of both the primal and dual ''modified errors,'' with a concomitant doubling of computational effort.
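The two modified-error solves, and the doubling of effort they imply, can be sketched as follows. This is a hedged illustration: the combination of the two â-energy norms below is a natural Cauchy-Schwarz form, and the precise constants of Section 4.1 are not reproduced; all matrices are hypothetical:

```python
import numpy as np

def method1_gap(A_mu, Ahat, g, f, l, uN, psiN):
    """Bound gap from the primal and dual ''modified errors'': solve
    g(mu) ahat(e, v) = residual(v) for each, then combine the two
    ahat-energy norms (hedged sketch, not the paper's exact constants)."""
    e_pr = np.linalg.solve(g * Ahat, f - A_mu @ uN)        # primal modified error
    e_du = np.linalg.solve(g * Ahat, -l - A_mu.T @ psiN)   # dual modified error
    return g * np.sqrt(e_pr @ Ahat @ e_pr) * np.sqrt(e_du @ Ahat @ e_du)

# with the exact choices ahat = a, g = 1 (symmetric case), Cauchy-Schwarz
# guarantees the gap bounds the error in the residual-corrected output
n = 8
rng = np.random.default_rng(2)
B = rng.standard_normal((n, n)); A = B @ B.T + n * np.eye(n)
f = rng.standard_normal(n); l = rng.standard_normal(n)
Z = np.linalg.qr(rng.standard_normal((n, 3)))[0]
uN = Z @ np.linalg.solve(Z.T @ A @ Z, Z.T @ f)
psiN = Z @ np.linalg.solve(Z.T @ A @ Z, -(Z.T @ l))
s_corr = l @ uN - psiN @ (f - A @ uN)        # residual-corrected output
s_exact = l @ np.linalg.solve(A, f)
gap = method1_gap(A, A, 1.0, f, l, uN, psiN)
assert gap >= abs(s_exact - s_corr) - 1e-12
```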

Method II A Posteriori Error Estimators.
We discuss here the extension of Method II of Section 4.2 to noncompliant outputs and nonsymmetric operators.
To begin, we set M > N, M even, and introduce a parameter τ ∈ (0,1). The bound gap is Δ_N(μ) ≡ (1/τ)|s_M(μ) − s_N(μ)|, and the effectivity of the approximation is defined as η_{N,M}(μ) ≡ Δ_N(μ)/|s(μ) − s_N(μ)|. We shall again only consider M = 2N. As in Section 4.2, we would like to prove that 1 ≤ η_{N,2N}(μ) ≤ Const for sufficiently large N; and, as in Section 4.2, we must again make the hypothesis (29). We first consider the lower effectivity inequality (bounding property). In particular, simple algebraic manipulations yield η_{N,2N}(μ) = (1/τ)|1 − (s(μ) − s_{2N}(μ))/(s(μ) − s_N(μ))|; the desired result, η_{N,2N}(μ) ≥ 1, then directly follows from our hypothesis (29) and the range of τ. The proof for the upper effectivity inequality (sharpness property) parallels the derivation of Section 4.2.2; in particular, from our hypothesis (29) we may conclude that η_{N,2N}(μ) → 1/τ as N → ∞. Note that in the noncompliant, nonsymmetric case we can make no stronger statement. We demonstrate our effectivity claims in Table 3, in which we present the error, bound gap, and effectivity for the noncompliant output (s_2(μ), average stress) of the truss example of Section 2.2.2; the results tabulated correspond to the choice τ = 1/2. We clearly obtain bounds for all N; and the effectivity rather quickly approaches 1/τ (for N ≥ 120, η_{N,2N}(μ) remains fixed at 1/τ = 2.0).
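In code, the Method II construction is just arithmetic on two reduced-basis outputs. The numbers below are hypothetical, and the gap form Δ_N(μ) = (1/τ)|s_2N(μ) − s_N(μ)| centered at s_N(μ) is the natural noncompliant analogue of Section 4.2:

```python
def method2_bounds(s_N, s_M, tau):
    """Method II lower/upper output bounds from a coarse (N) and a fine
    (M = 2N) reduced-basis output; tau lies in (0,1)."""
    gap = abs(s_M - s_N) / tau
    return s_N - gap, s_N + gap

# hypothetical outputs: s = 1.000 (truth), s_N = 0.980, s_2N = 0.999
lo, hi = method2_bounds(0.980, 0.999, 0.5)
assert lo <= 1.000 <= hi                      # bounding property
eff = ((hi - lo) / 2) / abs(1.000 - 0.980)    # effectivity, here close to 1/tau
assert 1.0 <= eff <= 2.0
```

The cost is two on-line output evaluations (at N and 2N) and no residual computation, which is why Method II is attractive whenever a rigorous g(μ), â pair is unavailable.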

Eigenvalue Problems.
We next consider the extension of our approach to symmetric positive-definite eigenvalue problems. The eigenvalues of appropriately defined partial-differential-equation eigenproblems convey critical information about a physical system: in linear elasticity, the critical buckling load; in dynamic analysis of structures, the resonant modes; in conduction heat transfer, the equilibrium timescales. Solution of large-scale eigenvalue problems is computationally intensive: the reduced-basis method is thus very attractive.
Here a is the continuous, coercive, symmetric form introduced earlier, and m is (say) the L2 inner product over Ω. The assumptions on a and m imply that the eigenvalues λ_i(μ) will be real and positive. We order the eigenvalues (and corresponding eigenfunctions u_i) such that 0 < λ_1(μ) < λ_2(μ) ≤ . . . ; we shall assume that λ_1(μ) and λ_2(μ) are distinct. We suppose that our output of interest is the minimum eigenvalue, s(μ) = λ_1(μ); other outputs may also be considered.
Following Machiels et al. [24], we present here a reduced-basis predictor and a Method I error estimator for symmetric positive-definite eigenvalue problems; we also briefly describe the simpler Method II approach.
The formulation admits an on-line/off-line decomposition (Machiels et al. [24]) very similar to the approach described for equilibrium problems in Section 3.
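The reduced-basis eigenvalue predictor is a Rayleigh-Ritz projection of the two forms onto W_N; the sketch below (all matrices hypothetical) also illustrates that, by the min-max principle, λ_{N,1}(μ) ≥ λ_1(μ), so the Galerkin value itself serves as the upper estimator:

```python
import numpy as np

def rb_min_eigenvalue(A_mu, M, Z):
    """Rayleigh-Ritz: project a(.,.;mu) and m(.,.) onto W_N = span(Z) and
    solve the small generalized eigenproblem A_N x = lambda M_N x via a
    Cholesky reduction.  All matrices here are hypothetical stand-ins."""
    A_N, M_N = Z.T @ A_mu @ Z, Z.T @ M @ Z
    L = np.linalg.cholesky(M_N)
    Li = np.linalg.inv(L)
    return np.linalg.eigvalsh(Li @ A_N @ Li.T)[0]   # smallest Ritz value

# the reduced value bounds the true minimum eigenvalue from above
n = 10
rng = np.random.default_rng(3)
B = rng.standard_normal((n, n))
A = B @ B.T + np.eye(n)                       # SPD ''stiffness'' form a
M = np.diag(rng.uniform(0.5, 2.0, n))         # SPD ''mass'' form m
lam1 = rb_min_eigenvalue(A, M, np.eye(n))     # full space: the true lambda_1
Z = np.linalg.qr(rng.standard_normal((n, 4)))[0]
assert rb_min_eigenvalue(A, M, Z) >= lam1 - 1e-12
```

Here the basis Z would in practice contain eigenfunction snapshots u_1(μ_n); the lower estimator of Method I then supplies the other side of the bound.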

Method I A Posteriori Error Estimators.
As before, we assume that we are given a positive function g(μ): D → R_+ and a continuous, coercive, symmetric bilinear form â(w,v): X × X → R that satisfy the inequality (16). We then find ê(μ) ∈ X such that g(μ)â(ê(μ), v) equals, for all v ∈ X, the eigenproblem equivalent of the residual. We then evaluate our estimators from ê(μ), λ_{N,1}(μ), and δ(μ), where δ(μ) = 1 − λ_{N,1}(μ)/λ_{N,2}(μ) lies in (0,1). The effectivity is defined as η_N(μ) = Δ_N(μ)/(λ_{N,1}(μ) − λ_1(μ)).
We now consider the lower and upper effectivity inequalities. As regards the lower effectivity inequality (bounding property), we of course obtain s_N^+(μ) ≥ λ_1(μ), ∀N. The difficult result is the lower bound: it can be proven (Machiels et al. [24]) that there exists an N*(S_{N/2}, μ) such that s_N^−(μ) ≤ λ_1(μ), ∀N > N*. In practice, N* = 1, due to the good (theoretically motivated) choice for δ(μ); there is thus very little uncertainty in our (asymptotic) bounds. We also prove in Machiels et al. [24] a result related to the upper effectivity inequality (sharpness property); in practice, very good effectivities are obtained. To demonstrate these claims we consider the eigenvalue problem associated with (the homogeneous version of) our two-dimensional thermal fin example of Section 2.2.1. We present in Table 4 the error, error bound, and effectivity as a function of N at a particular point μ ∈ D. We observe rapid convergence, bounds for all N, and good effectivities. Finally, we note that our output estimator admits an off-line/on-line decomposition similar to that for equilibrium problems; the additional terms in (56) are readily treated through our affine expansion/linear superposition procedure.

Further Generalizations.
In this section we briefly describe several additional extensions of the methodology. In each case we focus on the essential new ingredient; further details (in most cases) may be found in the referenced literature.

Noncoercive Linear Operators.
The archetypical noncoercive linear equation is the Helmholtz, or reduced-wave, equation; many applications of this equation (e.g., inverse scattering) arise, for example, in acoustics and electromagnetics. The essential new mathematical ingredient is the loss of coercivity of a. In particular, well-posedness is now ensured only by the inf-sup condition: there exist positive β_0 and β(μ) such that

β(μ) ≡ inf_{w ∈ X} sup_{v ∈ X} a(w, v; μ)/(‖w‖_X ‖v‖_X) ≥ β_0.

Two numerical difficulties arise due to this ''weaker'' stability condition. The first difficulty is preservation of the inf-sup stability condition for finite-dimensional approximation spaces. To wit, although in the coercive case restriction to the space W_N actually increases stability, in the noncoercive case restriction to the space W_N can easily decrease stability: the relevant supremizers may not be adequately represented. Loss of stability can, in turn, lead to poor approximations, since the inf-sup parameter enters in the denominator of the a priori convergence result. The second numerical difficulty is estimation of the inf-sup parameter, which for noncoercive problems plays the role of g(μ) in Method I a posteriori error estimation techniques. In particular, β(μ) can not typically be deduced analytically, and thus must be evaluated (via an eigenvalue formulation) as part of the reduced-basis approximation. Our resolution of both these difficulties involves two elements (Maday et al. [23]): first, we consider projections other than standard Galerkin; and second, we consider ''enriched'' approximation spaces.
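The eigenvalue formulation for β(μ) can be sketched discretely as follows; the operator matrix and the Gram matrix of the X inner product are hypothetical, and β(μ)² is the smallest eigenvalue of the pencil (AᵀX⁻¹A, X):

```python
import numpy as np

def inf_sup_beta(A_mu, Xg):
    """Discrete inf-sup parameter
        beta(mu) = inf_w sup_v a(w, v; mu) / (|w|_X |v|_X),
    computed from the generalized eigenproblem A.T X^{-1} A w = beta^2 X w;
    Xg is the (hypothetical) SPD Gram matrix of the X inner product."""
    S = A_mu.T @ np.linalg.solve(Xg, A_mu)
    L = np.linalg.cholesky(Xg)
    Li = np.linalg.inv(L)
    return np.sqrt(np.linalg.eigvalsh(Li @ S @ Li.T)[0])

# sanity check: with X = I, beta(mu) is the smallest singular value of A(mu)
A = np.array([[2.0, 0.0], [0.0, 0.5]])
assert np.isclose(inf_sup_beta(A, np.eye(2)), 0.5)
```

Since this solve lives at the truth level, the on-line-efficient variants below either precompute the relevant pieces via the affine expansion or, in Method II, dispense with β(μ) estimation altogether.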
In one approach (Maday et al. [23]), we pursue a minimum-residual projection: the (low-dimensional) infimizing space contains both the solution u and also the inf-sup infimizer at the μ_n sample points; and the (high-dimensional) supremizing space is taken to be X. Stability is ensured and rigorous (sharp) error bounds are obtained, though technically the bounds are only asymptotic due to the approximation of the inf-sup parameter; and, despite the presence of X, the on-line complexity remains independent of the dimension of X: as in Section 3.3, we exploit affine parameter dependence and linear superposition to precompute the necessary inversions. In a second suite of much simpler and more general approaches (see Maday et al. [23] for one example in the symmetric case), we exploit minimum-residual or Petrov-Galerkin projections with infimizer-supremizer enriched, but still very low-dimensional, infimizing and supremizing spaces. Plausible but not yet completely rigorous arguments, and empirical evidence, suggest that stability is ensured and that rigorous asymptotic (and sharp) error bounds are obtained.
In Maday et al. [23] we focus entirely on Method I a posteriori error estimator procedures; but Method II techniques are also appropriate. In particular, Method II approaches do not require accurate estimation of the inf-sup parameter; we thus need be concerned only with stability in designing our reduced-basis spaces.

Parabolic Partial Differential Equations.
The next extension considered is the treatment of parabolic partial differential equations of the form m(u_t, v; μ) + a(u, v; μ) = f(v); typical examples are time-dependent problems such as unsteady heat conduction, the ''heat'' or ''diffusion'' equation. The essential new ingredient is the presence of the time variable, t.
The reduced-basis approximation and error estimator procedures are similar to those for noncompliant, nonsymmetric problems, except that we now include the time variable as an additional parameter. Thus, as in certain other time-domain model-order-reduction methods (Antoulas and Sorensen [25], Sirovich and Kirby [26]), the basis functions are ''snapshots'' of the solution at selected time instants; however, in our case, we construct an ensemble of such series corresponding to different points in the non-time parameter domain D. For rapid convergence of the output approximation, the solutions to an adjoint problem, which evolves backward in time, must also be included in the reduced-basis space.
Many possible choices are available for the temporal discretization. The most appropriate method, although not the only choice, is the discontinuous Galerkin method (Machiels et al. [27]). The variational origin of the discontinuous Galerkin approach leads naturally to rigorous output bounds for Method I a posteriori error estimators; the Method II approach is also directly applicable. Under our affine assumption, off-line/on-line decompositions can be readily crafted; the complexity of the on-line stage (calculation of the output predictor and associated bound gap) is, as before, independent of the dimension of X.
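Snapshot generation over the time levels can be sketched with a backward Euler march, used here as a simple stand-in for the discontinuous Galerkin-in-time scheme of Machiels et al. [27]; all matrices are hypothetical:

```python
import numpy as np

def time_snapshots(A_mu, M, f, dt, nsteps):
    """March the semidiscrete system m(u_t, v; mu) + a(u, v; mu) = f(v)
    by backward Euler: (M + dt*A) u^{k+1} = M u^k + dt*f.  Each column is
    a ''snapshot'' at one time level; in the reduced-basis context an
    ensemble of such series is collected over several parameter points."""
    u = np.zeros(len(f))
    K = M + dt * A_mu
    cols = []
    for _ in range(nsteps):
        u = np.linalg.solve(K, M @ u + dt * f)
        cols.append(u.copy())
    return np.column_stack(cols)

# sanity check: the marched solution approaches the steady state A u = f
A = 2.0 * np.eye(3); M = np.eye(3); f = np.ones(3)
S = time_snapshots(A, M, f, dt=0.5, nsteps=60)
assert np.allclose(S[:, -1], 0.5)
```

The full training ensemble would concatenate such series over a parameter sample, e.g., column-stacking time_snapshots(A(μ), M, f, dt, K) for each μ in the sample.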

Locally Nonaffine Parameter Dependence.
An important restriction of our methods is the assumption of affine parameter dependence. Although many property, boundary-condition, load, and even geometry variations can indeed be expressed in the required form (2) for reasonably small Q, there are many problems, for example general boundary shape variations, which do not admit such a representation. One simple approach to the treatment of this more difficult class of nonaffine problems is (i) in the off-line stage, store the ζ_n ≡ u(μ_n), and (ii) in the on-line stage, directly evaluate the reduced-basis stiffness matrix entries as a(ζ_j, ζ_i; μ). Unfortunately, the operation count (respectively, storage) for the on-line stage will now scale as O(N² dim(X)) (respectively, O(N dim(X))), where dim(X) is the dimension of the truth (very fine) finite element approximation space: the resulting method may no longer be competitive with advanced iterative techniques; and, in any event, ''real-time'' response may be compromised.
We prefer an approach which is slightly less general but potentially much more efficient. In particular, we note that in many cases, for example boundary geometry modification, the nonaffine parametric dependence can be restricted to a small subdomain Ω_II of Ω. We can then express our bilinear form a as an affine/nonaffine sum,

a(w, v; μ) = a^I(w, v; μ) + a^II(w, v; μ). (59)

Here a^I, defined over Ω_I, the majority of the domain, is affinely dependent on μ; and a^II, defined over Ω_II, a small portion of the domain, is not affinely dependent on μ. It immediately follows that the reduced-basis stiffness matrix can be expressed as the sum of two stiffness matrices corresponding to contributions from a^I and a^II, respectively; that the stiffness matrix associated with a^I admits the usual on-line/off-line decomposition described in Section 3.3; and that the stiffness matrix associated with a^II requires storage (and inner-product evaluation) only of ζ_i|_{Ω_II} (ζ_i restricted to Ω_II). The nonaffine contribution to the on-line computational complexity thus scales only as O(N² dim(X|_{Ω_II})), where dim(X|_{Ω_II}) refers (in practice) to the number of finite-element nodes located within Ω_II, often extremely small. We thus recover a method that is (almost) independent of dim(X), though clearly the on-line code will be more complicated than in the purely affine case.
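The affine/nonaffine split (59) translates directly into an on-line assembly routine; below is a sketch under hypothetical data, with the nonaffine block touching only the degrees of freedom in Ω_II:

```python
import numpy as np

def reduced_stiffness(thetas, AqN_list, Z_II, A_II_mu):
    """On-line reduced stiffness for a = a^I + a^II:
    - affine part: weighted sum of precomputed N x N blocks, cost O(Q N^2);
    - nonaffine part: assembled from the snapshots restricted to Omega_II,
      cost O(N^2 dim(X|Omega_II)).  All data here are hypothetical."""
    A_N = sum(th * Aq for th, Aq in zip(thetas, AqN_list))
    return A_N + Z_II.T @ (A_II_mu @ Z_II)

# consistency check against direct (dim(X)-dependent) assembly
n, N, nII = 6, 3, 2
rng = np.random.default_rng(4)
A1, A2 = rng.standard_normal((n, n)), rng.standard_normal((n, n))
A_II_small = rng.standard_normal((nII, nII))     # a^II lives on the last 2 dofs
A_II_full = np.zeros((n, n)); A_II_full[-nII:, -nII:] = A_II_small
Z = rng.standard_normal((n, N))
theta = (0.7, -1.3)
direct = Z.T @ (theta[0] * A1 + theta[1] * A2 + A_II_full) @ Z
fast = reduced_stiffness(theta, [Z.T @ A1 @ Z, Z.T @ A2 @ Z],
                         Z[-nII:, :], A_II_small)
assert np.allclose(direct, fast)
```

Only the restricted snapshots Z[-nII:, :] and the small nonaffine block must be kept on-line, which is the source of the (almost) dim(X)-independent cost claimed above.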
In the above we focus on approximation. As regards a posteriori error estimation, the nonaffine dependence of a (even locally) precludes the precomputation and linear superposition strategy required by Method I (unless domain decomposition concepts are exploited (Machiels et al. [28])); however, Method II directly extends to the locally nonaffine case.