SVD-MPE: An SVD-Based Vector Extrapolation Method of Polynomial Type

An important problem that arises in different areas of science and engineering is that of computing the limits of sequences of vectors $\{\xx_m\}$, where $\xx_m\in \C^N$, $N$ being very large. Such sequences arise, for example, in the solution of systems of linear or nonlinear equations by fixed-point iterative methods, and $\lim_{m\to\infty}\xx_m$ are simply the required solutions. In most cases of interest, however, these sequences converge to their limits extremely slowly. One practical way to make the sequences $\{\xx_m\}$ converge more quickly is to apply to them vector extrapolation methods. Two types of methods exist in the literature: polynomial type methods and epsilon algorithms. In most applications, the polynomial type methods have proved to be superior convergence accelerators. Three polynomial type methods are known, and these are the {minimal polynomial extrapolation} (MPE), the {reduced rank extrapolation} (RRE), and the {modified minimal polynomial extrapolation} (MMPE). In this work, we develop yet another polynomial type method, which is based on the singular value decomposition, as well as the ideas that lead to MPE. We denote this new method by SVD-MPE. We also design a numerically stable algorithm for its implementation, whose computational cost and storage requirements are minimal. Finally, we illustrate the use of {SVD-MPE} with numerical examples.


Introduction and background
An important problem that arises in different areas of science and engineering is that of computing limits of sequences of vectors {x m }, 1 where x m ∈ C N , the dimension N being very large in many applications.Such vector sequences arise, for example, in the numerical solution of very large systems of linear or nonlinear equations by fixed-point iterative methods, and lim m→∞ x m are simply the required solutions to these systems.One common source of such systems is the finite-difference or finiteelement discretization of continuum problems.In later chapters, we will discuss further problems that give rise to vector sequences whose limits are needed.
In most cases of interest, however, the sequences {x m } converge to their limits extremely slowly.That is, to approximate s = lim m→∞ x m , with a reasonable prescribed level of accuracy, by x m , we need to consider very large values of m.Since, the vectors x m are normally computed in the order m = 0, 1, 2, . . ., it is clear that we have to compute many such vectors until we reach one that has acceptable accuracy.Thus, this way of approximating s via the x m becomes very expensive computationally.
Nevertheless, we may ask whether we can do something with those x m that are already available, to somehow obtain new approximations to s that are better than each individual available x m .The answer to this question is in the affirmative for at least a large class of sequences that arise from fixed-point iteration of linear and nonlinear systems of equations.One practical way of achieving this is by applying to the sequence {x m } a suitable convergence acceleration method (or extrapolation method).
Of course, in case lim m→∞ x m does not exist, it seems that no use could be made of the x m .Now, if the sequence {x m } is generated by an iterative solution of a linear or nonlinear system of equations, it can be thought of as "diverging from" the solution s of this system.We call s the antilimit of {x m } in such a case.It turns out that vector extrapolation methods can be applied to such divergent sequences {x m } to obtain good approximations to the relevant antilimits, at least in some cases.Two different types of vector extrapolation methods exist in the literature: 1. Polynomial type methods: The minimal polynomial extrapolation (MPE) of Cabay and Jackson [4], the reduced rank extrapolation (RRE) of Eddy [5] and Mesina [13], and the modified minimal polynomial extrapolation (MMPE) of Brezinski [3], Pugachev [14] and Sidi, Ford, and Smith [23].

Epsilon algorithms:
The scalar epsilon algorithm (SEA) of Wynn [28] (which is actually a recursive procedure for implementing the transformation of Shanks [15]), the vector epsilon algorithm (VEA) of Wynn [29], and the topological epsilon algorithm (TEA) of Brezinski [3].
The paper by Smith, Ford, and Sidi [25] gives a review of all these methods (except MMPE) that covers the developments in vector extrapolation methods until the end of the 1970s.For up-to-date reviews of MPE and RRE, see Sidi [20] and [21].
Vector extrapolation methods are used effectively in various branches of science and engineering in accelerating the convergence of iterative methods that result from large sparse systems of equations.
All of these methods have the useful feature that their only input is the vector sequence {x m } whose convergence is to be accelerated, nothing else being needed.In most applications, however, the polynomial type methods, especially MPE and RRE, have proved to be superior convergence accelerators; they require much less computation than, and half as much storage as, the epsilon algorithms for the same accuracy.
In this work, we develop yet another polynomial type method, which is based on the singular value decomposition (SVD), as well as some ideas that lead to MPE.We denote this new method by SVD-MPE.We also design a numerically stable algorithm for its implementation, whose computational cost and storage requirements are minimal.The new method is described in the next section.In Section 3, we show how the error in the approximation produced by SVD-MPE can be estimated at zero cost in terms of the quantities already used in the construction of the approximation.In Section 4, we give a very efficient algorithm for implementing SVD-MPE.In Section 5, we derive determinant representations for the approximations produced by SVD-MPE, while in Section 6, we show that this method is a Krylov subspace method when applied to vector sequences that result from the solution of linear systems via fixed-point iterative schemes.Finally, in Section 7, we illustrate its use with two numerical examples.
Before closing, we state the (reduced version of) the well known singular value decomposition (SVD) theorem.For different proofs, we refer the reader to Golub and Van Loan [8], Horn and Johnson [11], Stoer and Bulirsch [26], and Trefethen and Bau [27], for example.
In case rank(A) = t, there holds σ i > 0, i = 1, . . ., t, and the rest of the σ i are zero.
Remark: The σ i are called the singular values of A and the v i and u i are called the corresponding right and left singular vectors of A, respectively.We also have

Development of SVD-MPE
In what follows, we use boldface lower case letters for vectors and boldface upper case letters for matrices.In addition, we will be working with general inner products (• , •) and the l 2 norms • induced by them: These are defined as follows: (2.1) Of course, the standard Euclidean inner product a * b and the l 2 norm √ a * a induced by it are obtained by letting M = I in (2.1) and L k = I in (2.2); we will denote these norms by • 2 .(We will denote by I the identity matrix in every dimension.)

Summary of MPE
We begin with a brief summary of minimal polynomial extrapolation (MPE).We use the ideas that follow to develop our new method.
Given the vector sequence {x m } in C N , we define and, for some fixed n, define the matrices U k via Clearly, there is an integer k 0 ≤ N , such that the matrices U k , k = 0, 1, . . ., k 0 − 1, are of full rank, but (Of course, this is the same as saying that {u 0 , u 1 , . . ., u k 0 −1 } is a linearly independent set, but {u 0 , u 1 , . . ., u k 0 } is not.)Following this, we pick a positive integer k < k 0 and let the vector This minimization problem can also be expressed as in min and, as is easily seen, c ′ is the standard least-squares solution to the linear system U k−1 c ′ = −u n+k , which, when k < N , is overdetermined, and generally inconsistent.With c 0 , c 1 , . . ., c k−1 determined, set c k = 1, and compute the scalars γ 0 , γ 1 , . . ., γ k via provided k j=0 c j = 0. Note that (2.9) Finally, set as the approximation to s, whether s is the limit or antilimit of {x m }.
What we have so far is only the definition (or the theoretical development) of MPE as a method.It should not be taken as an efficient computational procedure (algorithm), however.For this topic, see [18], where numerically stable and computationally and storagewise economical algorithms for MPE and RRE are designed for the case in which M = I.A well documented FORTRAN 77 code for implementing MPE and RRE in a unified manner is also provided in [18, Appendix B].

Development of SVD-MPE
We start by observing that the unconstrained minimization problem for MPE given in (2.7) can also be expressed as a superficially "constrained" minimization problem as in min For the SVD-MPE method, we replace this "constrained" minimization problem by the following actual constrained minimization problem: With c 0 , c 1 , . . ., c k determined, we again compute γ 0 , γ 1 , . . ., γ k via provided k j=0 c j = 0, noting again that Finally, we set as the SVD-MPE approximation to s, whether s is the limit or antilimit of {x m }.
Of course, the minimization problem in (2.12) has a solution for c = [c 0 , c 1 , . . ., c k ] T .Let σ min = U k c M for this (optimal) c.Lemma 2.1 that follows next gives a complete characterization of σ min and the (optimal) c.Lemma 2.1 Let σ k0 , σ k1 , . . ., σ kk be the singular values of the N × (k + 1) matrix ) and let h ki be the corresponding right singular vectors of U k , that is,

18)
Assuming that σ kk , the smallest singular value of U k , is simple, the (optimal) solution c to the minimization problem in (2.12) is unique (up to a multiplicative constant τ , |τ | = 1), and is given as in Proof.The proof is achieved by observing that, with c = L so that the problem in (2.12) becomes We leave the details to the reader.

Remarks:
1.In view of the nature of the solution for the (optimal) c involving singular values and vectors, as described in Lemma 2.1, we call this new method SVD-MPE.

Note that if rank (U
, and we therefore have σ kk > 0. 3. Of course, s n,k exists if and only if the (optimal) c = [c 0 , c 1 , . . ., c k ] T satisfies k j=0 c j = 0.In addition, by (2.13), the γ i are unique when σ kk is simple.
Before we go on to the development of our algorithm in the next section, we state the following result concerning the finite termination property of SVD-MPE, whose proof is very similar to that pertaining to MPE and RRE given in [21]: Theorem 2.2 Let s be the solution to the nonsingular linear system x = T x + d, and let {x m } be the sequence obtained via the fixed-point iterative scheme x m+1 = T x m +d, m = 0, 1, . . ., with x 0 chosen arbitrarily.If k is the degree of the minimal polynomial of T with respect to ǫ n = x n − s (equivalently, with respect to u n ), 2 then s n,k produced by SVD-MPE satisfies s n,k = s.

Error estimation
We now turn to the problem of estimating at zero cost the error s n,k − s, whether s is the limit or antilimit of {x m }.Here we assume that s is the solution to the system of equations and that the vector sequence {x m } is obtained via the fixed-point iterative scheme x 0 being the initial approximation to the solution s.

Now, if
x is some approximation to s, then a good measure of the error x − s in x is the residual vector r(x) of x, namely, This is justified since lim x→s r(x) = r(s) = 0. We consider two cases: In this case, we have and, therefore, by and thus In this case, assuming that lim m→∞ x m = s and expanding f (x m ) about s, we have 2 Given a matrix B ∈ C r×r and a nonzero vector a ∈ C r , the monic polynomial P (λ) is said to be a minimal polynomial of B with respect to a if P (B)a = 0 and if P (λ) has smallest degree.It is easy to show that the minimal polynomial P (λ) of B with respect to a exists, is unique, and divides the minimal polynomial of B, which in turn divides the characteristic polynomial of B. [Thus, the degree of P (λ) is at most r, and its zeros are some or all of the eigenvalues of B.] Moreover, if Q(B)a = 0 for some polynomial Q(λ) with deg Q > deg P , then P (λ) divides Q(λ).Concerning this subject, see Householder [12], for example.
where F (x) is the Jacobian matrix of the vector-valued function f (x) evaluated at x. Recalling that s = f (s), we rewrite this in the form from which, we conclude that the vectors x m and x m+1 satisfy the approximate equality That is, for all large m, the sequence {x m } behaves as if it were being generated by an N -dimensional approximate linear system of the form (I−T )x ≈ d through Now, we can compute U k γ M at no cost in terms of the quantities that result from our algorithm, without having to actually compute U k γ itself.Indeed, we have the following result: Theorem 3.1 Let σ kk be the smallest singular value of U k and let h kk be the corresponding right singular vector.Then the vector U k γ resulting from s n,k satisfies Proof.First, the solution to (2.12) is c = L −1/2 h kk by (2.19).Next, letting α = k j=0 c j , we have γ = c/α by (2.13).Consequently, Thus, by Lemma 2.1, we have which is the required result.

Algorithm for SVD-MPE
We now turn to the design of a good algorithm for constructing numerically the approximation s n,k that results from SVD-MPE.We note that matrix computations in floating-point arithmetic must be done with care, and this is what we would like to achieve here.
In this section, we let M = I and L k = I for simplicity.Thus, U k = U k .Since there is no room for confusion, we will also use σ i , h i , and g i to denote σ ki , h ki , and g ki , respectively.
As we have seen in Section 2, to determine s n,k , we need h k , the right singular vector of U k corresponding to its smallest singular value σ k .Now, σ k and h k can be obtained from the singular value decomposition (SVD) of U k ∈ C N ×(k+1) .Of course, the SVD of U k can be computed by applying directly to U k the algorithm of Golub and Kahan [7], for example.Here we choose to apply SVD to U k in an indirect way, which will result in a very efficient algorithm for SVD-MPE that is economical both computationally and storagewise in an optimal way.Here are the details of the computation of the SVD of U k , assuming that rank (U k ) = k + 1: 1. We first compute the QR factorization of U k in the form where and R k is upper triangular with positive diagonal elements, that is, Of course, we can carry out the QR factorizations in different ways.Here we do this by the modified Gram-Schmidt process (MGS) as follows: 1. Compute r 00 = u n 2 and q 0 = u n /r 00 .2. For j = 1, . . ., k do Set u Note that the matrices Q k and R k are obtained from Q k−1 and R k−1 , respectively, as follows: For MGS, see [8], for example.
2. We next compute the SVD of R k : By Theorem 1.1, there exist unitary matrices In addition, since R k is nonsingular by our assumption that rank (U k ) = k + 1, we have that σ i > 0 for all i.Consequently, σ k > 0.
3. Substituting (4.7) in (4.1), we obtain the following true singular value decomposition of U k : Thus, σ i , the singular values of R k , are also the singular values of U k , and h i , the corresponding right singular vectors of R k , are also the corresponding right singular vectors of U k .[Of course, the g i are corresponding left singular vectors of U k .Note that, unlike Y , H, and Σ, which we must compute for our algorithm, we do not need to actually compute G.The mere knowledge that the SVD of U k is as given in (4.8) suffices to conclude that c = h k is the required optimal solution to (2.12), and continue with the development of our algorithm.] With c = h k already determined, we next compute the γ i as in (2.13); that is, provided k j=0 c j = 0. Next, by the fact that and by (2.14), we can re-express s n,k in (2.15) in the form where the ξ i are computed from the γ j as in Making use of the fact that where the q i and the r ij are exactly those that feature in (4.2) and (4.3), we next rewrite (4.10) as in Thus, the computation of s n,k can be carried out economically as in Of course, Q k−1 η is best computed as a linear combination of the columns of Q k−1 , hence (4.14) is computed as in It is clear that, for the computation in (4.14) and (4.15), we need to save both Q k and R k .This completes the design of our algorithm for implementing SVD-MPE.For convenience, we provide a systematic description of this algorithm in Table 4.1.
Note that the input vectors x n+i , i = 1, . . ., k + 1, need not be saved; actually, they are overwritten by u n+i , i = 0, 1, . . ., k, as the latter are being computed.As is clear from the description of MGS given above, we can overwrite the matrix U k simultaneously with the computation of Q k and R k , the vector q n+j overwriting u n+j as soon as it is computed, j = 0, 1, . . ., k; that is, at any stage of the QR factorization, we store k+2 N -dimensional vectors in the memory.Since N >> k in our applications, the storage requirement of the (k + 1) × (k + 1) matrix R k is negligible.So is the cost of computing the SVD of R k , and so is the cost of computing the (k + 1)-dimensional vector η.Thus, for all practical purposes, the computational and storage requirements of SVD-MPE are the same as those of MPE.
Remark: If we were to compute the SVD of U k , namely, U k = GΣH * , directly, and not by (i) first carrying out the QR factorization of U k as U k = Q k R k , and (ii) then computing the SVD of R k as R k = Y ΣH * , then we would need to waste extra resources in carrying out the computation of We would either have to save U k while computing the matrix G in its singular value decomposition.Thus, we would need to save two N × (k + 1) matrices, namely, U k and G in core memory simultaneously.
2. In case we are worried about storage, therefore, do not wish to save U k , we need to recompute the vectors x n , x n+1 , . . ., x n+k in order to compute s n,k .Thus, we would need to compute these vectors twice to complete the determination of s n,k .
Thus, the approach we have proposed here for carrying out the singular value decomposition of U k enables us to save extra computing and storage very conveniently.

Determinant representations for SVD-MPE
In [16] and [23], determinant representations were derived for the vectors s n,k that are produced by the vector extrapolation methods MPE, RRE, MMPE, and TEA.These representations have turned out to be very useful in the analysis of the algebraic and analytic properties of these methods.In particular, they were used for obtaining interesting recursion relations among the s n,k and in proving sharp convergence and stability theorems for them.We now derive two analogous determinant representations for s n,k produced by SVD-MPE.
The following lemma, whose proof can be found in [23, Section 3], will be used in this derivation in Theorem 5.2.Lemma 5.1 Let u i,j and γ j be scalars and let the γ j satisfy the linear system Then, whether v j are scalars or vectors, there holds where provided D(1, 1, . . ., 1) = 0.In case the v i are vectors, the determinant D(v 0 , v 1 , . . ., v k ) is defined via its expansion with respect to its first row.
For convenience of notation, we will write Then L k−1 is the principal submatrix of L k obtained by deleting the last row and the last column of L k .In addition, L k−1 is hermitian positive definite just like L k .
Theorem 5.2 that follows gives our first determinant representation for s n,k resulting from SVD-MPE and is based only on the smallest singular value σ kk of U k and the corresponding right singular vector h kk .
Proof.With U k as in (2.16), we start by rewriting (2.18) in the form Theorem 5.3 Let U k be as in (2.16), and let be the singular value decomposition of U k ; that is, where D(v 0 , v 1 , . . ., v k ) is the (k+1)×(k+1) determinant defined as in (5.3) in Lemma 5.1 with the u i,j as in (5.11).
Therefore, Lemma 5.1 applies with u i,j as in (5.11), and the result follows.

SVD-MPE for linearly generated sequences
In Sidi [17], we discussed the connection of the extrapolation methods MPE, RRE, and TEA with Krylov subspace methods.We now want to extend the treatment of [17] to SVD-MPE.Here we recall that a Krylov subspace method is also a projection method and that a projection method is defined uniquely by its right and left subspaces. 4In the next theorem, we show that SVD-MPE is a bone fide Krylov subspace method and we identify its right and left subspaces.
Proof.With the x m generated as above, we have and Upon substituting T = I − C in this equality, we obtain (6.1).
To prove (6.2), we first recall that U k γ = r(s k ) by (3.1).By this and by (5.15), the result in (6.2) follows. 4A projection method for the solution of the linear system Cx = d, where C ∈ C N×N , is defined as follows: Let Y and Z be k-dimensional subspaces of C N and let x0 be a given vector in C N .Then the projection method produces an approximation s k to the solution of Cx = d as follows: (i) s k = x0 + y, y ∈ Y, (ii) h * r(s k ) = 0 for every h ∈ Z. Y and Z are called, respectively, the right and left subspaces of the method.If Y is the Krylov subspace K k (C; r0) = span{r0, Cr0, . . ., C k−1 r0}, where r0 = d − Cx0, then the projection method is called a Krylov subspace method.

Numerical examples
We now provide two examples that show the performance of SVD-MPE and compare it with MPE.In both examples, SVD-MPE and MPE are implemented with the standard Euclidean inner product and the norm induced by it.Thus, M = I and L k = I throughout.
As we have already mentioned, a major application area of vector extrapolation methods is that of numerical solution of large systems of linear or nonlinear equations ψ(x) = 0 by fixed-point iterations x m+1 = f (x m ).[Here x = f (x) is a possibly preconditioned form of ψ(x) = 0.] For SVD-MPE, as well as all other polynomial methods discussed in the literature, the computation of the approximation s n,k to s, the solution of ψ(x) = 0, requires k + 1 of the vectors x m to be stored in the computer memory.For systems of very large dimension N , this means that we should keep k at a moderate size.In view of this limitation, a practical strategy for systems of equations is cycling, for which n and k are fixed.Here are the steps of cycling: C0.Choose integers n ≥ 0, and k ≥ 1, and an initial vector x 0 .
C3.If s n,k satisfies accuracy test, stop.Otherwise, set x 0 = s n,k , and go to Step C1.
We will call each application of steps C1-C3 a cycle, and denote by s (r) n,k the s n,k that is computed in the rth cycle.We will also denote the initial vector x 0 in step C0 by s , and is symmetric with respect to both main diagonals, and T ∈ R N ×N The vector d is such that the exact solution to x = T x + d is s = [1, 1, . . ., 1] T .We have ρ(T ) < 1, so that {x m } converges to s.
Figure 1 shows the l 2 norms of the errors in s n,k , n = 0, 1, . . ., with k = 5 fixed.Here N = 100.Note that all of the approximations s n,5 make use of the same (infinite) vector sequence {x m }, and, practically speaking, we are looking at how the methods behave as n → ∞.It is interesting to see that SVD-MPE and MPE behave almost the same.Although we have a rigorous asymptotic theory confirming the behavior of MPE in this mode as observed in Figure 1 (see [16], [19]- [22]), we do not have any such theory for SVD-MPE at the present.The equation is discretized on a square grid by approximating u xx , u yy , u x , and u y by centered differences with truncation errors O(h 2 ).Thus, letting h = 1/ν, and (x i , y j ) = (ih, jh), i, j = 0, 1, . . ., ν, and u x (x i , y j ) ≈ u(x i+1 , y j ) − u(x i−1 , y j ) 2h , u y (x i , y j ) ≈ u(x i , y j+1 ) − u(x i , y j−1 ) 2h , and −∇ 2 u(x i , y j ) ≈ 4u(x i , y j ) − u(x i+1 , y j ) − u(x i−1 , y j ) − u(x i , y j+1 ) − u(x i , y j−1 ) h 2 , we replace the differential equation by the finite difference equations 4u ij − u i+1,j − u i−1,j − u i,j+1 − u i,j−1 h 2 + Cu ij u i+1,j − u i−1,j + u i,j+1 − u i,j−1 2h = f (x i , y j ), 1 ≤ i, j ≤ ν − 1, with u 0,j = u i,0 = u ν,j = u i,ν = 0 ∀ i, j.
Here u ij is the approximation to u(x i , y j ), as usual.
We first write the finite difference equations in a way that is analogous to the PDE written in the form −∇ 2 u = f − Cu(u x + u y ), and split the matrix representing −∇ 2 to enable the use of the Jacobi and Gauss-Seidel methods as the iterative procedures to generate the sequences {x m }.
Figures 3 and 4 show the l 2 norms of the errors in s n,k from SVD-MPE and MPE in the cycling mode with n = 0 and k = 20, the iterative procedures being, respectively, that of Jacobi and that of Gauss-Seidel for the linear part −∇ 2 u of the PDE.Here ν = 100, so that the number of unknowns (the dimension) is N = 99 2 . )

Theorem 6 . 1
Let s be the unique solution to the linear system Cx = d, which we express in the form(I − T )x = d ⇒ x = T x + d; T = I − C,and let the vector sequence {x m } be produced by the fixed-point iterative schemex m+1 = T x m + d, m = 0, 1, . . ..Define the residual vector of x via r(x) = d − Cx.Let also s k ≡ s 0,k be the approximation to s produced by SVD-MPE.Then the following are true:

Figure 2 Example 7 . 2
Figure 2 shows the l 2 norm of the error in s n,k in the cycling mode with n = 0 and k = 20.Now N = 1000.

Figure 2 :
Figure 2: l 2 norm of error in s 0,20 in the cycling mode, from MPE and SVD-MPE, for Example 7.1 with N = 1000.