A New Type of Restarted Krylov Methods

In this paper we present a new type of Restarted Krylov methods for calculating peripheral eigenvalues of symmetric matrices. The new framework avoids the Lanczos tridiagonalization process and the use of polynomial filtering. This simplifies the restarting mechanism and allows the introduction of several modifications. Convergence is assured by a monotonicity property that pushes the eigenvalues toward their limits. The Krylov matrices that we use lead to a fast rate of convergence. Numerical experiments illustrate the usefulness of the proposed approach.


Introduction
In this paper we present a new type of Restarted Krylov methods. Given a symmetric matrix $G \in \mathbb{R}^{n \times n}$, the method is aimed at calculating a cluster of k exterior eigenvalues of G. Other names for such eigenvalues are "peripheral eigenvalues" and "extreme eigenvalues". The method is best suited for handling large sparse matrices in which a matrix-vector product needs only O(n) flops.
Another underlying assumption is that $k^2$ is considerably smaller than n. The need for computing a few extreme eigenvalues of such a matrix arises in many applications, see [1]-[21].
The traditional restarted Krylov methods are classified into three types of restarts: "Explicit restart" [1], [9], [11], "Implicit restart" [1], [3], [12], and "Thick restart" [18], [19]. See also [7], [11], [14], [16]. When solving symmetric eigenvalue problems, all these methods are carried out by repeated use of the Lanczos tridiagonalization process, together with polynomial filtering to determine the related starting vectors. This way each iteration generates a new tridiagonal matrix and computes its eigensystem. The method proposed in this paper is based on a different framework, one that avoids these techniques. The basic iteration of the new method has recently been presented by this author in [4]. The driving force is a monotonicity property that pushes the estimated eigenvalues toward their limits. The rate of convergence depends on the quality of the Krylov matrix that we use. In this paper we introduce a modified scheme for generating the Krylov matrix. This leads to a dramatic improvement in the rate of convergence, and turns the method into a powerful tool.
Let the eigenvalues of G be ordered to satisfy
$\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_n$.
Then the new algorithm is built to compute one of the following four types of target clusters that contain k extreme eigenvalues: a right-side cluster, which consists of the largest eigenvalues $\lambda_1, \lambda_2, \ldots$; a left-side cluster, which consists of the smallest eigenvalues $\ldots, \lambda_{n-1}, \lambda_n$; a two-sides cluster, which is a union of a right-side cluster and a left-side cluster; and a dominant cluster, which consists of the eigenvalues of largest absolute value. Note that although the above definitions refer to clusters of eigenvalues, the algorithm is carried out by computing the corresponding k eigenvectors of G.
The subspace that is spanned by these eigenvectors is called the target space.

The basic iteration
The qth iteration, $q = 0, 1, 2, \ldots$, is composed of the following five steps. The first step starts with a matrix $V_q \in \mathbb{R}^{n \times k}$ that contains "old" information on the target space, a matrix $Y_q \in \mathbb{R}^{n \times \ell}$ that contains "new" information, and a matrix $X_q = [\,V_q, Y_q\,] \in \mathbb{R}^{n \times p}$ that includes all the known information. The matrix $X_q$ has $p = k + \ell$ orthonormal columns. That is, $X_q^T X_q = I$. (Typical values for $\ell$ lie between k and 2k.)
Step 1: Extract the current estimate of the target space. Compute the Rayleigh quotient matrix $S_q = X_q^T G X_q$, compute its eigenvalues and eigenvectors, and let $V_{q+1} \in \mathbb{R}^{n \times k}$ contain the k Ritz vectors that correspond to the target cluster.
Step 2: Collecting new information. Compute a Krylov matrix $B_q \in \mathbb{R}^{n \times \ell}$ that contains new information on the target space.
Step 3: Orthogonalize the columns of $B_q$ against the columns of $V_{q+1}$. There are several ways to achieve this task. In exact arithmetic the resulting matrix, $Z_q$, satisfies the Gram-Schmidt formula $Z_q = (I - V_{q+1} V_{q+1}^T) B_q$.
Step 4: Build an orthonormal basis of Range($Z_q$). Compute a matrix $Y_{q+1}$ whose columns form an orthonormal basis of Range($Z_q$). This can be done by a QR factorization of $Z_q$. (If rank($Z_q$) is smaller than $\ell$, then $\ell$ is temporarily reduced to be rank($Z_q$).)
Step 5: Define $X_{q+1}$ by the rule $X_{q+1} = [\,V_{q+1}, Y_{q+1}\,]$. In the next section it is shown that each iteration gives a better approximation of the target cluster.
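To make the flow of these five steps concrete, the following Python sketch carries out one iteration of the scheme. It is an illustration only: the function name basic_iteration, the use of NumPy's eigh and qr routines, and the assumption that the target cluster consists of the k largest Ritz values are choices made for this sketch rather than prescriptions of the paper.

import numpy as np

def basic_iteration(G, X, k, build_krylov):
    # Step 1: Rayleigh-Ritz extraction. Keep the k Ritz vectors that
    # correspond to the target cluster (here: the k largest Ritz values).
    S = X.T @ G @ X
    theta, W = np.linalg.eigh(S)          # eigenvalues in ascending order
    V = X @ W[:, -k:]                     # Ritz vectors of the target cluster
    # Step 2: collect new information on the target space.
    B = build_krylov(G, V)
    # Step 3: orthogonalize the new columns against V (the Gram-Schmidt formula).
    Z = B - V @ (V.T @ B)
    # Step 4: build an orthonormal basis of Range(Z) by a QR factorization.
    Y, _ = np.linalg.qr(Z)
    # Step 5: restart with all the known information.
    return np.hstack([V, Y]), theta[::-1][:k]

Here build_krylov stands for any routine that returns an n-by-$\ell$ matrix of new search directions, such as the Krylov matrices described in the following sections.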

The Monotonicity Property
The monotonicity property is an important feature of the new iteration; its proof is given in [4]. Yet, in order to make this paper self-contained, we provide the proof below. We start by stating two well-known interlacing theorems, e.g., [8], [10] and [20].
Theorem 1 (the Cauchy interlace theorem) relates the eigenvalues of G to those of its principal submatrices. In particular, for $k = n - 1$ we have the interlacing relations
$\lambda_1 \ge \theta_1 \ge \lambda_2 \ge \theta_2 \ge \cdots \ge \lambda_{n-1} \ge \theta_{n-1} \ge \lambda_n$,
where $\theta_1 \ge \theta_2 \ge \cdots \ge \theta_{n-1}$ denote the eigenvalues of a principal submatrix of order $n - 1$. Corollary 2 extends this result to matrices of the form $H = Q^T G Q$, where $Q \in \mathbb{R}^{n \times k}$ has k orthonormal columns. Let such a matrix H have the eigenvalues
$\theta_1 \ge \theta_2 \ge \cdots \ge \theta_k$. (2.2)
Then the eigenvalues of H and G satisfy
$\lambda_i \ge \theta_i \ \text{ for } \ i = 1, \ldots, k$, (2.3)
and
$\theta_{k-i+1} \ge \lambda_{n-i+1} \ \text{ for } \ i = 1, \ldots, k$. (2.4)
Let us turn now to consider the qth iteration of the new method. Assume first that the algorithm is aimed at computing a cluster of k right-side eigenvalues of G, and let the eigenvalues of the matrix $S_q = X_q^T G X_q$ be denoted by $\theta_1^{(q)} \ge \theta_2^{(q)} \ge \cdots \ge \theta_p^{(q)}$. The columns of $V_{q+1}$ are the Ritz vectors that correspond to $\theta_1^{(q)}, \ldots, \theta_k^{(q)}$, so the matrix $V_{q+1}^T G V_{q+1}$ has these k values as its eigenvalues. Since Range($V_{q+1}$) is contained in Range($X_{q+1}$), applying Corollary 2 with $S_{q+1} = X_{q+1}^T G X_{q+1}$ in the role of G gives $\theta_i^{(q+1)} \ge \theta_i^{(q)}$ for $i = 1, \ldots, k$. On the other hand, from Corollary 2 we obtain that $\lambda_i \ge \theta_i^{(q+1)}$. Hence by combining these relations we see that
$\lambda_i \ge \theta_i^{(q+1)} \ge \theta_i^{(q)} \ \text{ for } \ i = 1, \ldots, k$. (2.6)
The treatment of a left-side cluster is done in a similar way. Assume that the algorithm is aimed at computing a cluster of k left-side eigenvalues of G. Then similar arguments show that
$\lambda_{n-k+i} \le \theta_{p-k+i}^{(q+1)} \le \theta_{p-k+i}^{(q)} \ \text{ for } \ i = 1, \ldots, k$. (2.7)
Recall that a two-sides cluster is the union of a right-side cluster and a left-side one. In this case the eigenvalues of $S_q$ that correspond to the right side satisfy (2.6), while the eigenvalues of $S_q$ that correspond to the left side satisfy (2.7).
A similar situation occurs in the computation of a dominant cluster, since a dominant cluster is either a right-side cluster, a left-side cluster, or a two-sides cluster.
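The monotonicity relations are easy to check numerically. The following sketch checks, on a random example in which the matrix, the subspace dimensions, and the tolerances are arbitrary illustrative choices, that enlarging the trial subspace does not decrease the k largest Ritz values and that these values never exceed the corresponding eigenvalues of G.

import numpy as np

# A numerical sanity check of (2.6) on a random symmetric matrix:
# enlarging the trial subspace from Range(V) to Range(X) cannot decrease
# the k largest Ritz values, and interlacing keeps them below the
# corresponding eigenvalues of G.
rng = np.random.default_rng(0)
n, k, ell = 200, 5, 5
A = rng.standard_normal((n, n))
G = (A + A.T) / 2

V, _ = np.linalg.qr(rng.standard_normal((n, k)))                    # plays the role of V_{q+1}
X, _ = np.linalg.qr(np.hstack([V, rng.standard_normal((n, ell))]))  # Range(X) contains Range(V)

theta_V = np.sort(np.linalg.eigvalsh(V.T @ G @ V))[::-1]
theta_X = np.sort(np.linalg.eigvalsh(X.T @ G @ X))[::-1][:k]
lam = np.sort(np.linalg.eigvalsh(G))[::-1][:k]

assert np.all(theta_X >= theta_V - 1e-10)   # the Ritz values do not decrease
assert np.all(lam >= theta_X - 1e-10)       # and they stay below the eigenvalues of G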

The Basic Krylov Matrix
The basic Krylov information matrix has the form $B_q = [\,b_1, b_2, \ldots, b_\ell\,]$, where the sequence $b_1, b_2, \ldots$ is a Krylov sequence whose starting vector may be taken to be $e$, a vector of ones. Note that there is no point in including the starting vector itself as a column, since in the next step $B_q$ is orthogonalized against the columns of $V_{q+1}$. The proof of the Kaniel-Paige-Saad bounds relies on the properties of Chebyshev polynomials, while the building of these polynomials is carried out by using a three term recurrence relation, e.g., [10], [11]. This observation suggests that in order to achieve these bounds the algorithm for generating our Krylov sequence should use a "similar" three term recurrence relation. Indeed this feature is one of the reasons that make the Lanczos recurrence so successful, see ([11], p. 138). Below we describe an alternative three term recurrence relation, which is based on the Modified Gram-Schmidt (MGS) orthogonalization process.
Let $r \in \mathbb{R}^n$ be a given vector and let $q \in \mathbb{R}^n$ be a unit length vector. That is, $\|q\|_2 = 1$. The computation of $B_q = [\,b_1, b_2, \ldots, b_\ell\,]$ is carried out as follows.
The preparations part:
a) Compute the starting vector: $b_1$ is obtained by orthogonalizing $r$ against $q$ and normalizing the result.
b) Compute $b_2$ by orthogonalizing $Gb_1$ against $q$ and $b_1$, and normalizing the result.
The iterative part:
For $j = 3, \ldots, \ell$, compute $b_j$ as follows: orthogonalize $Gb_{j-1}$ against $b_{j-1}$ and $b_{j-2}$ in the MGS manner, apply a reorthogonalization step against the former columns when needed, and normalize the result.
The reorthogonalization step is aimed to ensure that the numerical rank of $B_q$ will stay close to $\ell$. Yet for small values of $\ell$ it is not essential.
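The recurrence can be sketched in Python as follows. The sketch follows the structure described above, with q a unit vector (for instance a current Ritz vector) and r a given direction; the optional reorthogonalization against all previous columns and the other implementation details are assumptions made for illustration, not the exact formulas of [4].

import numpy as np

def mgs_krylov(G, q, r, ell, reorth=True):
    # Build B_q = [b_1, ..., b_ell] with a three term recurrence based on
    # Modified Gram-Schmidt: each new vector G b_{j-1} is orthogonalized
    # only against its two predecessors, as in the Lanczos recurrence.
    n = G.shape[0]
    assert ell >= 2
    B = np.zeros((n, ell))
    # Preparations: b_1 from r, orthogonalized against q.
    b = r - q * (q @ r)
    B[:, 0] = b / np.linalg.norm(b)
    # b_2 from G b_1, orthogonalized against q and b_1.
    b = G @ B[:, 0]
    for u in (q, B[:, 0]):
        b = b - u * (u @ b)
    B[:, 1] = b / np.linalg.norm(b)
    # Iterative part: the three term recurrence for j = 3, ..., ell.
    for j in range(2, ell):
        b = G @ B[:, j - 1]
        for u in (B[:, j - 1], B[:, j - 2]):
            b = b - u * (u @ b)
        if reorth:                               # optional reorthogonalization step
            b = b - B[:, :j] @ (B[:, :j].T @ b)
        B[:, j] = b / np.linalg.norm(b)
    return B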

The Power-Krylov Matrix
Assume for a moment that the algorithm is aimed at calculating a cluster of k dominant eigenvalues. Then the Kaniel-Paige-Saad bounds suggest that a slow rate of convergence is expected when these eigenvalues are poorly separated from the other eigenvalues. Indeed, this difficulty is seen in Table 2, when the basic Krylov matrix is applied to problems like "Very slow geometric" and "Dense equispaced". Let $\nu \ge 2$ be a small integer. Then the larger eigenvalues of the powered matrix, $G^\nu$, are better separated than those of G. This suggests that a faster rate of convergence can be gained by replacing G with $G^\nu$. The choice of $\nu$ depends on the cost of a matrix-vector product and on the distribution of the eigenvalues. As noted above, this replacement is expected to reduce the number of iterations when the k largest eigenvalues of G are poorly separated from the rest of the spectrum. See Table 3. Another advantage of this approach is that $\ell$ is kept small. Note that although the computational effort per iteration increases, a smaller portion of time is spent on orthogonalizations and on the Rayleigh-Ritz procedure.
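A short Python sketch illustrates both points: $G^\nu$ is applied through $\nu$ matrix-vector products rather than being formed explicitly, and powering widens the relative gap between the leading eigenvalues. The test spectrum and the value of $\nu$ below are illustrative choices, not the matrices used in the experiments.

import numpy as np

def powered_matvec(G, x, nu):
    # Apply G^nu to x with nu matrix-vector products; G^nu is never formed.
    for _ in range(nu):
        x = G @ x
    return x

rng = np.random.default_rng(1)
lam = np.linspace(0.0, 1.0, 50)                   # equispaced, poorly separated eigenvalues
Q, _ = np.linalg.qr(rng.standard_normal((50, 50)))
G = Q @ np.diag(lam) @ Q.T
nu = 3

y = powered_matvec(G, rng.standard_normal(50), nu)   # one column of a power-Krylov matrix

gap_1  = (lam[-1] - lam[-2]) / (lam[-1] - lam[0])                   # relative gap of G
gap_nu = (lam[-1]**nu - lam[-2]**nu) / (lam[-1]**nu - lam[0]**nu)   # relative gap of G^nu
print(gap_1, gap_nu)   # for this spectrum the gap of G^nu is roughly nu times larger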

The Initial Orthonormal Matrix
To start the algorithm we need to supply an "initial" orthonormal matrix, $X_0 \in \mathbb{R}^{n \times p}$. A natural choice is to orthonormalize a Krylov matrix that is built as above, with $e$, a vector of ones, as the starting vector. Yet a random starting vector is equally good. In the next section we shall see that the above choice of $X_0$ is often sufficient to provide accurate estimates of the desired eigenpairs.
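A minimal sketch of this construction, assuming that $X_0$ is obtained by orthonormalizing a simple power basis started from a vector of ones (or from any supplied vector):

import numpy as np

def initial_orthonormal_matrix(G, p, start=None):
    # Orthonormalize the p-column Krylov matrix [v, Gv, ..., G^{p-1} v]
    # to obtain X_0; a random starting vector works equally well.
    n = G.shape[0]
    v = np.ones(n) if start is None else start
    K = np.empty((n, p))
    for j in range(p):
        v = v / np.linalg.norm(v)
        K[:, j] = v
        v = G @ v
    X0, _ = np.linalg.qr(K)
    return X0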

Numerical Experiments
In this section we describe some experiments with the proposed methods. Table 2 and Table 3 provide the number of iterations that were needed to satisfy the stopping criterion. Thus, for example, from Table 2 we see that only 4 iterations are needed when the algorithm computes k = 200 eigenvalues of the Equispaced test matrix.
The ability of the basic Krylov matrix to achieve accurate computation of the eigenvalues is illustrated in Table 2. We see that often the algorithm terminates within a remarkably small number of iterations. Observe that the method is highly successful in handling low-rank matrices and matrices which are nearly low-rank. In such matrices the initial orthonormal matrix, $X_0$, is sufficient to provide accurate eigenvalues. Note also that the method has no difficulty in computing multiple eigenvalues.
As expected, a slower rate of convergence occurs when the dominant eigenvalues that we seek are poorly separated from the other eigenvalues. This situation is demonstrated in matrices like "Dense equispaced" or "Very slow geometric". Yet, as Table 3 shows, in such cases the Power-Krylov method leads to a considerable reduction in the number of iterations.

Concluding Remarks
The new type of Restarted Krylov methods avoids the use of the Lanczos algorithm.
This simplifies the basic iteration and clarifies the main ideas behind the method. The proof of the monotonicity property indicates two important points. First, there is a lot of freedom in choosing the information matrix: monotonicity is guaranteed as long as the orthogonalizations are carried out properly. Second, the rate of convergence depends on the "quality" of the information matrix. This raises the question of how to define this matrix. Since the algorithm is aimed at computing a cluster of exterior eigenvalues, a Krylov information matrix is a good choice. In [4] we have tested the classical Krylov matrix, in which $b_j$ is obtained by normalizing $Gb_{j-1}$. However, as shown in [5], this matrix suffers from certain deficiencies. In this paper the Krylov basis is built by a three term recurrence relation, which leads to a dramatic reduction in the number of iterations.
Indeed, the results of our experiments are quite encouraging. We see that the algorithm requires a remarkably small number of iterations. In particular, it efficiently handles various kinds of low-rank matrices. In these matrices the initial orthonormal matrix is often sufficient for accurate computation of the desired eigenpairs. The algorithm is also successful in computing eigenvalues of "difficult" matrices like "Dense equispaced" or "Very slow geometric decay".