Gershgorin and Rayleigh Bounds on the Eigenvalues of the Finite-Element Global Matrices via Optimal Similarity Transformations
1. Introduction
Knowledge, even approximate, of the extremal eigenvalues of the large, positive definite, global stiffness matrix generated by the finite-element method (see [1]) is essential for assessing the correctness of the numerical solution of the global algebraic system of equations set up by this variational method. Other numerical procedures related to this large algebraic system, such as iterative solution methods, are also greatly affected by the condition number of the matrix, the ratio of its largest to its smallest eigenvalue (see [2]). In this note we examine, theoretically and numerically, various practical procedures for the a-posteriori, ever sharper, bounding of the eigenvalues of the large global stiffness matrices generated by the finite-element method, predominantly based on a preparatory, most favorable, similarity transformation. Special attention is paid in this note to the common class of matrices having non-positive off-diagonal entries (see [3] [4] [5]). For such matrices, application of the Gershgorin bounding method may be most conveniently carried out by a mere matrix-vector multiplication, which may be, in turn, very effectively carried out on the element level (see [6]).
2. Gershgorin’s Eigenvalue Bounds
The utterly simple Gershgorin and Rayleigh eigenvalue bound theorems (see [7] [8]) are of such fundamental importance in computational linear algebra that we find it worthwhile to repeat them here in full.
Throughout this paper we denote a number by a lower case Greek letter, a vector by a lower case Roman letter, and a matrix by an upper case Roman letter.
The eigenvalue spectrum $\lambda_1, \lambda_2, \ldots, \lambda_n$ of a real symmetric matrix A is real. For such a matrix Gershgorin’s theorem assumes the simpler form.
Theorem (Gershgorin). Let $A = A^T$ be symmetric, and so of a real eigensystem. Then every eigenvalue of A lies in, or on the end points of, at least one of the intervals
$a_{kk} - \sum_{j \ne k} |a_{kj}| \le \lambda \le a_{kk} + \sum_{j \ne k} |a_{kj}|, \quad k = 1, 2, \ldots, n$ (1)
Namely,
$|\lambda - a_{kk}| \le \sum_{j \ne k} |a_{kj}|$ (2)
for every eigenvalue $\lambda$ of A.
Proof. Let $\lambda$ be an eigenvalue of A and $x$ the corresponding eigenvector, so that $Ax = \lambda x$. Assume that for this $\lambda$ the kth component of $x$, $x_k$, is largest in magnitude and is normalized so that $x_k = 1$ and $|x_j| \le 1$, $j \ne k$. The kth equation of $Ax = \lambda x$ is then
$a_{kk} + \sum_{j \ne k} a_{kj} x_j = \lambda$ (3)
and
$|\lambda - a_{kk}| = \Big|\sum_{j \ne k} a_{kj} x_j\Big| \le \sum_{j \ne k} |a_{kj}|$ (4)
since $|x_j| \le 1$. We do not know what k is, but are sure that $\lambda$ certainly lies in one of these intervals. □
Gershgorin’s theorem readily provides outside bounds on all eigenvalues of matrix A.
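In code, the theorem is only a few lines. The following sketch (the function name `gershgorin_bounds` and the small test matrix are ours, for illustration) returns the outer Gershgorin enclosure of the spectrum of a symmetric matrix:

```python
import numpy as np

def gershgorin_bounds(A):
    """Outer bounds on the eigenvalues of symmetric A from the Gershgorin
    intervals [a_kk - r_k, a_kk + r_k], with r_k = sum_{j != k} |a_kj|."""
    d = np.diag(A)
    r = np.sum(np.abs(A), axis=1) - np.abs(d)   # off-diagonal row radii
    return (d - r).min(), (d + r).max()

# A small symmetric test matrix: every eigenvalue must lie within the bounds.
A = np.array([[ 2.0, -1.0,  0.0],
              [-1.0,  2.0, -1.0],
              [ 0.0, -1.0,  2.0]])
lo, hi = gershgorin_bounds(A)
eigs = np.linalg.eigvalsh(A)
```

For this matrix the enclosure is $[0, 4]$, containing the exact eigenvalues $2 - \sqrt{2}$, $2$, $2 + \sqrt{2}$.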
3. Positive-Definite Matrices with Non-Positive Off-Diagonal Entries
Positive definite and symmetric global stiffness matrices of a positive diagonal and non-positive off-diagonal entries are common in the finite-element, or finite-difference, method. For such matrices the application of the Gershgorin theorem simplifies into a mere matrix-vector multiplication, efficiently carried out as a summation of such multiplications on the element level of the global mesh (see [6]).
Say matrix A is symmetric, $A = A^T$, and positive (semi) definite, and of non-positive off-diagonal entries; then obviously
$\min_k (Ae)_k \le \lambda_1 \le \lambda_n \le \max_k \big(2 a_{kk} - (Ae)_k\big), \quad e = [1, 1, \ldots, 1]^T$ (5)
where $\lambda_1$ is the smallest eigenvalue of matrix A, and $\lambda_n$ the largest. If matrix A is symmetric and positive definite, then $\lambda_1 > 0$. For example
(6)
Gershgorin’s theorem correctly predicts here that matrix A is, at least, positive semi-definite, but it fails to ascertain that it is actually positive definite, with a positive lowest eigenvalue. To obtain a more realistic, nontrivial, lower bound on the lowest eigenvalue of matrix A we resort to similarity transformations designed to rescue the lower bound from triviality.
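The matrix-vector shortcut can be sketched as follows: for non-positive off-diagonal entries the absolute values drop out, and the lower Gershgorin bound is simply the smallest entry of $Ae$ (the small test matrix below is our illustration):

```python
import numpy as np

# For a symmetric matrix with non-positive off-diagonal entries,
#   a_kk - sum_{j!=k} |a_kj| = a_kk + sum_{j!=k} a_kj = (A e)_k,  e = [1,...,1]^T,
# so one matrix-vector multiplication yields the Gershgorin lower bound.
A = np.array([[ 2.0, -1.0,  0.0],
              [-1.0,  2.0, -1.0],
              [ 0.0, -1.0,  1.0]])
e = np.ones(3)
lower = (A @ e).min()          # Gershgorin lower bound on the lowest eigenvalue
lam = np.linalg.eigvalsh(A)
```

Here the bound is the trivial `lower = 0.0`, even though the matrix is in fact positive definite, illustrating the failure discussed above.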
4. Rayleigh Quotient
Let matrix
be real and symmetric. If
is an eigenvector of A for corresponding eigenvalue
,
, then
, or
. The rational function, for variable vector
(7)
is the Rayleigh quotient of matrix A. It has some interesting properties of great practical and theoretical reach.
First property: The quotient $\rho(x)$ produces very accurate eigenvalue approximations from even not so good eigenvector approximations. Indeed, let $v$ be an eigenvector corresponding to some eigenvalue $\lambda$, and let
$x = v + e$ (8)
where $e$ is the error vector. Then
$\rho(x) = \dfrac{(v + e)^T A (v + e)}{(v + e)^T (v + e)}$ (9)
or
$\rho(x) = \dfrac{\lambda v^T v + 2 \lambda v^T e + e^T A e}{v^T v + 2 v^T e + e^T e}$ (10)
since $Av = \lambda v$, implying that also $v^T A = \lambda v^T$. Thus, here
$\rho(x) = \lambda + O(\|e\|^2)$ (11)
Rayleigh quotient $\rho(x)$ produces a very accurate, $O(\|e\|^2)$, approximation to an eigenvalue from an approximation $x$ to the corresponding eigenvector that is not very accurate, only $O(\|e\|)$. Conversely, a very good approximation to an eigenvalue need not come from a very good approximation to an eigenvector.
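The quadratic accuracy of this first property is easy to witness numerically. In the following sketch (our example) a diagonal matrix makes the exact eigenpairs obvious, and shrinking the eigenvector error by $10^2$ shrinks the Rayleigh-quotient error by roughly $10^4$:

```python
import numpy as np

# The Rayleigh quotient turns an O(eps) eigenvector error into an
# O(eps^2) eigenvalue error (illustrative sketch).
rho = lambda A, x: (x @ A @ x) / (x @ x)

A = np.diag([1.0, 3.0, 7.0])          # eigenvalues 1, 3, 7; eigenvectors e_i
v = np.array([1.0, 0.0, 0.0])         # exact eigenvector for lambda = 1
err = []
for eps in (1e-2, 1e-4):
    x = v + eps * np.array([0.0, 1.0, 1.0])   # perturbed eigenvector
    err.append(abs(rho(A, x) - 1.0))          # eigenvalue error, about 8*eps^2
```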
Second property: If $A = A^T$, then for any vector $x \ne 0$
$\lambda_1 \le \rho(x) \le \lambda_n$ (12)
The Rayleigh quotient provides inner bounds on all eigenvalues of symmetric matrix A. Hence the Rayleigh bounds and the Gershgorin bounds complement each other.
Theorem (Rayleigh). Let the eigenvalues of $A = A^T$ be arranged in the ascending order $\lambda_1 \le \lambda_2 \le \cdots \le \lambda_n$, with orthogonal eigenvectors $v_1, v_2, \ldots, v_n$. Then, for $x^T v_n = 0$,
$\lambda_1 \le \rho(x) \le \lambda_{n-1}$ (13)
with the lower equality holding if and only if $x = v_1$, and the upper equality holding if and only if $x = v_{n-1}$. Also, for $x^T v_1 = 0$,
$\lambda_2 \le \rho(x) \le \lambda_n$ (14)
with the lower equality holding if and only if $x = v_2$, and the upper if and only if $x = v_n$. The two inequalities reduce to
$\lambda_1 \le \rho(x) \le \lambda_n$ (15)
for arbitrary, nonzero, $x$.
Proof. Vector $x$, orthogonal to $v_n$, has the unique expansion
$x = \alpha_1 v_1 + \alpha_2 v_2 + \cdots + \alpha_{n-1} v_{n-1}$ (16)
with which
$\rho(x) = \dfrac{\alpha_1^2 \lambda_1 + \alpha_2^2 \lambda_2 + \cdots + \alpha_{n-1}^2 \lambda_{n-1}}{\alpha_1^2 + \alpha_2^2 + \cdots + \alpha_{n-1}^2}$ (17)
We normalize x by
$\alpha_1^2 + \alpha_2^2 + \cdots + \alpha_{n-1}^2 = 1$ (18)
and use this equation to eliminate $\alpha_1^2$ from $\rho(x)$ to be left with
$\rho(x) = \lambda_1 + \alpha_2^2 (\lambda_2 - \lambda_1) + \cdots + \alpha_{n-1}^2 (\lambda_{n-1} - \lambda_1)$ (19)
By assumption $\lambda_j - \lambda_1 \ge 0$ if $j > 1$ and hence
$\rho(x) \ge \lambda_1$ (20)
or $\lambda_1 \le \rho(x)$, with equality holding if and only if
$\alpha_2^2 (\lambda_2 - \lambda_1) + \cdots + \alpha_{n-1}^2 (\lambda_{n-1} - \lambda_1) = 0$ (21)
In case of distinct eigenvalues, $\lambda_1 < \lambda_2$, equality holds if and only if $\alpha_2 = \alpha_3 = \cdots = \alpha_{n-1} = 0$, and $x = \alpha_1 v_1$. If eigenvalues repeat and $\lambda_1 = \lambda_2$, then $\alpha_2$ need not be zero, but equality still holds if and only if x is in the invariant subspace spanned by the eigenvectors of $\lambda_1$.
To prove the upper bound we use
$\alpha_{n-1}^2 = 1 - (\alpha_1^2 + \alpha_2^2 + \cdots + \alpha_{n-2}^2)$ (22)
to eliminate it from $\rho(x)$, so as to be left with
$\rho(x) = \lambda_{n-1} - \alpha_1^2 (\lambda_{n-1} - \lambda_1) - \cdots - \alpha_{n-2}^2 (\lambda_{n-1} - \lambda_{n-2})$ (23)
and $\rho(x) \le \lambda_{n-1}$, with equality holding if and only if $x = \alpha_{n-1} v_{n-1}$. The proof to the second part of the theorem is the same. □
5. Perron’s Theorem on Positive Matrices
Positive matrices are common in the finite-element method. The following is a symmetric version of Perron’s theorem on positive matrices (see [9] ).
Theorem (Perron). If A is a symmetric positive matrix, $a_{ij} > 0$, then the eigenvector corresponding to the largest (positive) eigenvalue of A is positive and unique.
Proof. If $v_n$ is a unit eigenvector corresponding to $\lambda_n$, and $x$ is such that $x_i = |v_{ni}|$, then
$\rho(x) \ge \rho(v_n) = \lambda_n$ (24)
and $\lambda_n$ is certainly positive. Moreover, since $a_{ij} > 0$ the components of $v_n$ cannot have different signs, for this would contradict the assumption that $v_n$ maximizes $\rho(x)$. Say then that $v_n \ge 0$. But none of the $v_n$ components can be zero since $\lambda_n v_n = A v_n$, and obviously $A v_n > 0$. Hence $v_n > 0$.
There can be no other positive vector orthogonal to $v_n$, and so the eigenvector, and also the largest eigenvalue $\lambda_n$, are unique. □
The following is another important and useful statement on positive matrices.
Theorem. Suppose that A has a positive inverse, $A^{-1} > 0$. Let $x$ be any vector satisfying $x > 0$, $Ax > 0$, $x \ne 0$. Then
$\min_k \dfrac{(Ax)_k}{x_k} \le \lambda_1 \le \max_k \dfrac{(Ax)_k}{x_k}$ (25)
Proof. Obviously $A^{-1} > 0$ has, by Perron’s theorem, a positive unit eigenvector $v_1 > 0$ corresponding to its largest eigenvalue $1/\lambda_1$, so that
$A v_1 = \lambda_1 v_1, \quad v_1 > 0$ (26)
and
$\lambda_1 = \dfrac{v_1^T A x}{v_1^T x} = \dfrac{\sum_k v_{1k} x_k \big((Ax)_k / x_k\big)}{\sum_k v_{1k} x_k} \ge \min_k \dfrac{(Ax)_k}{x_k}$ (27)
the quotient being a weighted average, with positive weights $v_{1k} x_k$, of the ratios $(Ax)_k / x_k$. To prove the other bound write $\mu = \max_k (Ax)_k / x_k$, observe that $Ax \le \mu x$, and have that
$\lambda_1 v_1^T x = v_1^T A x \le \mu\, v_1^T x$ (28)
Hence, if $v_1^T x > 0$, then
$\lambda_1 \le \mu = \max_k \dfrac{(Ax)_k}{x_k}$ (29)
This completes the proof. □
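Under this reading of the theorem, the two quotient bounds are a one-line computation. A minimal sketch (the test matrix and the trial vector are ours):

```python
import numpy as np

# If A^{-1} > 0 and x > 0 with Ax > 0, then
#   min_k (Ax)_k / x_k  <=  lambda_1  <=  max_k (Ax)_k / x_k.
A = np.array([[ 2.0, -1.0],
              [-1.0,  2.0]])          # A^{-1} = [[2,1],[1,2]]/3 > 0, lambda_1 = 1
x = np.array([1.0, 1.2])              # a positive trial vector with Ax > 0
q = (A @ x) / x                       # componentwise ratios (Ax)_k / x_k
lo, hi = q.min(), q.max()
lam1 = np.linalg.eigvalsh(A).min()
```

For this trial vector the enclosure is $0.8 \le \lambda_1 \le 7/6$, and indeed $\lambda_1 = 1$.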
6. Similarity Transformation, Similar Matrices
We will also make great use of the following fact.
Theorem. If $\lambda$ is an eigenvalue of matrix A, then $\lambda$ is also an eigenvalue of the similar matrix $B = P^{-1} A P$. Similar matrices A and $B$ have the same eigenvalues.
Proof. We have
$Ax = \lambda x \implies (P^{-1} A P)(P^{-1} x) = \lambda\,(P^{-1} x)$ (30)
and the result follows. □
In this note we are greatly interested in accurate eigenvectors.
To fix ideas consider the typical finite-element, or finite-difference, matrix
(31)
The matrix A is symmetric and positive definite, $\lambda_1 > 0$, with a corresponding eigenvector that is positive. The fact that $A^{-1}$ is completely positive is a manifestation of the physical fact that a point force applied to the system causes all free points of the system to move, and all in the same direction.
To save Gershgorin’s theorem from the trivial, and useless, prediction $\lambda_1 \ge 0$, we propose to similarly transform matrix A with a positive diagonal matrix D, $d_i > 0$, such that
$B = D^{-1} A D, \quad b_{ij} = a_{ij}\, d_j / d_i$ (32)
The fact that diagonal matrix D is positive, $d_i > 0$, guarantees that similar matrix $B = D^{-1} A D$ retains the entry sign pattern of original matrix A. Namely, $b_{ii} > 0$, $b_{ij} \le 0$, $i \ne j$.
For example, with
(33)
Now Gershgorin’s theorem correctly recognizes that matrix A is positive definite.
For a better bound, matrix D needs to raise the lowest entry and to lower the highest entry of vector $D^{-1} A D e$. Thus, the optimal D is that for which all entries of $D^{-1} A D e$ are equal, or
$D^{-1} A D e = \lambda_1 e$ (34)
Namely, the optimal D is such that $De$ is the eigenvector corresponding to the lowest eigenvalue of matrix A. With such a better D, obtained here by inverse iterations on, the here positive, $A^{-1}$, $x' = A^{-1} x$, we have
(35)
Taking the diagonal entries of matrix D as vector x we obtain from Rayleigh’s quotient
(36)
or
(37)
which are good enclosing bounds.
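The construction just described, approximating the first eigenvector by inverse iteration and then bounding by Gershgorin after the similarity $D^{-1}AD$, can be sketched in a few lines (the helper steps, iteration count, and small test matrix are ours):

```python
import numpy as np

# Sharpen the Gershgorin lower bound by the similarity D^{-1} A D, with
# D = diag(x) and x from a few inverse iterations (approximating the
# eigenvector of the lowest eigenvalue).
A = np.array([[ 2.0, -1.0,  0.0],
              [-1.0,  2.0, -1.0],
              [ 0.0, -1.0,  1.0]])

x = np.ones(3)
for _ in range(3):                     # inverse iterations x <- A^{-1} x
    x = np.linalg.solve(A, x)
    x /= np.linalg.norm(x)             # x stays positive since A^{-1} > 0

D = np.diag(x)
B = np.linalg.inv(D) @ A @ D           # similar to A, same eigenvalues
lower = (B @ np.ones(3)).min()         # Gershgorin bound; B_ij <= 0 off-diagonal
lam1 = np.linalg.eigvalsh(A).min()
```

The direct Gershgorin bound on this matrix is the trivial zero; after the similarity the bound becomes positive and close to $\lambda_1$.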
As for the highest eigenvalue of matrix A, $\lambda_n$, we have from Gershgorin and Rayleigh that
(38)
with $x$ obtained by the application of three steps of the power method to matrix A, started from a suitable initial vector, to have a reasonable approximation to the eigenvector corresponding to the highest eigenvalue of A.
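This recipe for the highest eigenvalue, a few power steps, then Rayleigh from inside and Gershgorin from outside, may be sketched as follows (the small test matrix is our illustration, not the paper's):

```python
import numpy as np

# Enclose the highest eigenvalue between the Rayleigh quotient of a
# power-method iterate (inner bound) and the Gershgorin row-sum bound (outer).
A = np.array([[ 2.0, -1.0,  0.0],
              [-1.0,  2.0, -1.0],
              [ 0.0, -1.0,  1.0]])

x = np.ones(3)
for _ in range(3):                     # three steps of the power method
    x = A @ x
    x /= np.linalg.norm(x)

lower = (x @ A @ x) / (x @ x)          # Rayleigh: lower bound on lambda_n
upper = np.abs(A).sum(axis=1).max()    # Gershgorin: upper bound on lambda_n
lam_n = np.linalg.eigvalsh(A).max()
```

Here the enclosure is roughly $3.12 \le \lambda_3 \le 4$, with $\lambda_3 \approx 3.247$.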
Next we turn our attention to more realistic finite element examples.
7. Taut String (Cable)
The basic issues of this paper are best illustrated on some concrete cases. First we consider the one-dimensional problem of the taut string discretized by n linear finite elements, of which the element stiffness matrix is
$k = \dfrac{1}{h} \begin{bmatrix} 1 & -1 \\ -1 & 1 \end{bmatrix}$ (39)
in which $h = 1/n$ is the size of the element, n being the number of mesh subdivisions. Little element matrix k is positive semi-definite, with a zero eigenvalue for eigenvector $e = [1, 1]^T$. Element stiffness matrix k is, moreover, such that $k_{ii} > 0$, $k_{ij} \le 0$, $i \ne j$, and of this sign pattern is also any global matrix A assembled from it.
We assemble the element k matrices into the global matrix A, here over a mesh of five elements, impose the essential boundary condition of a fixed end point by deleting the first row and first column of the assembled global matrix, and are left with
$A = \dfrac{1}{h} \begin{bmatrix} 2 & -1 & & & \\ -1 & 2 & -1 & & \\ & -1 & 2 & -1 & \\ & & -1 & 2 & -1 \\ & & & -1 & 1 \end{bmatrix}, \quad h = \dfrac{1}{5}$ (40)
Global stiffness matrix A is now positive definite, $\lambda_1 > 0$. Matrix A is, moreover, tridiagonal, with non-positive off-diagonal entries $a_{ij} \le 0$, $i \ne j$.
We know (see [10] [11]) that as the size of matrix A increases, its lowest eigenvalue decreases, actually $\lambda_1 = O(1/n)$.
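This reading of the assembled string matrix, and the announced decay of its lowest eigenvalue, can be checked numerically (the helper name `string_matrix` and the chosen mesh sizes are ours):

```python
import numpy as np

# Assemble the taut-string global stiffness matrix (five-element case is
# Equation (40)) for growing n and watch the lowest eigenvalue fall off.
def string_matrix(n):
    h = 1.0 / n
    A = (np.diag(2.0 * np.ones(n)) - np.diag(np.ones(n - 1), 1)
         - np.diag(np.ones(n - 1), -1)) / h
    A[-1, -1] = 1.0 / h                # free lower end: single element contributes
    return A

lam1 = [np.linalg.eigvalsh(string_matrix(n)).min() for n in (5, 10, 20, 40)]
```

For $n = 5$ the lowest eigenvalue is about $0.405$, and doubling $n$ roughly halves it, consistent with $O(1/n)$ decay.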
8. Taut String, Similarity Transformation Followed by a Gershgorin Bound
Being keenly interested in an actual, numerical lower bound on $\lambda_1$ we naively apply Gershgorin’s theorem directly to global stiffness matrix A of Equation (40) and obtain from it the disappointing
$0 \le \lambda \le \dfrac{4}{h}$ (41)
The bounding fails to confirm that matrix A is positive definite, only that it is positive semi-definite. We remark that the lower Gershgorin bound is obtained here as
$\min_k (Ae)_k = 0$ (42)
inasmuch as $a_{kk} - \sum_{j \ne k} |a_{kj}| = a_{kk} + \sum_{j \ne k} a_{kj} = (Ae)_k$ and $Ae = \dfrac{1}{h}[1, 0, 0, 0, 0]^T$.
To avert a trivial Gershgorin bound we propose to similarly transform matrix A into P−1AP with a positive diagonal matrix P, to facilitate the inversion and subsequent matrix-matrix multiplications, while still maintaining the sign pattern $(P^{-1}AP)_{ij} \le 0$, $i \ne j$.
For example, we take a suitable positive diagonal matrix P and have
(43)
and
(44)
triumphantly ascertaining that global stiffness matrix A is indeed positive definite.
We have already remarked that we desire matrix P to be such that all entries of $P^{-1} A P e$ are nearly equal, or that
$P^{-1} A P e = \lambda_1 e$ (45)
or that $Pe$ is the (positive) eigenvector corresponding to $\lambda_1$.
How to efficiently get good approximations to the first eigenvector of global stiffness matrix A is the subject of the rest of this note.
9. Taut String, Linearization of the Characteristic Polynomial
We know that as the size of matrix A increases, its lowest eigenvalue decreases, tending to zero, actually as $O(1/n)$ (see again [10] [11]). We start the linearization by writing
(46)
and have, by a successive linearization for $\lambda$ of each equation, that, approximately
(47)
Then, linearly
(48)
and it follows that the eigenvector corresponding to the lowest eigenvalue of matrix A is approximately
(49)
and, correspondingly
(50)
We put the entries of this approximate eigenvector on the diagonal of matrix P and have from
(51)
the reasonable bounds
(52)
10. Taut String, Quadratization of the Characteristic Polynomial
Keeping quadratic terms of $\lambda$ in the characteristic equation of matrix A, we have for vector x of Equation (46) the five quadratically curtailed equations of $(A - \lambda I)x = 0$:
(53)
and
(54)
resulting in
(55)
and
(56)
The success of the linearization, or quadratization, of the characteristic equation hinges on the theoretically assured fact (see [10] [11]) that $\lambda_1$ is small, and that as the size of the finite-element global stiffness matrix A increases, $\lambda_1$ further decreases.
11. Taut String, Linearization Following a Shift
To work with a matrix of a lesser minimal eigenvalue we turn to the shifted matrix
(57)
from which we get the approximate linear characteristic polynomial
(58)
then
(59)
and
(60)
with
(61)
12. Taut String, Shifted Power Method
We continue with our quest for good approximations to the fundamental eigenvector of the finite-element global stiffness matrix A. We propose to first iterate for the highest eigenvalue and the corresponding eigenvector. We choose to use the power method since it requires but one matrix-vector multiplication per step. We start with
(62)
Then we undertake the shift
(63)
and apply to it the power method starting now with
(64)
to successively have
(65)
as upper and lower bounds on $\lambda_1$.
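A minimal sketch of the shifted power method for the lowest eigenvalue, with the shift taken here as the Gershgorin upper bound (the small matrix, iteration count, and names are our illustration):

```python
import numpy as np

# With sigma at least lambda_n (e.g. the Gershgorin upper bound), the matrix
# B = sigma*I - A has largest eigenvalue sigma - lambda_1, so power iteration
# on B approximates v_1; the Rayleigh quotient of the iterate then gives an
# upper bound on lambda_1.
A = np.array([[ 2.0, -1.0,  0.0],
              [-1.0,  2.0, -1.0],
              [ 0.0, -1.0,  1.0]])

d = np.diag(A)
sigma = (d + np.abs(A).sum(axis=1) - np.abs(d)).max()   # Gershgorin upper bound
B = sigma * np.eye(3) - A

x = np.ones(3)
for _ in range(50):                    # power iterations on B
    x = B @ x
    x /= np.linalg.norm(x)

rayleigh = (x @ A @ x) / (x @ x)       # upper bound on lambda_1
lam1 = np.linalg.eigvalsh(A).min()
```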
13. Hanging String
The freely down-hanging string (see [12]) is of zero tension at the free lower end, with the tension growing linearly towards the fixed upper hanging point, which carries the entire weight of the string.
The element stiffness matrix of the hanging string, for an element of size h, is
(66)
assembled into the global stiffness matrix
(67)
including the essential boundary condition of zero movement of the hanging point. Global stiffness matrix A factors as
(68)
showing matrix A to be symmetric and positive definite. In fact, $\lambda_1 > 0$ and $a_{ij} \le 0$, $i \ne j$.
We are seeking good lower and upper bounds on $\lambda_1$, the lowest eigenvalue of global stiffness matrix A.
14. Hanging String, Linearization of the Characteristic Polynomial
Here we have
(69)
with the linearized characteristic polynomial for
(70)
then
(71)
from which we get
(72)
and then the reasonable bounds
(73)
15. Hanging String, Second Sweep for a Linearized Characteristic Polynomial
Now we undertake the shift
(74)
intended to further reduce the lowest eigenvalue, and have for
(75)
a linearization resulting in
(76)
and then the linearized characteristic polynomial
(77)
From this we have
(78)
by which we compute for original matrix A
(79)
and the corresponding Rayleigh quotient, translating into the good bounds
(80)
16. Hanging String, Quadratization of the Characteristic Polynomial
Here, for vector x of Equation (75), and the quadratization of the characteristic equation, we have
(81)
and, from the last equation of the system, the quadratic approximate characteristic polynomial
(82)
and have from this polynomial that
(83)
17. Hanging String, Power Method
The global stiffness matrix we take is still the modest
(84)
with corresponding eigenvectors
(85)
We start the power method with
(86)
to successively produce
(87)
and then the corresponding Rayleigh quotients
(88)
It follows then, from Rayleigh and Gershgorin, that
(89)
18. Hanging String, Shifted Power Method
Next we turn our attention to computing an ever better approximation to $\lambda_1$, the lowest eigenvalue of global stiffness matrix A of the hanging string. For this we take the shifted matrix
(90)
The power method applied to matrix B converges now to $v_1$, the eigenvector corresponding to the lowest eigenvalue of original matrix A.
We start with
(91)
and repeatedly compute
(92)
and have that
(93)
which we gladly accept.
19. The Four-Nodes Rectangular Membrane Element
Next we move on to two-dimensional finite-element membrane problems, or the boundary value problem
(94)
where n is an outward normal to the boundary S encompassing domain D.
The element stiffness and mass matrices for the rectangular four-node membrane finite element of sides a and b are
(95)
and
(96)
respectively. If $a = b$, then
$k = \dfrac{1}{6} \begin{bmatrix} 4 & -1 & -2 & -1 \\ -1 & 4 & -1 & -2 \\ -2 & -1 & 4 & -1 \\ -1 & -2 & -1 & 4 \end{bmatrix}$ (97)
with eigenvalues
$\lambda_1 = 0, \quad \lambda_2 = \dfrac{2}{3}, \quad \lambda_3 = \lambda_4 = 1$ (98)
Element stiffness matrix k of Equation (97) is of a positive diagonal and non-positive off-diagonal entries, and hence such is also any global matrix assembled from it.
Global stiffness matrix A for a modest mesh is
(99)
with, as expected, $A = A^T$ and $a_{ij} \le 0$, $i \ne j$. The eigenvalues and eigenvectors of global stiffness matrix A are
(100)
20. Rectangular Membrane, Power Method
We prefer the power method as it requires only one matrix-vector multiplication per step. We start the method with the initial
(101)
and keep multiplying, or raising to ever higher power, to have
(102)
such that, by Rayleigh and Gershgorin
(103)
21. Rectangular Membrane, Shifted Power Method
Now we go after the, more interesting, lowest eigenvalue of the global stiffness matrix A and replace it by the shifted matrix
(104)
We start with an initial vector and continue to have
(105)
and
(106)
which we appraise as quite good and tight numerical bounds.
22. Triangular Membrane, Triangular Finite Elements
The linear, first-order, membrane finite element stiffness matrix k for a triangle of sides a, b, c and area T is
(107)
in terms of the constant matrices
(108)
Otherwise, the element stiffness matrix is written, by the aid of the law of cosines, as
(109)
where $\theta_c$ is the angle opposite side c. If no angle of the triangle is obtuse, $\theta \le \pi/2$, then $k_{ij} \le 0$, $i \ne j$, and of this sign pattern is also any global stiffness matrix assembled from it.
We consider now an equilateral triangular membrane discretized by four equilateral triangular finite elements per side. The membrane is held fixed on one side, to produce the 10 × 10 symmetric and positive definite global stiffness matrix
(110)
with, as expected, $A = A^T$ and $a_{ij} \le 0$, $i \ne j$. The ten computed eigenvalues of this matrix are:
(111)
23. Triangular Membrane, Power Method
We start our iterative quest for the highest eigenvalue of matrix A with an initial vector, and then continue with the power method to have
(112)
and thus, by Rayleigh and Gershgorin
(113)
24. Triangular Membrane, Shifted Power Method
Next we turn our attention to the lowest eigenvalue $\lambda_1$ of global stiffness matrix A of the triangular membrane. For this we apply the power method again, but this time to the shifted matrix
(114)
in which the shift is the Rayleigh quotient for the approximate eigenvector obtained by the power method in the previous section.
We start with an arbitrary initial vector and continue with the shifted power method to have
(115)
and
(116)
25. Conclusions
In this note we have considered the common case of positive definite and symmetric global finite-element stiffness matrices having non-positive off-diagonal entries. The finite-element global stiffness matrix readily becomes very large, with an ever smaller least eigenvalue.
We have shown here how optimal similarity transformations of this matrix, requiring only matrix-vector multiplications, lead to eminently practical and reliable numerical iterative algorithms for tight, realistic Rayleigh and Gershgorin bounds on the least eigenvalue of this large finite-element global stiffness matrix.