Matrix Functions of Exponential Order

Both theoretical and practical investigations of dynamical systems often require extending the definitions of functions defined on the real axis to the set of matrices. To this end one mainly uses three methods, based on 1) the Jordan canonical form, 2) polynomial interpolation, and 3) the Cauchy integral formula. All these methods give the same result, say g(A), when they are applicable to a given function g(t) and matrix A. Unfortunately, each of them puts certain restrictions on g(t) and/or A, and requires tedious computations to find explicit exact expressions when the eigen-values of A are not simple. The aim of the present paper is to give an alternative method which is more logical, simple and applicable to all functions (continuous or discontinuous) of exponential order. It is based on the two-sided Laplace transform and the concept of analytical continuation, and gives the result as a linear combination of certain n matrices determined only through A, where n stands for the order of A. The coefficients appearing in the combination in question are given through the analytical continuation of g(t) (and of its derivatives if A has multiple eigen-values) to the set of eigen-values of A (numerical computation of inverse transforms is not needed). Some illustrative examples show the effectiveness of the method.


Introduction
For many theoretical and practical applications one needs to extend the definitions of functions defined on the real axis to the set of matrices. The history of the subject goes back to the second half of the nineteenth century, when Cayley, who is the main instigator of the modern notation and terminology, introduced the concept of the square root of a matrix A [1]. Since then a huge amount of work has been devoted to the definitions, numerical computations and practical applications of matrix functions. A rather detailed history (including a large reference list) and important results (especially those concerning numerical computation techniques) are extensively discussed in the book by Higham [2]. So, we refrain here from reviewing the historical development and giving a large reference list.
To extend the definition of a scalar function g(t), defined for real t, to the set of matrices, one starts from an explicit expression of g(t), which can be continued analytically into the complex plane ℂ, and replaces there t by A. If the result is meaningful as a matrix, then it is defined to be g(A). Before going into further detail, it is worthwhile to clarify the meaning of the word "defined" appearing in the expression "defined for real t". It is especially important when g(t) consists of a multi-valued inverse function. To this end consider, for example, the square-root function g(t) = t^{1/2}. Its definition requires, first of all, a cut connecting the branch point t = 0 to the other branch point t = ∞ in the complex plane ℂ.
Then, by choosing one of its possible values at a given point, for example g(1), one defines it completely. The result consists of a well-defined branch of the square-root function. If one replaces t in this expression by A, then one gets a (unique) matrix to be denoted by A^{1/2}. This matrix satisfies the equation X^2 = A, which may have many solutions, also denoted by A^{1/2}. The above-mentioned function g(A), which consists merely of the extension of the above-mentioned well-defined branch of the square-root function, does not permit us to find all these solutions. For example, the equation X^2 = I, where I denotes the unit 2 × 2 matrix, has infinitely many solutions given by
$$X = \begin{pmatrix} \cos\theta & \sin\theta \\ \sin\theta & -\cos\theta \end{pmatrix},$$
where θ stands for any complex angle. All these matrices are defined to be square roots of I. But the above-mentioned matrix g(A) gives only one of them, namely I or (−I).
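A quick numerical check of this remark: every matrix of the family above squares to the identity, while the principal matrix square root returns just one solution. This is only an illustrative sketch using NumPy/SciPy.

```python
import numpy as np
from scipy.linalg import sqrtm

# Each member of the one-parameter family [[cos t, sin t], [sin t, -cos t]]
# solves X^2 = I, so the equation has infinitely many solutions.
for theta in (0.3, 1.0, 2.5):
    c, s = np.cos(theta), np.sin(theta)
    X = np.array([[c, s], [s, -c]])
    assert np.allclose(X @ X, np.eye(2))

# The principal branch g(A) = A^{1/2} picks out only one square root:
print(sqrtm(np.eye(2)))   # the identity itself
```

The design point is that a branch choice for the scalar function fixes a unique matrix, whereas the algebraic equation X^2 = A has many more solutions.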
The known classical methods used in this context are grouped as follows (see [2], Sections 1, 2): 1) methods based on the Jordan canonical form; 2) methods based on the Hermite interpolation formula; 3) methods based on the Cauchy integral formula. All these methods are applicable when the function g(t), defined on the real axis, can be analytically continued into a domain of the complex plane which involves the spectrum of the matrix A (see def. 1.2 and def. 1.4 in [2]). Consider, for example, the Heaviside unit step function H(t) defined on the real axis by
$$H(t) = \begin{cases} 1, & t > 0, \\ 0, & t < 0. \end{cases}$$
It is obvious that the analytical continuation of H(t) into the complex z-plane, if it exists, has the point z = 0 as a singular point. To reveal H(z), let us try to find its Taylor expansion about any point a > 0. This expansion is valid in the circle with center at the point z = a and radius equal to r = a. Since all the coefficients except the first one are equal to naught, one gets H(z) ≡ 1 at all points inside the circle in question. By letting a → ∞ one concludes that H(z) is regular in the right half-plane ℜz > 0. If the above-mentioned Taylor expansion were made about a point a < 0, then one would get H(z) ≡ 0 for all z with ℜz < 0. This shows that H(z) is a sectionally regular (holomorphic) function (see [3], Section 2.15):
$$H(z) = \begin{cases} 1, & \Re z > 0, \\ 0, & \Re z < 0. \end{cases} \tag{1a}$$
On the basis of the Plemelj-Sokhotskii formulas (see [3], Section 2.17), for the points on the imaginary axis one writes
$$H(iy) = \tfrac{1}{2}. \tag{1b}$$
Notice that (1a) and (1b) can also be obtained by computing the improper integral
$$H(t) = \frac{1}{2} + \frac{1}{2\pi i}\,\overline{\int_{-\infty}^{\infty}}\,\frac{e^{ist}}{s}\,\mathrm{d}s,$$
where the bar on the integral sign stands for the Cauchy principal value.
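The principal-value representation of H(t) can be checked numerically. The sketch below assumes the truncated form H(t) ≈ 1/2 + (1/π)∫₀^Λ sin(tω)/ω dω with a large cutoff Λ, evaluated through the sine integral Si of SciPy; the cutoff value is an illustrative choice.

```python
import numpy as np
from scipy.special import sici

def H(t, cutoff=1e6):
    """Truncated principal-value representation of the Heaviside function:
    H(t) ~ 1/2 + (1/pi) * Si(cutoff * t), since Si(x) -> pi/2 as x -> +inf."""
    si, _ = sici(cutoff * t)      # Si(x) = integral_0^x sin(u)/u du (odd in x)
    return 0.5 + si / np.pi

print(H(2.0), H(-2.0), H(0.0))    # approximately 1, 0, and exactly 0.5
```

The half-value at t = 0 is exactly the Plemelj-Sokhotskii average of the two boundary values in (1a).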
From (1b) one concludes that the seemingly general and elegant method 3), which is based on the Cauchy integral
$$g(A) = \frac{1}{2\pi i}\oint_C g(z)\,(zI - A)^{-1}\,\mathrm{d}z,$$
where C stands for a closed contour such that the domain bounded by C involves all the eigen-values of A and the function g(z) is regular there, cannot be applied to find H(A) (and other functions expressible through H(t)) when A has eigen-values having both positive and negative real parts.
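When g(z) is regular inside the contour, method 3) does work, and it is instructive to see it numerically. The sketch below evaluates the Cauchy integral by the trapezoid rule on a circle enclosing the spectrum; the matrix A and the contour parameters are illustrative assumptions, and the result is compared with SciPy's matrix exponential.

```python
import numpy as np
from scipy.linalg import expm

def cauchy_g(g, A, center=0.0, radius=5.0, nodes=400):
    """g(A) = (1/(2*pi*i)) * contour integral of g(z) (zI - A)^{-1} dz,
    computed with the trapezoid rule on a circle around `center`."""
    n = A.shape[0]
    acc = np.zeros((n, n), dtype=complex)
    for k in range(nodes):
        w = np.exp(2j * np.pi * k / nodes)
        z = center + radius * w
        dz = 2j * np.pi * radius * w / nodes        # z'(theta) * d(theta)
        acc += g(z) * np.linalg.solve(z * np.eye(n) - A, np.eye(n)) * dz
    return acc / (2j * np.pi)

A = np.array([[1.0, 2.0], [0.0, 3.0]])              # eigen-values 1 and 3
E = cauchy_g(np.exp, A)
assert np.allclose(E, expm(A))                      # matches exp(A)
```

For H(A) with eigen-values on both sides of the imaginary axis no such contour exists, which is exactly the obstruction described above.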
As to methods 1) and 2), they require, in general, tedious and cumbersome computations if A has multiple eigen-values.
In the present note we will consider the case when the function g(t), defined on the real axis, is of exponential order at both t = +∞ and t = −∞, and give a new method which seems to be more logical and effective, especially when the matrix A has multiple eigen-values. It gives the result as a linear combination of n matrices determined only by the matrix A. To this end we consider the Laplace transforms of g(t) on the right and left halves of the real axis, namely
$$F_{+}(s) = \int_{0}^{\infty} g(t)\,e^{-st}\,\mathrm{d}t, \qquad F_{-}(s) = \int_{-\infty}^{0} g(t)\,e^{-st}\,\mathrm{d}t, \tag{3a,b}$$
and write [4]
$$g(t) = \frac{1}{2\pi i}\int_{L_{+}} F_{+}(s)\,e^{st}\,\mathrm{d}s + \frac{1}{2\pi i}\int_{L_{-}} F_{-}(s)\,e^{st}\,\mathrm{d}s. \tag{3c}$$
If the orders of the function g(t) for t → ∞ and t → (−∞) are c₊ and c₋, respectively, then the function F₊(s) is regular in the half-plane ℜs > c₊, and the integration path L₊ appearing in (3c) consists of any vertical straight line located in this half-plane (see Figure 1). Similarly, F₋(s) is regular in the half-plane ℜs < c₋, and the integration path L₋ is any vertical straight line in this half-plane (if c₊ < c₋, then one can assume L₊ = L₋ ≡ L). Furthermore, if g(t) as well as its derivatives up to the order (m−1) are all naught at t = 0, i.e. when
$$g(0) = g'(0) = \cdots = g^{(m-1)}(0) = 0, \tag{4a}$$
then one has
$$\int_{0}^{\infty} g^{(m)}(t)\,e^{-st}\,\mathrm{d}t = s^{m}F_{+}(s), \qquad \int_{-\infty}^{0} g^{(m)}(t)\,e^{-st}\,\mathrm{d}t = s^{m}F_{-}(s), \tag{4b,c}$$
and
$$g^{(m)}(t) = \frac{1}{2\pi i}\int_{L_{+}} s^{m}F_{+}(s)\,e^{st}\,\mathrm{d}s + \frac{1}{2\pi i}\int_{L_{-}} s^{m}F_{-}(s)\,e^{st}\,\mathrm{d}s. \tag{4d}$$
It is worthwhile to remark here that the formula (4d) permits us to compute g^{(m)}(t) only at points on the real axis, although g(t) and its derivatives are defined in (or can be continued analytically into) the complex t-plane. Therefore, when the point t is replaced by a complex τ ∈ ℂ, in what follows we will replace the left-hand side of (4d) by the analytical continuation of g^{(m)} to the point τ and write
$$g^{(m)}(\tau) = \frac{1}{2\pi i}\int_{L_{+}} s^{m}F_{+}(s)\,e^{s\tau}\,\mathrm{d}s + \frac{1}{2\pi i}\int_{L_{-}} s^{m}F_{-}(s)\,e^{s\tau}\,\mathrm{d}s. \tag{4e}$$
The formulas (3c), (4d) and (4e) will be the basis of our approach.
Let A be a square matrix of dimensions n × n. We will define g(A) by replacing t in (3c) by A, namely:
$$g(A) = \frac{1}{2\pi i}\int_{L_{+}} F_{+}(s)\,e^{As}\,\mathrm{d}s + \frac{1}{2\pi i}\int_{L_{-}} F_{-}(s)\,e^{As}\,\mathrm{d}s. \tag{5}$$
Thus the computation of g(A) becomes reduced to the computation of exp{As}. As we will see later on, the latter consists of a linear combination of certain constant matrices Γⱼ (j = 1, …, n; n = order of A). Hence g(A) will also be a linear combination of these Γⱼ's for every g(t). It is important to notice that, to compute the coefficients in the combinations in question, we will never need to compute the transform functions F±(s); it is sufficient that g(t) is known at the eigen-values of A (see the examples to be given in Section 4). These points constitute the essential properties of the definition (5): 1) it unifies the definition of g(A) for all functions g(t) of exponential order; 2) it gives an expression of g(A) in terms of certain matrices which take place in the expression of exp(At) and are determined only by A; 3) it reduces the computation of g(A) to the computation of exp(At) together with some scalar constants to be determined in terms of g(t) (and its derivatives when A has multiple eigen-values) at the eigen-values of A.
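The structure just described can be previewed numerically for a matrix with distinct eigen-values: the matrices Γⱼ can be taken as Lagrange-type products (an assumption consistent with the classical Sylvester formula), and the same Γⱼ's then serve every function g. The matrix A below is an illustrative assumption, not one from the paper.

```python
import numpy as np
from scipy.linalg import expm

A = np.array([[2.0, 1.0], [0.0, -1.0]])          # distinct eigen-values 2 and -1
lam = np.linalg.eigvals(A)
n = A.shape[0]

def gamma(j):
    """Gamma_j = prod_{k != j} (A - lam_k I) / (lam_j - lam_k)."""
    G = np.eye(n)
    for k in range(n):
        if k != j:
            G = G @ (A - lam[k] * np.eye(n)) / (lam[j] - lam[k])
    return G

def g_of_A(g):
    """g(A) as the same linear combination of the Gamma_j for every g."""
    return sum(g(lam[j]) * gamma(j) for j in range(n))

assert np.allclose(g_of_A(np.exp), expm(A))                       # exp(A)
assert np.allclose(g_of_A(lambda z: z**3), np.linalg.matrix_power(A, 3))
```

Note that only the values g(λⱼ) enter; the transforms F±(s) never have to be computed.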
The details are given in the theorems that follow.

Basic Results
In what follows we will denote a square matrix A of entries a_jk by A = [a_jk]. Here the first and second indices show, respectively, the row and column where a_jk is placed. The transpose of A will be indicated, as usual, by a super index T such as A^T. The characteristic polynomial of A will be denoted by f(s) = det(sI − A).

Theorem-1a. If the eigen-values λ₁, …, λₙ of A are all distinct, then one has
$$\exp\{At\} = \sum_{j=1}^{n} e^{\lambda_j t}\,\Gamma_j \tag{6a}$$
with the matrices
$$\Gamma_j = \prod_{k\neq j}\frac{A-\lambda_k I}{\lambda_j-\lambda_k}. \tag{6b}$$

Theorem-1b. If A has the distinct eigen-values λ₁, …, λ_p with multiplicities m₁, …, m_p, respectively, then one has
$$\exp\{At\} = \sum_{j=1}^{p}\sum_{k=0}^{m_j-1} t^{k}\,e^{\lambda_j t}\,\Gamma_j^{(k)}. \tag{6c}$$

Theorem-2. Let A = [a_jk] be an n×n matrix while g(z), defined in the complex z-plane, is regular at all the eigen-values of A and its restriction to the real axis is of exponential order at both t = +∞ and t = −∞. If all the eigen-values of A, say λ₁, …, λₙ, are distinct, then one has
$$g(A) = \sum_{j=1}^{n} g(\lambda_j)\,\Gamma_j. \tag{7}$$
Here Γⱼ stands for the matrix taking place in the expression (6a) of exp{At}.
Theorem-3. Let A = [a_jk] be an n×n matrix having the distinct eigen-values λ₁, …, λ_p with multiplicities m₁, …, m_p, respectively, while g(z), defined in the complex z-plane, is regular at all the eigen-values of A and its restriction to the real axis is of exponential order at both t = +∞ and t = −∞. Let the non-zero matrices taking place in the expression (6c) of exp{At} be Γⱼ^{(k)}, and let K be the largest sub-index k for which Γⱼ^{(k)} ≠ 0. Then one has
$$g(A) = A^{-N}\sum_{j}\sum_{k} G^{(k)}(\lambda_j)\,\Gamma_j^{(k)}, \qquad G(t) \equiv t^{N} g(t), \tag{8}$$
where N stands for an integer such that G(t) ≡ t^N g(t) and its derivatives up to the order (K−1) are all naught at t = 0.
Let the characteristic polynomial of A be f(s) = det(sI − A), and recall the scalar formula
$$e^{at} = \frac{1}{2\pi i}\int_{L}\frac{e^{st}}{s-a}\,\mathrm{d}s. \tag{9}$$
Thus (9) yields exp{At} by replacing there the scalar constant a by the square matrix A, namely:
$$X(t) \equiv \exp\{At\} = \frac{1}{2\pi i}\int_{L} e^{st}\,(sI-A)^{-1}\,\mathrm{d}s. \tag{12}$$
Here the integration line L is any vertical straight line such that all the eigen-values of A are located on the left side of L.
It is obvious that X(t) defined as above is the unique solution to the differential equation dX/dt = AX under the initial condition X(0) = I. Hence, by applying the Laplace transform to this equation one gets X̂(s) = (sI − A)^{-1}. If the eigen-values are all simple, then the integral in (12) is computed by residues and gives (6a). When some of the eigen-values are multiple, as stated in theorem-1b, the residue method gives
$$\exp\{At\} = \sum_{j=1}^{p}\frac{1}{(m_j-1)!}\,\lim_{s\to\lambda_j}\frac{\mathrm{d}^{m_j-1}}{\mathrm{d}s^{m_j-1}}\Big[(s-\lambda_j)^{m_j}\,e^{st}\,(sI-A)^{-1}\Big]. \tag{13}$$
It is obvious that the derivatives in (13) yield polynomials in t of degree (m_j − 1). Hence the final expression of exp{At} can be arranged as what is given in (6c).

Proof of theorem-2. When the eigen-values of A are all simple, in (5) one replaces exp{As} by its expression given in (6a) and obtains
$$g(A) = \sum_{j=1}^{n}\Gamma_j\,\frac{1}{2\pi i}\left[\int_{L_{+}} F_{+}(s)\,e^{\lambda_j s}\,\mathrm{d}s + \int_{L_{-}} F_{-}(s)\,e^{\lambda_j s}\,\mathrm{d}s\right]. \tag{14}$$
If all the eigen-values are real, then (3c) reduces (14) to (7). When some or all of the eigen-values are complex, we replace (3c) by (4e) with m = 0 and arrive again at (7).
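As a quick numerical sanity check on the construction underlying the proofs, one can confirm that exp{At} is indeed the solution of dX/dt = AX with X(0) = I by integrating the differential equation directly. The matrix A and the time t₁ below are illustrative assumptions.

```python
import numpy as np
from scipy.integrate import solve_ivp
from scipy.linalg import expm

A = np.array([[0.0, 1.0], [-2.0, -3.0]])   # eigen-values -1 and -2
t1 = 1.5

# Integrate dX/dt = A X, X(0) = I, flattening the 2x2 matrix into a vector.
sol = solve_ivp(lambda t, x: (A @ x.reshape(2, 2)).ravel(),
                (0.0, t1), np.eye(2).ravel(), rtol=1e-10, atol=1e-12)
X_ode = sol.y[:, -1].reshape(2, 2)

assert np.allclose(X_ode, expm(A * t1), atol=1e-6)   # agrees with exp(A t)
```

This is the uniqueness argument of the proof in numerical form: both objects solve the same initial-value problem, so they coincide.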
Proof of theorem-3. Now consider the case when A has multiple non-zero eigen-values and define G(t) ≡ t^N g(t), where the integer N will be determined appropriately later on. If in (5) one replaces g(t) by G(t) and exp{As} by (6c), then we get
$$A^{N} g(A) = G(A) = \sum_{j}\sum_{k}\Gamma_j^{(k)}\,I_{jk} \tag{15a}$$
with
$$I_{jk} = \frac{1}{2\pi i}\left[\int_{L_{+}} s^{k}G_{+}(s)\,e^{\lambda_j s}\,\mathrm{d}s + \int_{L_{-}} s^{k}G_{-}(s)\,e^{\lambda_j s}\,\mathrm{d}s\right], \tag{15b}$$
where G±(s) stand for the right and left Laplace transforms of G(t). Remark that some of the matrices Γⱼ^{(k)} may be equal to naught. Let the largest sub-index k for which Γⱼ^{(k)} ≠ 0 be K. Then, by considering the requirements in (4a), we will choose the integer N such that G(t) and its derivatives up to the order (K−1) are all naught for t = 0. In this case all the terms existing in (15b) are computed through (4d) or (4e) and give (8).

A Corollary (Cayley-Hamilton Theorem)
Let g(t) be the characteristic polynomial of A (i.e. g(t) ≡ f(t)). In this case all the terms taking place in (7) or (8) are equal to zero. When the eigen-values are all simple, from (7) one gets directly f(A) = 0, which is valid for both regular and singular matrices. In the case of multiple eigen-values, (8) gives f(A) = 0 if A is not singular. We remark that the Cayley-Hamilton theorem is correct for all matrices. We will use this theorem to compute the factor A^{−N} taking place in the formula (8) (see ex.-2).
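The trick of obtaining A^{−N} from the Cayley-Hamilton theorem can be sketched as follows: since f(A) = 0, a nonsingular A with f(s) = s^n + c_{n−1}s^{n−1} + … + c₁s + c₀ satisfies A^{−1} = −(A^{n−1} + c_{n−1}A^{n−2} + … + c₁I)/c₀, so inverses come from matrix powers alone. The matrix below is an illustrative assumption.

```python
import numpy as np

A = np.array([[4.0, 1.0], [2.0, 3.0]])   # nonsingular, f(s) = s^2 - 7s + 10
n = A.shape[0]
p = np.poly(A)                           # [1, c_{n-1}, ..., c_0], highest power first

# A^{-1} = -(1/c_0) * sum_{k=1}^{n} a_k A^{k-1}, where a_k = p[n-k] and a_n = 1.
inv = np.zeros_like(A)
for k in range(1, n + 1):
    inv += p[n - k] * np.linalg.matrix_power(A, k - 1)
inv /= -p[n]

assert np.allclose(inv @ A, np.eye(n))   # Cayley-Hamilton inverse is exact
```

Repeating the multiplication by `inv` gives A^{−N} for any N, which is all that formula (8) requires.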

Some Illustrative Examples
In order to show the application and effectiveness of the method, in what follows we will consider some simple examples.
Ex.-1. As a first example consider the case where A is as follows: One can easily check that A has a triple eigen-value λ = 2. Therefore the theorems 1b and 3 are applicable directly for all functions of exponential order. The quantities mentioned in those theorems are p = 1 and m = 3, which gives (see (6c) and (6d)) the matrices Γ^{(0)}, Γ^{(1)} and Γ^{(2)}. Since Γ^{(2)} ≠ 0, one has K = 2, which shows that the formula (8) is applicable with 0 ≤ N ≤ 2. For example, in order to find the expressions of A^2 and sin A one can choose N = 0, while √A, sin √A and arcsin A, etc., need N = 1. To compute cos A, log A, cos √A, sign A and arccos A one has to choose N = 2.
To check the formulas, we would like first to compute A^n (n ≥ 2 being an integer), which gives, for n = 2, an explicit expression for A^2. Notice that by a direct multiplication of A with itself one gets the same result.
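For a matrix with a single triple eigen-value, the combination in theorem-3 collapses to a Taylor-like sum g(A) = g(λ)I + g′(λ)(A − λI) + (g″(λ)/2)(A − λI)², which can be checked numerically. The Jordan-type matrix A below is an illustrative assumption (it is not the A of ex.-1), with triple eigen-value λ = 2.

```python
import numpy as np
from scipy.linalg import expm

lam = 2.0
A = np.array([[2.0, 1.0, 0.0],
              [0.0, 2.0, 1.0],
              [0.0, 0.0, 2.0]])
N = A - lam * np.eye(3)                  # nilpotent part, N^3 = 0

def g_of_A(g0, g1, g2):
    """g(A) from the values g(lam), g'(lam), g''(lam) alone."""
    return g0 * np.eye(3) + g1 * N + 0.5 * g2 * (N @ N)

# g(t) = t^2:  g(2) = 4, g'(2) = 4, g''(2) = 2  reproduces A @ A.
assert np.allclose(g_of_A(lam**2, 2 * lam, 2.0), A @ A)
# g(t) = e^t:  all derivatives equal e^2  reproduces exp(A).
assert np.allclose(g_of_A(np.exp(lam), np.exp(lam), np.exp(lam)), expm(A))
```

Only the values of g and its derivatives at the eigen-value enter, exactly as the coefficients of formula (8) require.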
Similarly, one gets also the expressions of the other functions listed above. Remark that for the different branches of √t and ∛t see Section 5 and theorem-4. Finally, let us consider the branches of the inverse trigonometric functions g(t) = arcsin t and h(t) = arccos t, which map the t-plane cut as shown in Figure 2 into the regions in the g- and h-planes shown in Figures 3 and 4 (the so-called principal branches of these functions!). For the first function we have to choose N = 1, while the second one needs N = 2. The matrices A^{−1} and A^{−2} taking place in the formula (8) can be computed rather easily by using the Cayley-Hamilton theorem.
Ex.-2. Now consider the case where A is as follows: In this case one has simple eigen-values, which shows again that the theorems 1b and 3 are applicable. Since K = 0, (8) is applicable with N = 0 and gives, for example, sign A and H(A). By direct multiplication of these matrices by themselves one gets sign(A)·sign(A) = I and H(A)·H(A) = H(A), which are the basic properties of the original functions sign t and H(t).
Similarly, one gets also √A and arccos A. Here arcsin t stands for the function defined in ex.-1. On the basis of the lemma, we can change the cut line appropriately such that the values of t^{1/m}, defined on a Riemann surface, take previously chosen values at given p points. Then there is a Riemann surface such that on one of its sheets g(t) takes the previously chosen values at the points in question.
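The defining properties sign(A)² = I and H(A)² = H(A) quoted above can be verified numerically for a diagonalizable matrix whose eigen-values have both signs; the matrix A below is an illustrative assumption.

```python
import numpy as np

A = np.array([[1.0, 3.0], [2.0, -4.0]])   # eigen-values 2 and -5
lam, V = np.linalg.eig(A)

def apply(g):
    """Apply the scalar function g to the eigen-values of a diagonalizable A."""
    return (V @ np.diag(g(lam)) @ np.linalg.inv(V)).real

S = apply(np.sign)                         # sign(A)
H = apply(lambda x: (np.sign(x) + 1) / 2)  # H(A)

assert np.allclose(S @ S, np.eye(2))       # (sign t)^2 = 1  ->  sign(A)^2 = I
assert np.allclose(H @ H, H)               # H(t)^2 = H(t)   ->  H(A) is a projector
```

These identities mirror (sign t)² = 1 and H(t)² = H(t) for t ≠ 0, which is exactly what the text calls the basic properties of the original functions.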


Before going into the details of the proofs of the above-mentioned theorems, it is worthwhile to draw attention to the fact that theorems 2 and 3 give the matrix function g(A) as a combination of the n matrices Γⱼ^{(k)} which appear in the expression of exp(At). They are the same (invariant) for all g(t). The entry of Γⱼ^{(k)} placed at the k-th row and j-th column can be computed through the polynomial f(s).
Proof of theorem-1. Our basic matrix function exp{At} is defined, as usual, through the infinite series
$$\exp\{At\} = \sum_{m=0}^{\infty}\frac{t^{m}}{m!}\,A^{m}.$$
By direct multiplication of these matrices by themselves one recovers the basic properties of the original functions √t and arccos t.
Ex.-3. Finally, we want to give an example with complex-valued eigen-values. To this end consider the case where A depends on two real parameters a and b. In this case one proceeds as in ex.-1 above. It is interesting to check the result for A^{−1}.
Inverse Power of a Matrix
Consider the matrices B which correspond to a given matrix A through the relation B^m = A, i.e. the m-th roots of A. The next lemma and theorem concern this case.
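For a diagonalizable A, the different m-th roots B with B^m = A arise from attaching an independent branch factor exp(2πi kⱼ/m) to each eigen-value. The sketch below assumes an illustrative matrix with positive distinct eigen-values and a few arbitrary branch choices.

```python
import numpy as np

m = 3
A = np.array([[5.0, 2.0], [1.0, 4.0]])    # distinct eigen-values 3 and 6
lam, V = np.linalg.eig(A)

def mth_root(branches):
    """B = V diag(lam_j^{1/m} * exp(2*pi*i*k_j/m)) V^{-1} for branch indices k_j."""
    roots = [lam[j] ** (1.0 / m) * np.exp(2j * np.pi * branches[j] / m)
             for j in range(len(lam))]
    return V @ np.diag(roots) @ np.linalg.inv(V)

for branches in ((0, 0), (1, 2), (2, 1)):  # several sheets of the root
    B = mth_root(branches)
    assert np.allclose(np.linalg.matrix_power(B, m), A)   # each satisfies B^m = A
```

Each independent choice of the kⱼ gives a distinct root, which is the counting behind the lemma and theorem announced above.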

to previously chosen values. For example, let g(t) take the previously chosen values at the first r points (1 ≤ r < p). We continue this process to adjust also the values at the remaining points: by replacing kⱼ by (kⱼ + m) along an appropriate (spiral) curve joining the point t = 0 to t = ∞, one can arrange that g(t) takes the prescribed values at the given points.

Figure 5. A cut appropriate for the particular case when r = 5 and q = 4.