1. Introduction
This study is closely related to applications in the so-called “metamodeling” of differential equations, where a “proper” model of, e.g., a complex biological process is replaced by a simpler approximation which nevertheless contains “most information” about the model. In particular, the true parameters of the model are replaced by “latent parameters”, which makes the model linear with respect to the latter and hence enables the use of (if necessary, partial) least-squares regression. This explains why the idea proved to be efficient in parameter estimation (see e.g. [1] ). It also accounts for the high numerical efficiency of metamodeling, which has been widely used in statistics [2] , chemometrics [3] , biochemistry [1] , genetics [4] [5] [6] and infrared spectroscopy [7] to simplify the theoretical and computational analysis of the “true” models.
Let
be a function, where
and
,
being a space of parameters and
be a given number. The kth Principal Component Transform (PCT) is a specially constructed parametrized function
of the form
. The image
is constructed to yield the minimum distance (in some sense) between
and all possible approximations of
of the form
. The distance is chosen to ensure an efficient way to estimate the deviation of
from
.
Geometrically, the parametrized function
may be regarded as a curve
in a separable Hilbert space. Then
can be interpreted as a projection of this curve onto an
-dimensional subspace, which is chosen in such a way that the image
gives a best possible individual fit to
among all
-dimensional subspaces. As we will see in Subsection 3.1, this necessarily leads to nonlinearity of the mapping PCT.
As we will see in Subsection 3.3, discretizing the function
and its PCT yields matrices and the projections onto their first
principal components, respectively. This explains our terminology: PCT can be regarded as a functional analog of the principal component analysis (PCA) of matrices. This terminology was suggested by Prof. E. Voit in a private talk with the second author during his seminar lecture in Oslo in 2014.
All the papers cited above concentrate on the efficiency of the metamodeling approach and disregard the mathematical properties of PCT and their justification, which are, for instance, quite important for understanding the limitations of the method and describing the exact conditions under which it is applicable. In particular, the convergence properties of the sequence of metamodels to the original model have not been studied in the available literature. In this paper we try to fill this gap by suggesting a rigorous mathematical approach to PCT and an analysis of its basic properties. More precisely, we demonstrate how the theory of compact operators in separable Hilbert spaces can be used to provide such an analysis.
The paper is organized as follows. In Section 2 we introduce the distance in the space of parametrized functions, formulate the theorem on the best individual fit in terms of PCT of functions (Subsection 2.1) and provide some examples relevant for systems biology (Subsection 2.2). In Section 3 we study mathematical properties of PCT: nonlinearity (Subsection 3.1), continuity (Subsection 3.2) and show relations of PCT and PCA via discretization of functions (Subsections 3.3 and 3.4). In Section 4 we study PCT of products of parametrized functions which are interpreted as elements of the tensor product of two or several Hilbert spaces (Subsection 4.1). We also show that PCT preserves tensor products and therefore the product of parametrized functions (Subsection 4.2) and give some examples (Subsection 4.3). In Appendix 5 we offer short proofs of some auxiliary results used in the paper: Allahverdiev’s theorem (Subsection 5.1) and some propositions related to tensor products of linear compact operators in Hilbert spaces (Subsection 5.2).
2. The Best Individual Fit Theorem
In this section we define the distance in the space of parametrized functions and describe how best individual fits
to a given function
can be obtained using the theory of compact operators in Hilbert spaces. We also prove nonlinearity and continuity of PCT and give some specific examples.
2.1. The Distance in the Space of Parametrized Functions
Let
be a compact subset of
and
be a compact subset of
We consider the separable Hilbert spaces
and
with the standard scalar products
and the norms
.
Suppose we are given a measurable, square integrable function
, i.e.
(1)
The aim is to find a best possible approximation of
in the class
of all functions of the form
, where
and
.
To better explain the nature of the topology we use in this case, let us have a look at finite-dimensional Hilbert, i.e. Euclidean, spaces. Let
be an
-matrix, for instance, a discretized function
where
. In this case, the best approximation
to
in the class of
-matrices of rank not greater than
is given by the first
terms in the singular value decomposition of
:
(2)
where
and
are the normalized eigenvectors of the matrix
and
is the conjugate (transpose) of a matrix
. In other words,
(3)
The matrix norm is defined as
, where
is the Euclidean norm in
.
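The best rank-k property of the truncated SVD is easy to verify numerically. The following sketch uses an arbitrarily chosen test matrix and truncation rank (both are illustrative assumptions) to show that the spectral-norm error of the k-term truncation equals the (k+1)-th singular value.

```python
import numpy as np

# Hypothetical test matrix and truncation rank, chosen only for illustration.
rng = np.random.default_rng(0)
A = rng.standard_normal((8, 5))
k = 2

# Full SVD: A = U @ diag(s) @ Vt with singular values s[0] >= s[1] >= ...
U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Best rank-k approximation: keep the first k singular triplets.
A_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

# The spectral-norm error equals the (k+1)-th singular value.
err = np.linalg.norm(A - A_k, ord=2)
print(err, s[k])   # the two numbers coincide up to rounding
```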
Now we will look at arbitrary real separable Hilbert spaces which are denoted by
and
and which are equipped with the scalar products
and
and the corresponding norms
and
, respectively. Assume that
is a linear compact operator. Its norm is again defined as
.
Put
(4)
We want to find an operator
for which
. The construction of
is very close to the singular value decomposition of matrices.
Assume that
is the adjoint of
. Then the linear compact operators
are self-adjoint and positive-definite.
Let
be all positive eigenvalues of the operator
, the associated normalized eigenvectors being
, respectively:
(5)
It is well-known that
can always be chosen to be orthogonal:
and for any
there is a unique set
,
and a unique
for which
and, moreover,
Now, the operator
can be represented as
(6)
where
and the convergence is understood in the sense of the norm in the space
. The truncated versions
of this representation are defined by
(7)
The following result, a short proof of which is offered in Appendix 5.1, is known as Allahverdiev’s theorem, see e.g. [8, Chapter II, p. 28]:
Theorem 1. For any linear compact operator
(8)
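For reference, a standard way to write this statement, consistent with the representation (6), its truncation (7) and a decreasing ordering of the eigenvalues (with the convention that they vanish beyond the rank of the operator), is the following sketch; it need not reproduce the exact form of (8).

```latex
% Allahverdiev's theorem (sketch): the distance from T to the operators of rank
% at most k is attained at the truncation T_k and equals the (k+1)-th singular
% value, i.e. the square root of the (k+1)-th eigenvalue of T*T.
\[
  \min_{\operatorname{rank} K \,\le\, k} \| T - K \|
  \;=\; \| T - T_k \|
  \;=\; \sqrt{\lambda_{k+1}} ,
  \qquad k = 0, 1, 2, \dots
\]
```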
In numerical calculations, functions are usually replaced by their discretizations, which in the case of parametrized functions yields matrices. That is why the distance in the space of the parametrized functions
should be consistent with the distance in the space of matrices, so that we can get all the advantages of the finite dimensional singular value decomposition as well as Allahverdiev’s theorem. To define the distance in the space of matrices we have to interpret matrices as linear operators between two Euclidean spaces. Analogously, we have to interpret parametrized functions as operators between suitable Hilbert spaces, and define the distance accordingly.
Let us therefore go back to the spaces
,
, where
, as before, is a compact subset of
and
is a compact subset of
We denote the norm in both spaces as
Consider the integral operator
(9)
Under the assumption of square integrability of the kernel
the operator
becomes compact and linear from the space
to the space
(see e.g. [9], Chapter 7, p. 202).
The distance between two square integrable parametrized functions
and
can be now defined in the following way:
(10)
where
is defined in (9) and
The norm of the linear operators acting from
to
is defined in the standard way.
Remark 1. Evidently,
(11)
for some constant
. Therefore,
-convergence of the sequence
implies the convergence in the sense of the distance dist.
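In fact, writing $F_f$ for the integral operator (9) with kernel $f$ (a notational assumption) and assuming (9) carries no extra weight, the constant in (11) can be taken equal to one: the operator norm is dominated by the Hilbert–Schmidt norm, which for an integral operator equals the $L^2$-norm of its kernel.

```latex
% Sketch of the bound behind Remark 1 (assuming (9) is the unweighted integral
% operator with kernel f):
\[
  \operatorname{dist}(f,g)
  \;=\; \| F_f - F_g \|
  \;\le\; \| F_f - F_g \|_{\mathrm{HS}}
  \;=\; \Bigl( \int_{\Omega}\!\int_{Q} | f(t,q) - g(t,q) |^{2} \, dq \, dt \Bigr)^{1/2}
  \;=\; \| f - g \|_{L^{2}(\Omega \times Q)} .
\]
```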
Let
be the adjoint of
, so that
(12)
Now, the self-adjoint and positive-definite integral operators
(13)
can be written as follows:
(14)
and
(15)
respectively. Let, as before,
(16)
be all positive eigenvalues of the integral operator (14) associated with its normalized and mutually orthogonal eigenfunctions
, i.e.
(17)
From Theorem 1 we immediately obtain the Best Individual Fit Theorem.
Theorem 2. For a given function
satisfying (1) the best approximation of
in the class
of all functions of the form
, where
and
, is given by
(18)
where
are the normalized, mutually orthogonal eigenfunctions of the operator (14) and
. Moreover,
for all natural
.
In other words,
(19)
Remark 2. The functions
have the following properties (which we do not use in this paper):
•
for all
;
•
for all
;
•
for all
.
Definition 1.
• The kth Principal Component Transform (PCT) of the function
is defined as
(20)
• The Full Principal Component Transform of the function
is given by
(21)
We will also write
We remark that none of these transforms is uniquely defined: even if all
are different, we always have a choice between two normalized eigenfunctions
. However, the distance between
and any
is independent of the projection we use. On the other hand, this means that the properties of PCT should be formulated with care.
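By analogy with the truncated SVD (2), the kth PCT can presumably be written explicitly as below; the notation $\mathrm{PCT}_k f$, the choice of writing the eigenfunctions of (14) as functions of the parameter $q$, and the decreasing ordering of the eigenvalues are assumptions of this sketch (if (14) acts on the other space, the roles of $t$ and $q$ are interchanged).

```latex
% Explicit form of the k-th PCT (sketch): e_1, e_2, ... are the normalized,
% mutually orthogonal eigenfunctions of (14) with eigenvalues
% lambda_1 >= lambda_2 >= ... (set to zero beyond the rank), and the "scores"
% w_i are obtained by integrating f against the eigenfunctions.
\[
  \mathrm{PCT}_k f \,(t,q) \;=\; \sum_{i=1}^{k} w_i(t)\, e_i(q),
  \qquad
  w_i(t) \;=\; \int_{Q} f(t,q)\, e_i(q)\, dq ,
\]
\[
  \operatorname{dist}\bigl( f, \mathrm{PCT}_k f \bigr) \;=\; \sqrt{\lambda_{k+1}} .
\]
```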
2.2. Examples of PCT
In this subsection we consider three examples which are of importance in systems biology.
Example 1. Let
(22)
Assume that
Then, using Formulas (14) and (15), we obtain the following representations of the kernels
and
(23)
(24)
Therefore the normalized eigenfunctions
can be obtained from the equation
(25)
The functions
can be alternatively found from the equations
(26)
The parametrized power function
is of crucial importance in the biochemical systems theory, where
represents the concentration of a metabolite, while
stands for the kinetic order. In the case of several metabolites, one gets products of such power functions, which, in turn, enter the right-hand side of the so-called “synergetic system”, see e.g. ( [10] , Chapter 2, p. 51) and the references therein. The products of parametrized power functions are considered in Section 4.
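For a concrete illustration, assume (purely for this sketch; the actual ranges are those fixed above) that $x$ ranges over $[0,1]$ and that $q_1+q_2>-1$ on the parameter set. Then one of the two kernel representations, cf. (23) and (24), is available in closed form.

```latex
% Kernel of F*F for f(x,q) = x^q under the illustrative assumptions x in [0,1]
% and q_1 + q_2 > -1:
\[
  (F^{*}F)(q_1, q_2)
  \;=\; \int_{0}^{1} x^{q_1}\, x^{q_2}\, dx
  \;=\; \frac{1}{q_1 + q_2 + 1} ,
\]
% so the eigenfunctions in (25) solve
% \int e_i(q_2)/(q_1 + q_2 + 1)\, dq_2 = \lambda_i e_i(q_1) on the parameter set.
```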
Example 2. Consider the function
(27)
Assume that
Then, using Formulas (14) and (15), we obtain the following representations of the kernels
and
(28)
(29)
We denote for simplicity
(30)
and get
(31)
Therefore the normalized eigenfunctions
can be obtained from the equation
(32)
The functions
can be also obtained from the equations
(33)
The function
is often used in the neural field models, where it serves as the simplest example of the so-called “connectivity functions” describing the interactions between neurons, see e.g. [11] and the references therein.
Example 3. Consider the Hill function
(34)
Assume that
,
,
Putting
and
we obtain
(35)
and
(36)
The Hill function plays a central role in the theory of gene regulatory networks, where it stands for the gene activation function,
being the gene concentration and
being the activation threshold, see e.g. [12] and the references therein.
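In this case the kernel has, in general, no elementary closed form. Writing the Hill function in the common form $h(x,\theta)=x^{p}/(\theta^{p}+x^{p})$ with a fixed exponent $p>0$ and an integration range $[0,X]$ (both are illustrative assumptions, since the actual parametrization is the one fixed above), the kernel of $F^{*}F$, cf. the representations (35) and (36), reads as follows.

```latex
% Kernel of F*F for the Hill function under the illustrative assumptions above:
\[
  (F^{*}F)(\theta_1, \theta_2)
  \;=\; \int_{0}^{X}
        \frac{x^{2p}}{\bigl(\theta_1^{\,p} + x^{p}\bigr)\bigl(\theta_2^{\,p} + x^{p}\bigr)}
        \, dx ,
\]
% which typically has to be evaluated numerically before the eigenfunctions
% and the PCT can be computed (cf. the discretization of Subsection 3.3).
```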
3. Some Properties of PCT
The Principal Component Transform
is not uniquely defined. That is why we will use a special notation when comparing PCT of different functions, namely, we will write
if there exist coinciding versions of PCT of
and
.
3.1. PCT Is Homogeneous, But Not Additive
Theorem 3.
1.
for any
and
2. In general,
is different from
Proof.
1. The case
is trivial. We assume therefore that
. Let
and
, see (21). By definition,
are normalized, mutually orthogonal eigenfunctions of the operator
and
. Let
. Then
(37)
so that
are the same for
and
. On the other hand,
and
(38)
2. Before constructing an example illustrating nonlinearity of PCT we remark that this statement, in its more precise formulation, says that there are no
versions of
,
,
, for which
Let
and the functions
satisfy
(39)
We put
(40)
To calculate PCT we observe that both operators have a 2-dimensional image in
. Using the representation
where
we reduce the operators
and
to the matrices
so that
(41)
where
and
are row and column vectors, respectively.
Matrices
and
are symmetric. Then
and
. The first eigenpairs of
and
are
and
, respectively. Therefore the best rank 1 approximations of
and
are
so that
and
which are both operators with a 1-dimensional image. However, their sum
(42)
has a 2-dimensional image, as its representation in the basis
is given by the non-singular matrix
. Therefore
cannot coincide with any version of
.
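The non-additivity established in the proof can also be checked numerically. The matrices in the paper's construction are tied to the chosen basis; the sketch below uses two hypothetical symmetric 2×2 matrices with the same qualitative behaviour: the sum of the best rank-1 approximations has full rank and therefore differs from any best rank-1 approximation of the sum.

```python
import numpy as np

def best_rank1(M):
    """Best rank-1 approximation of M in the spectral norm, via truncated SVD."""
    U, s, Vt = np.linalg.svd(M)
    return s[0] * np.outer(U[:, 0], Vt[0, :])

# Hypothetical symmetric matrices, chosen only to illustrate the phenomenon;
# they play the role of the discretized operators in the proof of Theorem 3.
A = np.array([[2.0, 0.0], [0.0, 1.0]])
B = np.array([[1.0, 0.0], [0.0, 2.0]])

lhs = best_rank1(A) + best_rank1(B)    # = diag(2, 2), rank 2
rhs = best_rank1(A + B)                # some rank-1 matrix

print(np.linalg.matrix_rank(lhs))      # 2: cannot coincide with any rank-1 matrix
print(np.linalg.matrix_rank(rhs))      # 1
```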
3.2. PCT Is Continuous
Let us consider a sequence of parametrized, square integrable functions
.
Theorem 4. Let
and
for some parametrized, square integrable functions
. Then for any version
there are versions
such that
(43)
Proof. Let
,
. We define the compact linear integral operators
using the kernels
, respectively. By the definition of dist we immediately get that
Let
be the normalized, mutually orthogonal eigenfunctions of the operator
corresponding to its first
eigenvalues
. Since
converges to the operator
in norm, we can always choose a sequence of the eigenfunctions
such that
(44)
In this case
(45)
Therefore
which implies
(46)
The above theorem can be reformulated in terms of robustness of PCT.
Corollary 1. Let
and
be a parametrized, square integrable function and
. Then given an
there is a
such that for every parametrized, square integrable function
the following holds true:
(47)
for some suitable versions of PCT.
3.3. Discretization of Functions
In the papers [5] [6] , which are aimed at applying the metamodeling approach to gene regulatory networks, the approximations of the parametrized sigmoidal functions are performed numerically by using discretization and SVD of the resulting matrices. The continuity of PCT, proved in the previous subsection, can now be used to justify this analysis and, in particular, the results on the number of the principal components
ensuring the prescribed precision.
In this subsection we suppose that all functions are continuous, which is sufficient for most applications. The general case is, however, unproblematic as well if we slightly adjust the approximation procedure.
Let
be a continuous function on a compact set
where
For all
is divided into
measurable subsets
:
(48)
We define the sequence of the functions
as follows:
(49)
where
is an arbitrary point in
Lemma 1. Let
be a continuous function on
. Then
(50)
provided that
as
.
Proof. The function
is continuous on the compact set
, therefore
is uniformly continuous on
. Then for all
there is
such that
(51)
On the other hand, there is a number
for which
as long as
. Let
be an arbitrary point from
. Then for any
there is
such that
. Taking now an arbitrary
we obtain
(52)
so that
, where
is the Lebesgue measure of the set
.
Hence
Corollary 2. Let
and
be a parametrized, continuous function,
be a sequence of discrete approximations satisfying the assumptions of Lemma 1. Then for any version
there are versions
such that
Finally, we observe that if
are defined as
, where for any
and
are measurable partitions of
and
, respectively, and
, then PCT of the discrete functions
coincide with the
-truncated SVD of the matrix
. In the next subsection we provide an example of such approximation stemming from the biochemical systems theory.
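The correspondence between the discrete PCT and the truncated SVD described above can be sketched in a few lines of code. The grid sizes, the ranges of the argument and the parameter, and the number of components below are illustrative assumptions, not the values used in the paper.

```python
import numpy as np

# Illustrative discretization of f(x, q) = x**q; grid sizes and ranges are
# assumptions made only for this sketch.
x = np.linspace(0.01, 1.0, 200)          # argument grid
q = np.linspace(0.0, 2.0, 50)            # parameter grid
A = x[:, None] ** q[None, :]             # A[i, j] = x_i ** q_j

k = 4                                    # number of principal components
U, s, Vt = np.linalg.svd(A, full_matrices=False)
A_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]   # discrete k-th PCT of f

# Relative spectral-norm error of the k-component approximation.
print(np.linalg.norm(A - A_k, ord=2) / np.linalg.norm(A, ord=2))
```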
3.4. Examples of Discrete Approximations
In this subsection we study the parametrized power function
defined on the interval
with the parameter values
To approximate this function we construct a matrix
as follows: we divide
into
parts:
Similarly, we divide the interval
into
parts. Every entry of the matrix
will be given by the values
:
(53)
The corresponding discretization of
will be then given by the matrix
(54)
The vectors
and
can be obtained from the singular value decomposition of the matrix
(55)
where the rows of the scores matrix
consist of the numbers
and the columns of the loadings matrix
are the vectors
. As an example, let us consider the case
,
,
,
. Then
(56)
so that the Expression (54) becomes
(57)
Assume now that
. This value corresponds to row
in the matrix
. We find a number
as follows:
(58)
This yields
and hence
(59)
where
are the columns in the loadings matrix
, see Figure 1.
Figure 1 depicts the power function
vs. its PCT with 4 components;
; the error is estimated as
and the Hill function
vs. its PCT with 12 components;
; the error is estimated as
. Figure 2 depicts the cumulative normal distribution function
vs. its PCT with 27 components and
; the error is estimated as
and the normal distribution function
vs. its PCT with 25 components;
; the error is estimated as
.
Figure 1. (a) The power function and its PCT; (b) The Hill function and its PCT.
Figure 2. (a) The cumulative normal distribution function and its PCT; (b) The normal distribution function and its PCT.
4. PCT of Products of Functions
To calculate PCT of products of parametrized functions we need the theory of tensor products of Hilbert spaces and compact operators. Appendix 5.2 contains all the details we need in this section.
Below we use the following notation (where
):
•
,
are compact sets;
•
,
;
•
,
,
,
;
•
,
,
are square integrable functions and
•
so that
;
•
so that
.
4.1. Products of Parametrized Functions
Theorem 5. In the above notation:
•
,
•
Proof. We use the definition of the tensor product from Appendix 5.2.
Let
have an orthonormal basis
so that any
can be represented as
(60)
where
We prove now that the set
is an orthonormal basis in the space
. Its orthonormality follows directly from its definition. It remains therefore to check that the set of all linear combinations of the elements from
is dense in
. Indeed, the set of continuous functions, and hence the set
of polynomials
, on
is dense in
. On the other hand, the set
of polynomials of the form
spans the set
and, finally, the set
spans the set
. Thus,
spans
and we have proved that any
can be represented as the
-convergent series
(61)
for some set
satisfying
(62)
Defining
(63)
and comparing the Representation (61) with the Formula (94) proves the equality
. The equality
can be checked similarly.
Let us now prove the last formula of the theorem. First of all, we remark that the Definition (63) implies
(64)
for any
.
By the assumptions on the kernels, the operators in this equality are linear and bounded. Therefore, it is sufficient to check the equality for
(see Appendix 5.2).
(65)
due to (64). Hence
. Comparing this formula with Definition (100) completes the proof of the theorem.
4.2. PCT Preserves Tensor Products
The main theoretical result of this subsection is the following theorem:
Theorem 6.
(66)
Proof. For
we have by definition
(67)
where
are normalized, mutually orthogonal eigenvectors of the operator
corresponding to the eigenvalues
and
.
Put
and
. Using the properties of the tensor product listed in Appendix 5.2 we obtain
(68)
where
(69)
This proves that
are normalized, mutually orthogonal eigenvectors of the operator
corresponding to the eigenvalues
.
On the other hand,
(70)
Therefore,
(71)
which proves the theorem.
Remark 3. Theorem 6 is only valid for the full PCT. The analogous identity for the truncated versions of PCT does not hold in general, as the order of the singular values
depends on the magnitudes of the eigenvalues
and
.
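A finite-dimensional illustration of Theorem 6 and of Remark 3 uses Kronecker products: the singular values of the Kronecker product of two matrices are the pairwise products of the singular values of the factors, but sorting these products interleaves contributions from both factors, which is why the truncated transforms need not be preserved. The matrices below are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 4))          # plays the role of one discretized operator
B = rng.standard_normal((2, 5))          # plays the role of the other factor

# Singular values of the Kronecker (tensor) product ...
s_kron = np.linalg.svd(np.kron(A, B), compute_uv=False)

# ... are the pairwise products of the factors' singular values.
s_prod = np.sort(np.outer(np.linalg.svd(A, compute_uv=False),
                          np.linalg.svd(B, compute_uv=False)).ravel())[::-1]

print(np.allclose(s_kron, s_prod))       # True: the full decompositions agree
# Truncating to the first k terms, however, mixes contributions from both
# factors, so the k-truncated transform of the product need not equal the
# product of the k-truncated transforms.
```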
4.3. Examples of Products of Parametrized Functions
In this subsection we describe the kernels of the integral operators related to products of the parametrized functions from Subsection 2.2. These examples are of importance in systems biology.
Example 1. Consider the following function
(72)
Assume that
Then, using Formulas (14) and (15), we obtain the following representations of the kernels
and
Example 2. Consider the function
(73)
Assume that
Then, using Formulas (14) and (15), we obtain the following representations of the kernels
and
Example 3. For the Hill function we obtain
(74)
Assume that
Putting
and
and using Formulas (14) and (15), we obtain the following representations of the kernels
and
(75)
(76)
Remark 4. The eigenfunctions of the integral operators with the kernels that are products of parametrized functions are, according to Subsection 5.2, also products of the respective eigenfunctions of the factors.
5. Conclusions
The main results of the paper can be summarized as follows. We defined the distance in the space of parametrized functions and introduced the
-th Principal Component Transform (PCT) and the Full Principal Component Transform of functions
. The kth PCT is the best approximation of the given function, i.e. it minimizes
. We proved that if the sequence of functions
converges to the continuous function
, then the sequence of the PCT of
will converge to the PCT of
. Some properties of PCT were considered. These results can also serve as theoretical background for the design of some metamodels. Using the theory of the tensor product of Hilbert spaces and compact operators we calculated the PCT of products of functions. We provided several examples of the discrete approximations and products of the parametrized functions.
We emphasize that our study is related to systems biology. In future work we aim to investigate the problem of “sloppiness” in nonlinear models [1] and create an effective parameter estimation method for the “S-systems” ( [10] , Chapter 2, p. 51).
Acknowledgements
The work of the second author has been partially supported by the Norwegian Research Council, grant 239070.
Appendix
1. Allahverdiev’s theorem
Let
and
be two real separable Hilbert spaces, equipped with the scalar products
and
and the corresponding norms
and
, respectively. Assume that
is a linear compact operator. Its norm is defined as
.
Put
We want to find an operator
for which
. This construction is very close to the finite dimensional singular value decomposition.
Assume that
is the adjoint of
. Then the linear compact operators
are self-adjoint and positive-definite. Let
,
be all positive eigenvalues of the operator
, the associated normalized eigenvectors being
, respectively:
(77)
It is well-known that
can always be chosen to be orthogonal:
By the Hilbert-Schmidt theorem, for any
there is a unique set
,
and a unique
for which
and, moreover,
Thus, the operator
can be represented as
(78)
where
, and the convergence is understood in the sense of the norm in the space
. We define the linear bounded operators
by
(79)
The following result is known as Allahverdiev’s theorem, see e.g. [8]:
Proposition 7. For any linear compact operator
(80)
Proof. First of all, we prove that
. By definition,
(81)
From (79) and (78) we get
(82)
We calculate the norm of
using (81), (82):
(83)
because
(84)
and
(85)
As
,
(
) and
, we obtain
. As
for all
,
(86)
if
and
.
Hence,
(87)
Secondly, we prove that
(88)
Let
be a basis in
. Then there exist some
from H such that
(89)
We want to prove that
(90)
If
then
If
then
Therefore
(91)
This homogeneous system has
unknowns and
equations, so that there is
such that
and
. Therefore
(92)
as
for
2. Tensor product of operators in Hilbert spaces
Let
and
be real separable Hilbert spaces, where
•
has an orthonormal basis
•
has an orthonormal basis
•
has an orthonormal basis
•
has an orthonormal basis
Let
(93)
Now, we define the tensor product
of the spaces
and
as the real separable Hilbert space, which has the basis
consisting of all ordered pairs
, and we put
By definition, any
can be uniquely represented as
(94)
Definition 2. The scalar product
in
is defined as
(95)
where
.
Evidently, the set
is an orthonormal basis of the space
and therefore
(96)
is the norm on
. The series
converges in this norm. It is also straightforward to check that
(97)
for all
,
.
Let us consider two compact linear operators
(98)
For all
we have
(99)
We define the tensor product
of
and
as
(100)
where
is given by (94).
Proposition 8. If
are linear compact operators, then so is the operator
.
Proof. Linearity of
follows directly from the definition. Taking an arbitrary
satisfying (94) we obtain
(101)
Therefore
is bounded, and in particular,
(102)
To prove compactness we choose an arbitrary
and linear bounded finite dimensional operators
for which
.
Evidently,
(103)
Using (102) we obtain
(104)
Therefore, the operator
can be approximated in norm by finite dimensional operators of the form
with arbitrary precision. Thus,
is compact.
Proposition 9. For all linear compact operators
and
we have
(105)
Proof. The set of linear combinations
is dense in
, i.e. for all
there is a sequence of linear combinations of
which converges to
in the norm. As the operators
and
are linear and bounded, it is sufficient to prove the equality of the proposition for the special case of
, where we by definition have the formula
(106)
Let
, where
and
. Then
(107)
Hence
.
Proposition 10. If
is an eigenpair of the operator
(
), then
is an eigenpair of the operator
.
Proof.
(108)
