Parameter Estimation in Different Enzyme Reactions

Enzyme kinetic parameters have been estimated using MATLAB software via the Wilkinson nonlinear regression technique. The MATLAB script file written to implement this technique is short and very straightforward. Several software tools are commercially available for this purpose, with many graphical user interface (GUI) features. A routine use of these packages might offer immediate satisfaction of interactive hands-on experience; but in some cases the researcher might wish to write his/her own code and compare the results for further confirmation. Today MATLAB is in use in almost all the schools and laboratories as a standard software tool. So this paper is aimed at helping enzyme researchers to make use of this powerful software for estimation of parameters. It enables the incorporation of the analytical steps behind parameter estimation in an easy-to-follow manner and furnishes better visualization.


Introduction
Over the past several years, different enzyme reactions governed by different equations have been identified in the field of biochemistry.Estimation of kinetic parameters of these enzyme reactions is one of the important problems [1]- [4].The nonlinearity in the functional expression for the reaction rate, in terms of substrate concentration and other parameters, makes the parameter estimation problem difficult.Graphical techniques such as Lineweaver-Burk's method and the use of Dixon plot to estimate parameters [5]- [7] were in vogue for some time.These methods rely on linearizing the Michaelis-Menten equation in the case of simple enzyme reactions.
For complex enzyme reactions, such techniques do not furnish reliable results.This has prompted Wilkinson to propose non-linear regression for the estimation of enzyme parameters [8].Using Wilkinson non-linear regression coupled with a modified Fibonacci search method, Atkins outlined a computer program for fitting the Hill equation directly to experimental data [9].Following this, numerous software packages have been developed [10]- [11] to perform parameter estimation interactively.Most of these packages have fixed menus and do not allow the user to incorporate possible changes to the underlying algorithm.
The goal of this article is to explain the nonlinear regression technique briefly and show four simulation examples of enzyme reactions.The programming is done in MATLAB, a registered trade mark of MathWorks, Inc. MATLAB is a high-level language and interactive environment that enables one to perform computationally intensive tasks faster than with traditional programming languages.The syntax of various functions in MATLAB makes implementing difficult, mathematical steps very straightforward.
Today, use of MATLAB has almost become a de facto standard among engineers and computer scientists both in academia and industry.Faculty and students of biochemistry can easily learn this software using numerous tutorials available on-line or introductory books such as [18]- [21].Further, it is easy for biologists to develop interaction with other engineers and scientists in the language of MATLAB.In this paper we shall discuss the basic principle behind nonlinear regression, and supplement it with simulation examples 1 .

Parameter Estimation via Nonlinear Regression
One of the challenges faced in a biochemistry laboratory while studying enzyme kinetics is to decide comprehensive levels of the substrate concentration.For example, it is hard to determine the maximum concentration, minimum concentration, and to know how best to space the intermediate levels and so on.In practice, one may have to plot the data and go through a few iterations before endorsing the data for further processing.After satisfactory acquisition of the data, which invariably carries measurement inaccuracies, enzyme parameters appearing in the governing model can be estimated using the non-linear regression technique.
Let us consider a functional relationship that expresses the reaction rate v in terms of the independent variable(s) and some parameters.For example, in the case of a simple enzyme-substrate uninhibited reaction it is given by the Michaelis-Menten equation: In Equation (1) above, the substrate concentration s is the independent variable, and max V and m K are en- zyme parameters representing the maximum reaction velocity and the Michaelis constant respectively.Numerically, m K equals s when max 0.5 v V = . In general, there may be more independent variables than s alone, which for convenience, could be put in a vector of N elements say, denote a set of P parameters.Here the superscript T denotes the transpose.This generalization would become clearer after three more examples that are discussed in this paper.The dependence of v on x and β is denoted by ( ) Let m y stand for the m th data sample.The subscript m in all occurrences except in the Michaelis constant m K indicates the m th observation.Since measurements are subject to experi- mental errors, we shall write: ( ) where the measurement error,

N
, is assumed to be a normally distributed random variable with a mean zero and a variance of 2 m σ .Given a set of measured values of Equation ( 2), the parameter estimation problem involves finding an estimate of β that reduces a measure of error, usually the sum of weighted (with the weights w m ) residual squares WRSS, denoted by ( ) Our objective is to minimize ( ) J β with respect to β.It is hard to find the minimum of ( ) J β by explicitly differentiating and solving for global minimum.Iterative methods are usually employed.Nonlinear regression is a convenient technique [8] [22] that allows us to estimate the unknown parameters by linearizing Equation (2) at 1 The MATLAB script for one of the examples is shown in Appendix I. each iteration starting from an initial guess 0 β .This is done by first expressing ( ) ; m v x β with the first two terms in its Taylor's series expansion around an estimate, initially taken as 0 β .Then the m th element in Equation (2) can be written as: is assumed to be a small increment in the parameter vector.This can be done for all the observations m = 1 to M. Now treating the vector as the residual 0 r at 0 th iteration, we can write: This process can be iterated by replacing 0 and 1 with k and k + 1 respectively.Denoting the matrix involving the derivatives in Equation ( 5) at k th iteration by G k we get, The weighted least squares solution for this can be written in terms of generalized inverse of G k [23] [24] as: ( ) where [ ] ( ) is the covariance matrix of [ ] m ε .
Equation (7) helps in estimating the parameters of a given model recursively; but is in a generalized form.For the enzyme researchers or students to be able to code this in MATLAB, they need to understand each symbol in (7) and how to express the same for a given model.In the next section four types of enzyme models are considered and expressions for various elements of β , G , and r are given as they relate to each model.In Appen- dix I, MATLAB code is provided for Example III.For other examples, the same code could be edited with minimum effort by replacing the expressions of these vectors/matrices appropriately and could be executed.With this idea in mind, in the next section, Equation ( 7) is specialized for four different models.

Model I: A Simple Uninhibited Enzyme Substrate Reaction
Let us first consider the case of a simple uninhibited enzyme substrate reaction.As mentioned earlier in the paper, s denotes the concentration of the substrate, V max the maximum reaction velocity, and K m the Michaelis constant, which equals the substrate concentration at half the maximum rate.
Governing equation (Michaelis-Menten equation): Variables and parameters: The m th element of the residue: Elements of the m th row of G matrix:

Model II: Competitive Inhibitor Enzyme Substrate Reactions
With other symbols remaining the same as in Model I, let i denote the concentration of inhibitor.The competitive reaction is governed by the Michaelis-Menten equation: Variables and parameters: The m th element of the residue: Elements of the m th row of G matrix: .
In passing, it might be mentioned that if the data points are collected for various inhibitor values, one could vectorize them into a single column.Accordingly, the value for i in Equation ( 16) should be substituted.

Model III: Allosteric Monosubstrate Enzyme
For our third model, the allosteric enzyme reaction is considered.This is governed by the Hill equation involving an empirical parameter n.
x β (17) Variables and parameters: The m th element of the residue: Elements of the m th row of G matrix:

Model IV: Allosteric Enzyme with Two Species of Ligand Sites
This is governed by two Hill equations combined as: ( ) ; Variables and parameters: [ ] The m th element of the residue: The six elements of the m th row of G matrix: In the next section four simulation examples are given based on the preceding four models.

Example I-Simple Uninhibited Reaction
Data consisting of 10 samples are simulated in case of a hypothetical enzyme with K m = 240 nM and V max = 80 mmol/min/mg substrate using MATLAB.A zero mean unit variance random number with Gaussian density is added as measurement error.The substrate and the reaction rate are shown in Table 1.We shall take ∑ as a 10 × 10 unit matrix.Thus, WRSS can be called RSS.The parameters estimated using the program written in MATLAB are: ˆ237.10 m K = , max ˆ80.24V = , and the RSS = 0.14 2 .Figure 1 shows the enzyme reaction rate versus substrate concentration in the units specified in Table 1.In order to render a better visualization, we have provided the relief of the error surface defined in Equation ( 2) with unity weights in Figure 2(a).Also, Figure 2(b) shows the contours of the same plot.Note that the contours are ever widening ellipses with the global minimum shown by an asterisk.Also plotted in this figure is the learning trajectory obtained by joining all the estimated parameters over 6 iterations, starting from a provisional guess of K m (0) = 20 and V max (0) = 25.
This simulation is repeated using the software Prism 5, a registered trademark of GraphPad, as well as the EXCEL template called anemona published in [12].It was assuring to find that the parameters estimated were identical viz., ˆ237.10 m K = , and max ˆ80.24V = .

Example II-Enzyme Competitive Inhibitor Case
As a second example, the data found in the GraphPad user manual [14] is used.It is reproduced in The parameters estimated by using MATLAB program via non-linear regression are: max ˆˆ2 37.12, 2.75, 79.94 Figure 3 shows the plot of reaction curves.
A visual display of the surface of the cost function J is not possible as it is now a function of three variables: V max , K m and K i .So an attempt is made to visualize this optimization problem in two steps.First, we note that when i = 0, Model II degenerates to Model I; as such, K m and V max are found using the data of the first and second columns of Table 2. Secondly, using these parameters in the remaining data, J is plotted in Figure 4.The global minimum is found at K i = 2.75 in conformity with the method of non-linear regression.In MATLAB, the min command obtains minimum of a given vector.Further, it may be pointed out that the modified Fibonacci search method used by Atkins in [9] also readily finds application to accomplish this.

Example III-pNPP Hydrolysis by Alkaline Phosphatase (Hill Equation)
Determination of kinetic parameters for the hydrolysis of p-nitrophenylphosphate (pNPP) by alkaline phosphatase was reported in [17] using a program called SigrafW.Figures 5(a), (b) show the familiar structure of pNPP and AP.For convenience, the data points of Table 1 in [17] are reproduced in Table 3 and are used in this example.The parameters obtained by SigrafW, and those obtained through MATLAB simulations, as well as Prism 5 of GraphPad, are shown in Table 4.
This exercise has been attempted using anemona reported in [12], but this tool has yielded absurd results.Both the MATLAB program and Prism are found to yield identical values for the parameters.
Using MATLAB we find that RSS is less and correlation is better.Figure 6 shows the reaction rate curves using both the results.We note that the curve obtained using the parameters reported in [17] does not pass through the observations as accurately as reported therein.The MATLAB script file for this representative example is given in Appendix I. Modifying the code for other examples is straightforward as mentioned earlier.

Example IV-ATP Hydrolysis by Alkaline Phosphatase (Sum of Two Hill Equations)
Again we take the data available in [17] for the ATP hydrolysis by alkaline phosphatase.This reaction is characterized by the sum of two Hill equations.Figure 5(c) depicts the ATP structure.The measured data is found Table 2. Data for competitive enzyme reaction (Model II) taken from the GraphPad user manual [14].Here i denotes the concentration of inhibitor.in Table 2 of [17] reproduced here for convenience in Table 5.

Substrate
For better numerical accuracy, all the substrate values are first up-scaled uniformly by the factor 10 8 before parameter estimation.They are later scaled down by the same factor.The parameters reported in [17], and those obtained through MATLAB program, as well as Prism 5 are shown in Table 6.Note that MATLAB and Prism 5 yield identical results except for a miniscule difference in n 2 and RSS.
The reaction rate curves are shown in Figure 7.In SigrafW reported in [17] for the purpose of a good fit, a baseline concept is used.This amounts to a form of piece-wise regression.But the work in this paper does not rely on any baseline, as we treat the reaction as a single phenomenon.Reaction rate curve, using the parameters in [17] without the baseline, is also plotted in Figure 7.It does not pass through the measured data as seen in Figure 7.

Discussion and Conclusion
Enzyme parameters can be found using graphical aids such as Lineweaver-Burk's plot, or Dixon's plot in the case of simple enzyme reactions.But for complex enzyme reactions, parameter estimation becomes computationally intensive.With the advent of software tools, this task is performed interactively on a computer.In the open literature, several software packages are reported.Almost all of them allow the user to feed the data and select the type of reaction for estimation of enzyme kinetic parameters; but they do not allow the user to write the source code for non-linear regression.This paper provides the mathematical expressions needed in non-linear regression for four types of enzyme reactions and suggests the use of MATLAB.Nonlinear regression is briefly discussed first in general terms, and then specialized to four different models.The elements of various vectors and matrices are explicitly given for the convenience of programming in MATLAB.The MATLAB script for one of the examples is given in Appendix I. MATLAB offers better visualization and the ability to write the source code.Also, this study helps us to use these algorithms in more complex cases.Although in this paper we have given analytical expressions for the G matrix, one could use numerical differentiation for performing parameter estimation.Plotting the cost surface helps in visualizing where optimal values are located in the parameter space and helps to confirm the results obtained by non-linear regression.None of the available packages depict the surface of cost function and learning trajectory of estimation process wherever possible.But using MATLAB, this could be accomplished-thus, enabling the analysts to appreciate the analytical steps involved in the non-linear regression technique.

Figure 1 .
Figure 1.Reaction rate curve for Model I.The circles are simulated measurements, solid line is the theoretical curve, stars show the model using the estimated parameters.

Figure 2 .
Figure 2. (a) Showing the relief of the error J(β) as a function of the parameters.(b)Showing the contours of the error J(β) as a function of the parameters.The global minimum is shown by an asterisk.Also shown is the learning trajectory.

Figure 3 .
Figure 3. Reaction rate curves for Model II for various values of inhibitor concentrations.Circles: i = 0, diamonds i = 3, squares i = 10, stars i = 30.The solid line is obtained with values given by prism, dots are using MATLAB.

Figure 4 .Figure 5 .
Figure 4. Plot of Cost J as a function of K i given the values of K m and V max .

Figure 6 .
Figure 6.Showing the reaction rate in case of PNPP hydrolysis through alkaline phosphatase.

Figure 7 .
Figure 7. Reaction curve for the Example IV of allosteric response in ATP hydrolysis using alkaline phosphatase.Circles are measured values.Starred curve is obtained with the parameters estimated using MATLAB.In [17] the notion of baseline is used to obtain a good fit.Dotted curve shows the reaction rate without the use of baseline.

Table 1 .
Simulated data consisting of 10 samples in case of a hypothetical enzyme with K m = 240 nM and V max = 80 mmol/min/mg substrate for enzyme reaction model I.

Table 2
for convenience.The parameters for this data as given by Prism 5 are: max ˆˆ2 36.70,2.74, 79.92

Table 4 .
[17]ous parameters for Example III compared with those reported in[17]using SigrafW as well as Prism 5. Note that the MATLAB program developed in this paper and Prism 5 yield identical results.

Table 6 .
Various Parameters in Example IV using SigrafW, MATLAB and Prism 5.