In Search of the Best Zero Coupon Yield Curve for Nairobi Securities Exchange: Interpolation Methods vs. Parametric Models

We seek to determine which yield curve construction method produces the best zero coupon yield curve (ZCYC) for Nairobi Securities Exchange (NSE). The ZCYC should be differentiable at all points and at the same time, should produce a continuous and positive forward curve at all knot points. A decreasing discount curve is also expected from the resulting ZCYC, as an indication of monotonicity. For the interpolation method, we will use an improvement of monotone preserving interpolation method on ( ) r t t , while the Nelson and Siegel [1] model is the parametric model of choice. This is because compared to other interpolation methods, the improvement of monotone preserving interpolation method on ( ) r t t produces curves with the desirable trait of differentiability, while the Nelson-Siegel [1] model is shown to produce the best-fit results for Kenyan bond data. We compare the models’ performance in terms of accuracy in pricing back the fixed-income securities. For this study, we use bond data from Central Bank of Kenya (CBK). The better of the two methods will be used for the Kenyan securities market and, consequently, the East African Securities markets.


Introduction
The objective of this paper is to establish the best yield curve model to be used in pricing of financial products at the Nairobi Securities Exchange.We compare two models so far established to be the best in their respective categories, as shown by Muthoni Onyango and Ongati [2] and Muthoni Onyango and Ongati [3].Among the Parametric Models used in construction of yield curves, Nelson and Siegel [1] model was found to be superior by Muthoni, Onyango and Ongati [3], while the improvement of monotone preserving interpolation on ( ) r t t was found to be superior of the splines method studied by Muthoni, Onyango and Ongati [2].This paper compares these two methods in terms of: 1) The accuracy of the yield curve using the set test statistics; 2) The smoothness of the curve; 3) The continuity and differentiability of the spot and forward curves; 4) Pricing back the fixed income securities; 5) The monotonicity of the yield curve, measured by the behavior of the discount curve.

Literature Survey
Many estimation methods for yield curves have appeared in literature over the years.Generally speaking, there are two distinct approaches to estimate the term structure of interest rates: the equilibrium model and the statistical techniques.
The first approach is formalized by defining state variables characterizing the state of the economy (relevant to the determination of the term structure) which are driven random processes and are related in some way to the prices of the bonds.It then uses no-arbitrage arguments to infer the dynamics of the term structure.Examples of this approach include: Vasicek [4], Dothan [5], Brennan and Schwartz [6] Cox, Ingersoll and Ross [7], and Duffie and Kan [8].
Unfortunately, in terms of the expedient assumptions about the nature of the random process driving the interest rates, the yield derived by those models have a specific functional form dependent only on a few parameters, and usually the observed yield curves exhibit more varied shapes than those justified by the equilibrium models.
In contrast to the equilibrium models, statistical techniques focusing on obtaining a continuing yield curve from cross-sectional coupon bond data based on curve fitting techniques are able to describe a richer variety of yield patterns in reality.The resulting term structure estimated from the statistical techniques can be directly put into the interest rate models such as the Hull [9], and Heath [10] models, for pricing interest rate contingent claims.Since a coupon bond can be considered as a portfolio of discount bonds with maturities dates consistent with the coupon dates, the discount bond prices can thus be extracted from the actual coupon bond prices by statistical techniques. 1 These techniques can be broadly divided into two categories: the splines (interpolation methods) and the parsimonious function forms (see Alper [11]).

Splines/Interpolation Models
Interpolation is a method of constructing new data points within the range of a discrete set of known data points (called knot points).The simplest method for interpolating between two points is by connecting them through a straight line.Some variations of linear interpolation are capable of ensuring a strictly decreasing curve of discount factors.However, all the variations of linear interpolation imply discontinuities in the forward rate curve.
In order to produce continuous forward rates curves, researchers introduced cubic methods of interpolation.An example of cubic interpolation algorithm is the cubic Hermite spline.Under cubic Hermite splines, the derivative of the data of each knot point is assumed to be known, and the interpolation function is required to be differentiable.Often, these derivatives will not be known, and will have to be estimated.One method for estimating these derivatives, described by de Boor [12] [13] as the Bessel method involves estimating the derivative though the use of a three point difference formula.
Unfortunately, all the traditional cubic methods are incapable of ensuring strictly positive forward rates, which are synonymous with non-decreasing discount factors, as shown by Hagan and West [14].Furthermore, some cubic methods have an inherent lack of locality in the sense that a local perturbation of curve input data will cause ringing and cause changes in the data far away from the perturbed data point as shown by Anderson [15]. 1 Once the discount function, ( ) P t , is defined, the spot interest rate (the pure discount bond yield) can be computed by: ( ) ( ) All variations of linear interpolations were seen to produce discontinuities in the forward rate curve, whilst all variations of cubic interpolations were seen to be incapable of ensuring strictly decreasing discount factors.Non-decreasing discount factors imply arbitrage opportunities, whilst discontinuous forward rates unacceptable from an economic perspective (unless the discontinuities occur on or around meetings of monetary authorities).
To counter this, a monotone convex interpolation method was developed, which it is claimed to be capable of ensuring a positive and (mostly) continuous forward rate curve Hagan and West [14].This method proposed by Hagan and West [14], was specifically designed to interpolate yield curve data, and involves fitting a set of quadratic polynomials to a discrete set of estimated instantaneous forward rates.The method is designed such that ( ) f t preserves the shape of the set of discrete forward rates.The monotone convex method was also seen to be capable of ensuring a strictly decreasing curve of discount factors.Unfortunately, the model depends heavily on an appropriate interpolation algorithm.In addition, it was discovered by du Preez [16] that there were specific conditions under which the interpolation function of the monotone convex interpolation would produce discontinuity of ( ) This led to the monotone preserving ( ) r t t method of interpolation, introduced by Preez [16].Essentially, this method involves applying cubic Hermite interpolation to the ( ) r t t at the knot points i.e the values of ( ) f t at the knot points, are estimated in manner which ensures positivity in ( ) f t .Constructing an interpo- lating algorithm capable of preserving the monotonicity of the discount factors, was thus sufficient for ensuring positive forward rates.
Monotone preserving ( ) r t t method is capable of ensuring a positive and continuous forward rate curve, and was designed to preserve the geometry of ( ) r t t .Monotonicity in the discount factors implies monotonicity in the ( ) r t t which is achieved by applying the work done in the field of shape preserving cubic interpolation, by authors such as de Boor [17], Carlson and Fritsch [18], Hyman [19] and Akima [20].Apart from being an improvement of monotone convex method where it ensured positive forward rates, the monotone preserving ( ) r t t method was also capable of ensuring continuity of ( ) In the study by du Preez [16], they found that the monotone preserving ( ) r t t method to perfrom slightly better in terms of stability, and continuity of ( ) f t , compared to monotone convex method.This suggests that when bootstrapping, the monotone preserving ( ) r t method could be the ideal method of interpolation.Unfor- tunately, monotone preserving method had the undesirable characteristic of not being differentiable at the knot-points.
Muthoni, Onyango and Ongati [2] introduced a new method of interpolation, which will be an improvement of monotone preserving ( ) r t t interpolation method suggested by du Preez [16].They was done by removing the non-differentiability at the knot points in the aforementioned method, which is created by use of Hyman monotonicity constraint, which enforces monotonicity.

Parsimonious Models
Parsimonious models specify parsimonious parameterizations of the discount function, spot rate or the implied forward rate.Moving from the cubic splines, Chambers [21] introduced the parsimonious function forms by considering an exponential polynomial to model the discount function.Nelson & Siegel [1] followed shortly thereafter by choosing an exponential function with four unknown parameters to model the forward rate of U.S Treasury bills.By considering the three components that make up this function, Nelson & Siegel [1] illustrated that it can be used to generate a variety of shapes for the forward rate curves and analytically solve for the spot rate.Moreover, the advantage of the classical Nelson-Siegel [1] model is that the three parameters may be interpreted as latent level, slope and curvatures factors.Diebold [22], Modena [23] and Tam & Yu [24] employed the Nelson-Siegel [1] interpolant to examine bond pricing with a dynamic latent factor approach and concluded that it was satisfactory.
Svensson [25] increased the flexibility of the original Nelson and Siegel [1] model by adding two extra parameters (Svensson [25] model) which allowed for a second "hump" in the forward rate curve.Later, Bliss [26] introduced the Extended Nelson-Siegel [1] method, which introduced a new appropriating function with five parameters by extending the model developed by Nelson & Siegel [1].Bliss suggested that a six-parameter model can produce better results for fitting the term structure with longer maturities.
The Nelson-Siegel [1] model class has linear and non-linear parameters depending on the values assumed fixed.Due to this, these models have multiple local minima making model estimation difficult.Previous studies have widely discussed the estimation of Nelson-Siegel [1] model class and they are: Bolder [27], Maria et al. [28], Gilli [29], Rezende [30], Rosadi [31], among others.
Muthoni et al. [3] estimated the Kenyan government bonds (KGBs) term structure of interest rates based on the parsimonious functions specifications , i.e. the four parameters Nelson-Siegel [1] model, the five parameters Svensson [25] model and the six parameters Rezende [30] model, known as Nelson-Siegel-Svensson model.The reason they chose the Nelson-Siegel family is that these models have substantial flexibility required to match the changing shape of the yield curve, yet they only use a few parameters.As noted by Diebold [22], they can be used to predict the future level, slope, and curvature factors for bond portfolio investments purposes.After comparing the Nelson-Siegel classes of models, Muthoni, Onyango and Ongati [3] found Nelson Siegel [1] to be the superior model.

The Nelson-Siegel [1] Model
The Nelson-Siegel [1] model sets the instantaneous forward rate at maturity m given by the solution to a second order differential equation with unequal roots as follows: where 0 m > .The model consists of four parameters: 0 1 2 , , β β β and s τ m is the time to maturity of a given bond.Equation (1) consists of three parts: A constant, an exponential decay functional and Laguerre function.0 β is independent of m and as much, 0 β is often interpreted as the level of long term interest rates.The expo- nential decay function approaches zero as m tends to infinity and 1 β as m tends to zero.The effect of 1 β is thus only felt at the short end of the curve.The Laguerre function on the other hand approaches zero as m tends to infinity, and as m tends to zero.The effect of 2 β is thus only felt in the middle section of the curve, which implies that 2 β adds a hump to the yield curve.
The spot rate functions under the model of Nelson and Siegel [1] is as follows: ( ) ( ) 1 exp 1 exp exp From Equation (2) it follows that both the spot and forward rate function reduce to 0 We would like to obtain a vector B satisfying XB Y = +  .By using ordinary least squares (OLS) estimation, we can solve B as follows: ( ) Nelson and Siegel [1] suggested the following procedure for calibrating their model: 1) Identify a set of possible values for s τ 2) For each of these s τ estimate B 3) For each of these s τ and their corresponding B's, estimate ( ) ( ) 4) The optimal λ and β are those associated with the highest value of 2 R .The above method of calibration is known as the grid-search method.Alternatively, non-linear optimization techniques can be used to solve all four parameters simultaneously.However, such networks are very sensitive to starting values, implying a high probability of obtaining local optima.The estimated parameters obtained by using the grid search method behave erratically over time, and have large variances.The problems resulting from multi-colinearity depend on the time to maturity of the securities used to calibrate the model.Diebold [22], attempted to address the multi-colinearity problem by fixing the value of s τ over time, which in some in- stances might not produce accurate results.
Due to the local-minima problem which makes model estimation difficult in the Nelson-Siegel [1] model and the inadequacy of the calibration methods used so far, we propose NLS estimation with L-BFGS-B method optimization approach.This optimization method is an extension of the limited memory BFGS method (LM-BFGS or L-BFGS) which uses simple boundaries model, according to Zhu et al. [32].
Using L-BGFS-B algorithm, we can estimate the above five parameters: , , , , , embedded in the Nelson-Siegel [1] model, and hence calculate the price of the bond using the following nonlinear constrained optimization estimation procedure based the Gauss-Newton numerical method: where i P is the price of bond i.

Improvement of Monotone Convex Interpolation on ( ) r t t
We start with a mesh of data points { } ( ) ( ) ( ) ( ) . We will use the following definitions: With ( ) a b c and i d depends on the details of the method, and are related to the values of ( ) f t and its derivatives at the node points.In general ( ) ( ) and so on (10) where in the above equation, the prime denotes the derivative of the interpolating function ( ) f t w.r.t.its ar- gument t.Moreover given i a and i b , we can express i c and i d as follows: We can use Equation ( 4) to compute the following derivative: ( ) ( ) ( ) ( ) We have: Which depends on the matrix element This calculation is tricky if we use monotone preserving splines (or any other method which enforces monotonicity where the i b s are non-differentiable functions of the j f s (which involve the min and max functions).And this is where the novelty of this paper comes in.
Let us start by recalling the formulas for the i b 's in the monotone preserving cubic spline method as defined in the Hagan-West [14].First of all, at the boundaries, we have: For the internal data, if the curve is not a monotone at i t , i.e. ( ) So that it will have a turning point there.Instead if the trend is a monotone at i t , i.e. And: The former choice is made when the curve is increasing (positive slopes), the latter when decreasing (negative slopes).Equation ( 21) represents the monotonicity constraint introduced by Hyman and based on the Fritsch-Butland algorithm.
We can now move to the monotonicity constrain.What we need first is: ( ) ( ) The final step consists of putting all the information together to compute ( ) ( ) ( ) Let us now suppose that the trend is decreasing instead, i.e.
It can be shown that for any i and j ( ) This formula solves our problem of non-differentiability found in the monotone preserving convex on ( ) r t t .

Liquidity-Weighted Function
In yield curve construction, errors are caused by two reasons: (a) curve fitting mistakes and (b) presence of liquidity premium.The errors due to curve fitting arise from the calculations and can be avoided.But the error due to the presence of liquidity premium is reflective of market conditions and one cannot ignore them.Since the reliability of the term structure estimation heavily depends on the precision of the market prices according to Subramanian [33], liquid and illiquid securities are a heterogeneous class and including them both in the term structure estimation process poses problems.Illiquid bonds are traded at a premium to compensate for their undesirable attribute in terms of a low price.Assigning equal weights to both types of errors will give undue weight to the kind of error that creeps in due to curve fitting.Subramanian [33] suggests a liquidity weighted objective function, which hypothesizes that a weighted error function (with weights based on liquidity) would lead to better estimation that equal weights to the squared errors of all securities.We therefore model the liquidity using a function with two factors: the volume of trade in a security and the number of trades in that security.
The weight of the th i security i W is given by: max max max max 1 e 1 e 1 e 1 e max max max max tanh tanh tanh tanh where i v and i n are the volume of trade and the number of trades in the th i security respectively, while max v and max n are the maximum number of trades among all the securities traded for the day respectively.As given in the Equation (30) and Equation ( 31) above, it ensures that the weights of the relative liquid securities would not be significantly different from each other.For the illiquid securities, however the weights would fall quickly as liquidity decreased.
The final error-minimizing function, which should equal to zero, is given by: ( )

Test Statistics
In academic literature, there are two distinct approaches used to indicate the term structure fitting performance.One is the flexibility of the curve (accuracy), and the other focuses on smoothness of the yield curve.Although there are numerical methods proposed to estimate the term structure, any method developed has to grapple with deciding the extent of the above trade-off.Hence it becomes a crucial issue to investigate how to reach a compromise between the flexibility and smoothness.Three simple summary statistics which can be calculated for the flexibility of the estimated yield curve are the coefficient of determination, root mean squared percentage error, and root mean squared error .These are calculated as shown in the next four subsections.

The Coefficient of Determination
where P is the mean average price of all observed bonds, ˆi B is the model price of a bond i, n the number of bonds traded and k is the number of parameters needed to be estimated.
Roughly speaking, with the same analysis in regression, we associate a high value of 2 R with a good fit of the term structure and associate a low 2  R with a poor fit.

Root Mean Squared Error (RMSE)
Denoted as the RMSE, a low value for this measure is assumed to indicate that the model is flexible, on average, and is able to fit the yield curve.

Root Mean Squared Percentage Error (RMSPE)
Denoted as the RMSPE, a low value for this measure is also assumed to indicate that the model is flexible, on average, and is able to fit the yield curve.

Testing for Smoothness
To test the smoothness of the estimated yield curve, we use a modified statistic suggested by Adams and Deventer [34] to reach the maximum smoothness for forward rate curve, and denote the smoothness (Z) for the estimated yield curve as: ( ) ( ) Ideally, the value should equal to zero.The model with the least Z value is deemed to be the best.

Data
In Kenya, nearly all bond transactions take place on the OTC market.The data used in this study was supplied by the Central Bank of Kenya.The sample period contains 417 weekly data from January 2005 to December 2012.Weekly prices (every Friday) for 108 Kenyan Government Bonds (KGBs) with original maturity dates ranged from 2 to 30 years are obtained.

Nelson-Siegel [1] Model
Table 1 lists the summary statistics of estimated parameters for the Nelson-Siegel [1] model.It is seen that all estimated values for 1 β and 2 β are negative, which indicates that the yield curves generated by this model are all positively and upward sloping without a visible hump.

The Spline Model
Table 2 lists the summary statistics of estimated parameters for the improved monotone preserving interpolation method on ( ) r t t Model.

In terms of Accuracy
A direct comparison of the two models in Table 3 appears to favor the Nelson-Siegel [1].

In Terms of Smoothness
In the academic literature, it has been observed that when comparing alternative methods of term structure fitting models, there is usually a trade-off between flexibility and smoothness.In Table 4, the spline seems to have the best fit in flexibility for fitting the term structure of KGB market.However, as shown in Table 4, the Nelson-Siegel [1] Model is superior to the spline, which shows that the Nelson-Siegel [1] results to a relatively smoother yield curve, compared to the other model.

In Terms of the Spot Curves
In Figure 1, we see the spot curve generated by the Nelson and Siegel [1] model, which shows character-   ristics of normal curve.
In Figure 2, we see that the splines produce a curve which initially is increasing and then suddenly becomes flat after time 10, at a constant yield rate of 1.40%.This does not depict normal behavior of yield curves.

In Terms of Pricing Back the Securities
In Figure 3, we see the Nelson and Siegel model [1] prices back the securities but not with as much accuracy as the spline curve does, as shown in Figure 4.

In terms of the Discount Curve
In Figure 5, we see that the Nelson and Siegel [1] model is able to produce a decreasing discount curve, which is a very important characteristic of a yield curve because it shows not only monotonicity, but is also sensible from an economic perspective.When we compare with Figure 6, we see that the discount curve is decreasing in the short to middle terms, with a little hitch at times 10 -12, 16 -28 and 29 -31, after which it starts increasing.This curve lacks monotonicity and therefore does not make much economic sense.

Conclusion and Discussion of Results
The objective of this paper was to compare the performance of the improved monotone preserving interpolation method on ( ) r t t against the Nelson & Siegel [1] in terms of several performance yard sticks.These yard sticks were: 1) accuracy, 2) smoothness, 3) the non-negativity and continuity of forward and spot curves, 4) pricing back of securities and 5) the monotonicity of the discount curve.In terms of accuracy, we see that the spline method does better than the parametric method in that it gives an accuracy of 97.37% compared to an accuracy of 96.54% by the Nelson-Siegel [1] model.We come to the same conclusion when we compare the two models using the test statistics of accuracy in that the spline does better than the parametric model.
However, when it comes to the test of smoothness, we see that Nelson-Siegel [1] model performs much better with a lower Z value of 4.9822 compared to the spline's value of 10.7467.Smoothness of the curve is extremely important in that it points towards differentiability and continuity of the curve.
When we compare the spot and the forward curves, we see that Nelson-Siegel gives us better curves (normal curves) which reflect reality, compared to the resulting splines' curve which increases with time initially, then abruptly becomes static at a given yield rate.This behavior is not realistic.We come to the same conclusion (that the parametric is better than the spline) when we compare the two in terms of pricing back the fixed income securities in that we see an almost perfect pricing back of the securities when we use the parametric model.
When we compare the two in terms of monotonicity, a very important concept in both Mathematical Finance and Economics, we see again that Nelson-Siegel [1] performs better than the spline.It results to a strictly decreasing discount curve, compared to the decreasing spline curve which starts increasing in the long term (after 25 years).
After considering all the factors at hand, we come to a conclusion that the parametric model (Nelson-Siegel [1]) is the proper model to be used in NSE, and consequently, in the East African securities Markets.

δ
is the Kronecker delta, which is equal to one if i j = and zero otherwise.Once the derivatives at the points, or equivalently the i b coefficients, are specified, everything else is fixed.In particular we are interested in computing the work will be in the calculation of the derivatives i b w.r.t.j f .
in the absence of arbitrage, we must have that

Table 2 .
Results for estimated parameters for improved monotone preserving interpolation method on ( )

Table 3 .
Summary statistics for fitting performance in terms of accuracy.

Table 4 .
Summary Statistics for fitting performance in terms of smoothness.