High-Dimensional Regression on Sparse Grids Applied to Pricing Moving Window Asian Options

The pricing of moving window Asian options with an early exercise feature is a challenging problem in option pricing. The computational difficulty lies in the unknown optimal exercise strategy and in the high dimensionality required for approximating the early exercise boundary. We use sparse grid basis functions in the Least Squares Monte Carlo approach to address this "curse of dimensionality". The resulting algorithm provides a general and convergent method for pricing moving window Asian options. The sparse grid technique presented in this paper can be generalized to pricing other high-dimensional, early-exercisable derivatives.


Introduction
Methods for pricing a large variety of exotic options have been developed in the past decades. Still, the pricing of high-dimensional American-style moving average options remains a challenging task. The price of this type of option depends on the full path of the underlying, not only at the final exercise date but also during the whole period of exercisable times. We consider in this paper the case of an early-exercisable floating-strike moving window Asian option (MWAO) with discrete observations for the computation of the exercise value. The exercise value of the MWAO depends on a moving average of the underlying stock over a period of time.
Carriere [1] first introduces the simulation-based method for solving American-type option valuation problems. A similar but simpler method is presented by Longstaff and Schwartz [2]. Their method is known as the Least Squares Monte Carlo (LSM) method. It uses least-squares regression to determine the optimal exercise strategy. Longstaff and Schwartz also use their LSM method to price an American-Bermuda-Asian option that can be exercised on a specific set of dates after an initial lockout period. Their American-Bermuda-Asian option has an arithmetic average of stock prices as the underlying. The pricing problem can be reduced to two dimensions after introducing another variable in the partial differential equation (PDE) to represent the arithmetic average.
The dimension reduction technique of Longstaff and Schwartz [2] cannot be applied to the pricing problem of MWAOs. Since moving averages shift up and down with the underlying prices, especially when the first observation in the moving window drops out and a new one comes in, the whole history of stock prices is important in determining the optimal exercise strategy of MWAOs. This leads to an arbitrary number of dimensions and presents a computational challenge. Pricing methods for MWAOs have been described by very few authors besides Broadie and Cao [3]. Broadie and Cao price a fixed-strike MWAO, using polynomials of the underlying asset price and the arithmetic average as the regression basis functions. Bilger [4] applies the LSM method to price MWAOs. He uses a different choice of basis functions (i.e. the underlying asset S and a set of averages) for evaluating the conditional expected option value. Kao and Lyuu [5] present results for moving average-type options traded in the Taiwan market. Their method is based on the binomial tree model, and they include up to 6 discrete observations in the averaging period in their numerical examples. Bernhart et al. [6] use a truncated Laguerre series expansion to reduce the infinite-dimensional dynamics of a moving average process to a finite-dimensional approximation and then apply the LSM algorithm to price the resulting finite-dimensional moving average American-type options. Their numerical implementations can handle dimensions up to 8; beyond that, their method becomes infeasible. Dai et al. [7] use a forward shooting grid method to price European and American-style moving average barrier options. The window lengths in their numerical examples range from three or four days to two or three months.
In this paper, we apply an alternative type of basis functions, the sparse grid basis functions, to the simulation-based LSM approach for pricing American-style MWAOs. The sparse grid technique overcomes the low-dimension limit associated with full grid discretizations and achieves reasonable accuracy for approximating high-dimensional problems. Instead of using a predetermined set of basis functions in the least squares regressions, the sparse grid basis functions adapt to the data: the approach is more general and uses as much information in the moving window as possible. Using numerical examples, we demonstrate the convergence of the pricing algorithm for MWAOs for different numbers of Monte Carlo paths, different sparse grid levels and a fixed observation period of 10 days. The sparse grid is a discretization technique designed to circumvent the "curse of dimensionality" problem in standard grid-based methods for approximating a function. The idea of the sparse grid was originally discovered by Smolyak [8] and was rediscovered by Zenger [9] for PDE solutions in 1990. Since then, it has been applied to many different topics, such as integration [10,11] and the Fast Fourier Transform (FFT) [12]. Sparse grids have also been used for finite element PDE solutions by Bungartz [13], interpolation by Barthelmann et al. [14], clustering by Garcke et al. [15], and PDE option pricing by Reisinger [16].
The structure of this paper is as follows: first, we formulate the pricing problem of a moving window Asian option and explain why this problem is computationally challenging. This is followed by a brief description of the LSM approach and the sparse grid technique. Finally, we provide numerical examples for pricing MWAOs with discretely sampled observations using LSM with sparse grid basis functions.
Throughout this paper, we consider equity options on a single underlying stock in the Black-Scholes [17] framework.

Moving Window Asian Option
A MWAO is an American-style option that makes use of the moving average dynamics of stock prices. Similar to an American option, which pays the difference between the current underlying price and a fixed strike, a MWAO pays the difference between the current stock price and a floating moving average, or the difference between a floating moving average and a fixed strike.

Continuous Time Version
Before going into the details of an MWAO, we set up the process for the underlying stock.
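The equations of this section did not survive extraction legibly. In the Black-Scholes framework referenced in the introduction, the risk-neutral stock dynamics labelled (1) below are presumably the standard geometric Brownian motion, and the continuously sampled moving average over a window of length w would then read:

```latex
\mathrm{d}S_t = r\,S_t\,\mathrm{d}t + \sigma\,S_t\,\mathrm{d}W_t,
\qquad
A_t = \frac{1}{w}\int_{t-w}^{t} S_u\,\mathrm{d}u ,
```

where r is the risk-free rate, \sigma the volatility and W_t a standard Brownian motion under the risk-neutral measure. This is a reconstruction from the surrounding text, not a verbatim restoration of the original equations.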
The option value function V is subject to the following American constraint, which sets a minimum value for V: the option value must be at least the exercise payoff at each time t > t_0 + w, where t_0 denotes the setup time of the option and w denotes the fixed window length. Since the moving average A, which in turn determines the MWAO payoff in (3), depends on the whole continuum of stock prices over the window, there are infinitely many prices involved in the computation of an optimal exercise strategy. This is a challenging infinite-dimensional problem in continuous time [6,7].

Discretizations
To implement the pricing problem for the MWAO, we consider the finite-dimensional case with discretely sampled observations. Define a set of discrete observation times in the averaging window, together with a weight function that assigns a zero weight to the initial observation in the moving window and a weight of one to the rest of the observations. With this weight function we effectively use the past samples to form the moving window. The American constraint above then holds for all exercise times after an initial allowance for the window length. The dimension of this MWAO problem equals the number of discrete samples used in the averaging window.
Our method for valuing the MWAO uses the discretized stock price process and a quadrature of the moving average. The valuation proceeds backwards in time, starting at the option maturity T, where condition (7) holds with equality. We then solve for the option value at the current time. For a short window length, and thus low dimensionality, this procedure can be reformulated in a PDE setting and solved numerically. However, due to the "curse of dimensionality", the PDE method is ineffective for dimensions of more than three or four. For window lengths larger than four, the high-dimensional problem has to be solved using approximate representations or special numerical techniques.
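The discretely sampled moving average described above can be sketched as follows. The exact form of the weight function is an assumption inferred from the text (zero weight on the initial observation of the window, weight one on the remaining observations), so the helper below is illustrative rather than the paper's exact definition:

```python
import numpy as np

def moving_average(path, window, weights=None):
    """Discretely sampled moving average over the last `window` observations.

    `weights` mimics the paper's weight function: by default the initial
    observation in the window gets weight 0 and the remaining window - 1
    observations get weight 1 (an assumption based on the text).
    """
    if weights is None:
        weights = np.r_[0.0, np.ones(window - 1)]
    obs = np.asarray(path[-window:], dtype=float)
    # Weighted average over the observations currently inside the window.
    return float(np.dot(weights, obs) / weights.sum())
```

With a 10-observation window, only the last 9 samples effectively enter the average, which matches the dimension reduction by one noted later in the Implementations section.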

Numerical Procedure
The previous sections provided the mathematical formulation and discussed the discretization issues related to the MWAO. This section details the numerical methods we use for pricing MWAOs. The algorithm proposed by this paper is effectively a combination of three techniques that are well established in their respective fields: Monte Carlo simulation, least squares regression and sparse grids. Especially in quantitative finance, the sparse grid technique has not yet lived up to its full potential; this paper contributes an application of sparse grids to solving high-dimensional problems. Since all three techniques have been documented in full detail by the cited sources, we summarize in the following the main aspects of each technique. Without explicitly mentioning it, all prices in our computations are discounted prices, meaning that prices are already normalized by the bank account numeraire. We use S, P and V to denote discounted stock prices, discounted payoffs and discounted option values.

Monte Carlo Simulation
A standard method that is used when dimensionality causes numerical difficulties is Monte Carlo simulation. This method alone does not resolve our issue, but it provides the framework for our algorithm. We assume that the stock price underlying a MWAO follows the GBM process defined in (1). The discretized stock price process is sampled at the set of discrete times t_i, so that each of the realizations S_j, with j = 1, ..., s and s denoting the number of Monte Carlo paths, has a normalized representation. The price of the MWAO is the expected value of the (discounted) payoff at the optimal stopping time. The optimal stopping time provides a strategy that maximizes the option value without using any information about future stock prices.
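The discretized path simulation can be sketched as follows; parameter names are illustrative, and the exact log-normal transition is used so that there is no discretization bias at the sampling dates:

```python
import numpy as np

def simulate_gbm_paths(s0, r, sigma, dt, n_steps, n_paths, seed=0):
    """Simulate discretized GBM paths under the risk-neutral measure.

    Returns an array of shape (n_paths, n_steps + 1) with the exact
    log-normal transition between the sampling dates t_i = i * dt.
    """
    rng = np.random.default_rng(seed)
    z = rng.standard_normal((n_paths, n_steps))
    # Exact log-increments of GBM: (r - sigma^2/2) dt + sigma sqrt(dt) Z.
    log_inc = (r - 0.5 * sigma**2) * dt + sigma * np.sqrt(dt) * z
    log_paths = np.cumsum(log_inc, axis=1)
    return s0 * np.exp(np.hstack([np.zeros((n_paths, 1)), log_paths]))
```

Each row of the returned array is one realization S_j sampled at the discrete observation times; discounting by the bank account numeraire, as assumed throughout the paper, is applied separately.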

Least Squares
At each exercise time t_i, the option holder decides whether to exercise the option and receive the payoff, or to continue holding the option. In order to maximize the option value at time t_i, the holder exercises if the payoff exceeds the conditional expectation of the continuation value, where the expectation is taken under the risk-neutral measure. In the LSM approach, this conditional expectation is computed using a least squares regression on many path realizations. The regressions start at one time step before the maturity T. The regression function is a linear combination of the basis functions, and its coefficients are found by minimizing the L2-norm of the residuals, where V_i^j denotes the option value of Monte Carlo path realization S_j at time t_i. The option value V_i^j is given as the maximum between the estimated continuation value and the intrinsic value. A numerically more stable algorithm is to carry the realized cash flows backwards and use the estimated continuation value only in early-exercise decisions; this avoids the accumulation of approximation errors when stepping backwards in time.

Starting at the maturity T, a backward induction dynamic programming method solves for all values V_i^j, iterating back to time 0. Based on the values V_0^j, we compute an estimated option value, known as the in-sample price. This approach has an obvious shortcoming: each of the estimated option values contains information about its future stock path. In order to avoid this perfect foresight bias, we compute an out-of-sample option price: we generate additional simulation paths, but use the regression coefficients fitted to the old set of s simulation paths. Consequently, the out-of-sample value does not depend on knowledge of the future paths. In our implementations, we compute only the out-of-sample value, since it is the value for which we can state the optimal exercise policy without information about the future. The expected value of the out-of-sample price is always a lower bound for the true option value. Since we are confined to finitely many samples and to finite degrees of freedom in the regressions, we are not able to perfectly represent the real shape of the continuation value with our estimate. A less than optimal exercise strategy is therefore performed, which provides a lower-biased option value.
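The backward induction with the numerically stable cash-flow variant can be sketched as follows. To keep the example self-contained it prices a plain Bermudan put with a simple polynomial regression on the spot price, not the MWAO with sparse grid bases; the function and parameter names are illustrative:

```python
import numpy as np

def lsm_price(paths, payoff, r, dt, degree=3):
    """Longstaff-Schwartz backward induction (illustrative sketch).

    `paths` has shape (n_paths, n_steps + 1). The continuation value is
    regressed only on in-the-money paths, and the fitted value is used
    solely for the exercise decision: realized cash flows are carried
    backwards, the numerically stable variant described in the text.
    """
    n_paths, n_times = paths.shape
    disc = np.exp(-r * dt)
    cash = payoff(paths[:, -1])            # cash flow if held to maturity
    for i in range(n_times - 2, 0, -1):
        cash *= disc                        # discount one step back
        exercise = payoff(paths[:, i])
        itm = exercise > 0
        if itm.sum() > degree + 1:
            coeffs = np.polyfit(paths[itm, i], cash[itm], degree)
            continuation = np.polyval(coeffs, paths[itm, i])
            ex_now = exercise[itm] > continuation
            idx = np.where(itm)[0][ex_now]
            cash[idx] = exercise[idx]       # replace with exercise value
    return float(np.mean(cash) * disc)
```

Splitting the paths into a fitting set and a fresh valuation set, as the text describes, turns this in-sample estimate into the out-of-sample lower bound.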

Basis Functions
An important issue in the LSM approach is a careful choice of the basis functions in (11). We use in this paper a linear combination of sparse grid basis functions to approximate the conditional expected value involved in the optimal exercise rules. We need one dimension for each observation in the averaging window; this leads to a high dimensionality in the computational problem. The sparse grid [8] is a discretization technique designed to circumvent this "curse of dimensionality" problem. It gives a more efficient selection of basis functions. This technique has been successfully applied in the field of high-dimensional function approximation [15] and many others [10,13,16,18].
In the following we provide a brief description of the sparse grid approach. We start by constructing one-dimensional basis functions in a general case and show how to build multi-dimensional basis functions from the one-dimensional ones. Next, we create a finite set of basis functions for numerical computations. The sparse grid then efficiently combines the sets of basis functions in a way such that the resulting function set is linearly independent. Following this, we detail two specific types of sparse basis functions, a polynomial function and a piecewise linear function, to be used as the basis functions in the LSM regressions in this paper.

Constructing the Basis Functions
When approximating a function with simpler functions, or numerically extracting the shape of a function, it is common to build such a function representation using some basis functions. We consider here a set of basis functions phi_1, ..., phi_n, and we call such a set of basis functions a "function basis". From this set, we construct an approximating function f as a linear combination of the basis functions with coefficients beta_k, for k = 1, ..., n, evaluated at points x in some set X. The basis functions can be one-dimensional or multi-dimensional mappings from X to R.
For the one-dimensional case, many function bases are well known and widely used. Examples include polynomials, splines, B-splines, Bessel functions, trigonometric functions, and so on. For the multi-dimensional case, commonly used sets of basis functions are more scarce. There are two common approaches to constructing multi-dimensional basis functions from one-dimensional ones: radial basis functions [19] and tensor product functions [20]. In this paper we focus on the tensor product approach. To construct a tensor product basis function, we select m one-dimensional functions and multiply their respective function values evaluated at the corresponding components of x. Specifically, for each j = 1, ..., m, we evaluate the j-th one-dimensional function at the j-th element x_j, and multiply these values to obtain an m-dimensional basis function.
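The tensor product construction just described can be sketched in a few lines; the helper name is illustrative:

```python
import numpy as np

def tensor_basis(one_d_functions):
    """Build an m-dimensional tensor-product basis function from m
    one-dimensional functions: phi(x) = prod_j phi_j(x_j)."""
    def phi(x):
        # Evaluate the j-th one-dimensional function at the j-th component
        # of x and multiply the results.
        return float(np.prod([f(xj) for f, xj in zip(one_d_functions, x)]))
    return phi

# Example: a 2-D basis function from the monomials x -> x and x -> x**2.
phi_2d = tensor_basis([lambda x: x, lambda x: x**2])
```

Any mix of one-dimensional bases (monomials, hat functions, splines) can be combined this way, which is what the level-wise construction in the next section exploits.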

Creating a Function Basis
Having created multi-dimensional basis functions in a general case, we now select a finite set of these functions to be our basis for numerical analyses. Since we will do computations on different levels of accuracy, we will also need a set of basis functions on different levels. Naturally, the more basis functions we put in the set, the more accurate our function approximations are.
We decide that on a level L there are n(L) one-dimensional basis functions, where n is a monotonically increasing function that determines the size of our basis at level L. In order to create an m-dimensional function basis, we choose a level L_j for each dimension j = 1, ..., m. Using the tensor product function, the m-dimensional function basis is constructed from the one-dimensional bases of levels L_1, ..., L_m, where each factor is a one-dimensional basis function evaluated at the j-th element of x = (x_1, ..., x_m). It was our original goal to construct a function basis with increasing expressiveness for increasing levels. This multi-dimensional level vector is, however, not a very good starting point. Consider the case where all dimensions are equally important: we would have to use a level vector of the form (L, ..., L), and the size of the resulting function set would be extremely large, a phenomenon known as the "curse of dimensionality" problem. Consider instead a simple example with m = 2 dimensions on sparse level L* = 2. There are three possible combinations of the levels L_1 and L_2 that sum to 2: (0,2), (1,1) and (2,0). From these, we create a set as the union of the corresponding tensor function bases.
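The sparse selection of level vectors can be enumerated directly; this sketch lists all level combinations whose sum does not exceed the sparse level L*, which is the index set behind the union of tensor bases described above:

```python
from itertools import product

def sparse_levels(m, L_star):
    """All level vectors (l_1, ..., l_m) with l_1 + ... + l_m <= L_star."""
    return [l for l in product(range(L_star + 1), repeat=m)
            if sum(l) <= L_star]

# For m = 2 and L* = 2, the vectors summing to exactly 2 are
# (0, 2), (1, 1) and (2, 0), matching the example in the text.
```

The number of such vectors grows like a binomial coefficient rather than exponentially in m, which is exactly how the sparse grid sidesteps the curse of dimensionality.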
Having defined the sparse construction up to a top level, we now create two specific types of sparse basis functions that will be used as the function sets in our LSM regression problem: a polynomial sparse function basis and a piecewise linear function basis. Compared to the piecewise linear basis functions, the polynomial sparse basis functions are easier to understand and easier to implement. For higher-dimensional problems, the piecewise linear basis functions are more adaptive: they can be extended to effectively place the basis functions on the dimensions that contribute more to the problem solution. As a result, piecewise linear basis functions have seen wide applicability in solving PDEs [18] and interpolating functions [21]. In our paper, we use these two types of sparse basis functions to cross-validate the results of our high-dimensional pricing problem.

A Polynomial Sparse Basis
A polynomial function basis with n(L) basis functions in the one-dimensional case can be constructed from the monomials 1, x, x^2, and so on. From the one-dimensional function bases, we build a multi-dimensional tensor basis according to (20).
As an example, for the two-dimensional case m = 2, the first dimension in x has the one-dimensional monomial functions of its level, and equation (20) then gives the corresponding list of tensor basis functions. Building upon the two-dimensional function bases, we construct a sparse basis function set for three sparse levels according to (22). While the polynomial sparse basis functions aim for a global fit, the piecewise linear basis functions fit locally to the approximated function, and it is to the task of constructing the piecewise linear basis functions that we now turn.
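The polynomial sparse basis can be sketched as follows. The exact mapping from level to polynomial degree is not legible in the extracted text; this sketch assumes level l supplies the monomials x^0, ..., x^l, in which case the sparse union collapses to all mixed monomials of total degree at most L*:

```python
from itertools import product
from math import prod

def polynomial_sparse_basis(m, L_star):
    """Polynomial sparse basis: union of tensor products of monomial bases
    over all level vectors with |l| <= L_star.

    Assumes level l supplies the monomials x^0, ..., x^l (an assumption
    where the source is illegible); the union is then the set of mixed
    monomials of total degree <= L_star.
    """
    exponents = {e
                 for l in product(range(L_star + 1), repeat=m)
                 if sum(l) <= L_star
                 for e in product(*(range(lj + 1) for lj in l))}

    def make(e):
        # One basis function per exponent tuple: x -> prod_j x_j**e_j.
        return lambda x: prod(x[j] ** e[j] for j in range(m))

    return [make(e) for e in sorted(exponents)]
```

For m = 2 and L* = 2 this yields six basis functions, consistent with the level combinations (0,2), (1,1) and (2,0) plus the lower levels.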

Piecewise Linear Functions on [0,1]
It has been found computationally advisable to have only one basis function on the first level. Hence we start with the constant 1 on level L* = 0 and scale and translate the mother function on successive levels. This construction avoids the inclusion of costly boundary points by creating boundary functions that are scaled less than inner functions. Klimke and Wohlmuth [21] is a good reference on piecewise linear basis functions.
The piecewise linear function is a type of basis function that is commonly used in sparse grid applications. To create a piecewise linear basis for various levels, we utilize a construction approach known from multi-resolution analysis. We define a mother hat function and generate our basis by scaling and translating it: at level l, the basis functions in (26) are centered at the grid points i / 2^l, where the index i in (28) runs over the odd numbers, giving for example the grid points 1/4 and 3/4 on one level and 1/8, 3/8, 5/8 and 7/8 on the next. The one-dimensional function bases in y are generated analogously from (26). The construction of a two-dimensional sparse basis in the x and y directions for levels L* = 0, L* = 1 and L* = 2 follows from (22): the sparse basis function set at level L* = 2, for instance, is created by taking the unions of the tensor products of the one-dimensional function bases given in (27), where each tensor product set is constructed using the tensor product function (20).
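The one-dimensional construction just described can be sketched as follows; this is the standard sparse-grid hat-function family the text appears to use, with the constant function on level 0 as stated:

```python
def hat(x):
    """Mother hat function: 1 - |x| on [-1, 1], zero elsewhere."""
    return max(0.0, 1.0 - abs(x))

def phi(level, i, x):
    """Piecewise linear basis function on [0, 1] at the given level,
    centered at the grid point i / 2**level, with i running over the odd
    integers 1, 3, ..., 2**level - 1. Level 0 is the constant 1."""
    if level == 0:
        return 1.0
    return hat(2**level * x - i)
```

Each function has local support of width 2 / 2^level around its grid point, which is what makes the piecewise linear basis fit locally rather than globally.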


The two-dimensional sparse bases can be extended to higher dimensions, depending on the dimension of the problem we intend to solve. The resulting higher-dimensional function sets can then be used in the LSM regressions by selecting an optimal sparse level for that problem dimension.

Implementations
We perform the regressions required by (12) with the approximating function built from the sparse bases. In our implementations, we use sparse levels from 0 up to 3 for computing the basis functions; this is sufficient for our purposes. However, we do not perform the regressions on the stock prices directly. Instead, we use scaled values of S: for each path j, we compute scaled coordinates from the observations in the moving window. For discretely sampled observations, the effective dimension of the problem is reduced by one because of the weight function used in computing the moving averages. When extrapolating to the continuous case, the dimension of the problem approaches infinity as the number of observations m goes to infinity. The regression itself is performed by solving the least squares problem of (12) via QR decomposition. Furthermore, the regression is only performed on paths with a positive exercise value, which significantly decreases the computational effort.
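The restricted QR-based regression step can be sketched as follows. The design matrix X is assumed to hold the (scaled) basis-function values per path; the function name and interface are illustrative:

```python
import numpy as np

def regress_continuation(X, y, exercise_value):
    """Least-squares fit of the continuation value via QR decomposition,
    restricted to paths with a positive exercise value, as in the text.

    X: (n_paths, n_basis) basis-function values, y: discounted future
    option values per path. Returns the coefficients and the mask of
    in-the-money paths actually used in the fit.
    """
    itm = exercise_value > 0
    Q, R = np.linalg.qr(X[itm])          # thin QR of the restricted design
    beta = np.linalg.solve(R, Q.T @ y[itm])
    return beta, itm
```

Restricting to in-the-money paths shrinks the regression problem considerably, since the exercise decision only matters where immediate exercise pays anything at all.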

Numerical Example
We provide in this section a numerical case study of using sparse grid basis functions in LSM pricing of MWAOs. We use a discretely sampled averaging window spanning ten observations with the weight function α in (6). The properties of the MWAO are defined in Table 1. The underlying stock prices are sampled at a regular frequency, e.g. every trading day at a specific time.
To analyze the convergence of our pricing algorithm for MWAOs, we estimate the option values for increasing numbers of simulation paths and increasing sparse levels. Higher sparse levels allow for a more sophisticated exercise strategy with a better utilization of the option. After about 300,000 simulations, the option value saturates at 7.62 for both polynomial and piecewise linear basis functions. The sparse level with 241 basis functions results in similar values after 1,000,000 simulations. One thing worth mentioning is that higher sparse levels initially perform worse than lower sparse levels, due to over-fitted regression functions. The values plotted in Figure 1 are presented in Table 2 for the polynomial sparse basis functions and in Table 3 for the piecewise linear sparse basis functions.

Conclusion
This paper presents a general and convergent algorithm for pricing early-exercisable moving window Asian options. The computational difficulty of the pricing task stems from what defines the option's underlying: an either discretely or continuously sampled average over a moving window. We have applied a generalized framework to solve this high-dimensional problem by combining the least squares Monte Carlo method with sparse grid basis functions. The sparse grid technique has been specifically developed as a cure for the "curse of dimensionality" problem. It allows a more efficient selection of basis functions and can successfully approximate high-dimensional functions with less computational effort. We have used both the polynomial and the piecewise linear sparse basis functions in the least squares regressions and found that the results converge to the same values up to two decimal points, independent of the type of basis functions used. We recommend the polynomial sparse basis as the basis of choice for this type of pricing problem, since it is easier to construct and easier to implement. The approach presented in this paper can be generalized to pricing other high-dimensional early-exercisable derivatives that use a moving average as the underlying.

In our implementation, the stock prices and numeraire are first simulated in the external models "StateProcesses", then imported as processes into the exercise model "MWAOExerciseValues" to compute future early-exercise values. The early-exercise cash flows are next imported as a process into the pricing model "MWAOPrice" to determine the MWAO price.
The ThetaML future operator "!" appears in the model "MWAOPrice". It allows the possibility of using the variable "value" at a model time when its values are not yet assigned. Computationally, whenever the compiler encounters the future operator "!", it evaluates the code backwards in time. So, when computing the time-0 option price, the compiler goes from the option maturity back to time 0 and assigns the computed time-0 value to "Price".
Sparse grids were developed as an escape from the "curse of dimensionality" problem. They allow reasonably accurate approximations in high dimensions at low computational cost. The idea behind the sparse grid is to combine the tensor bases of different levels. Let the sparse basis of level L* be defined as the union of all tensor function bases whose level vectors sum to at most L*; this limits the total number of basis functions. Using (22) to unite the function sets, our sparse bases for the two-dimensional function space are the unions of the tensor bases of levels L_1 and L_2. Higher-dimensional function bases can be similarly constructed using the tensor product approach, depending on the dimension of the problem we intend to solve. In our LSM regression problems, the approximating function of (11) is, in the case of the sparse polynomial basis functions, a linear combination of the basis functions in the polynomial sparse basis set, and, for the sparse piecewise linear basis functions, a linear combination of the basis functions in the piecewise linear sparse basis set in (29).


Figure 1 presents the mean option values, each based on a series of valuations with the option data held fixed and randomly generated simulation paths.

Figure 1. The option values for the MWAO. The figure shows MWAO values estimated by least squares Monte Carlo combined respectively with polynomial sparse basis functions and with piecewise linear sparse basis functions. The sparse level ranges from 0 to 3 and the number of simulation paths runs from 100 to 1,000,000. In the figure, "polynomial basis level #" refers to the MWAO values estimated with the polynomial basis functions at sparse level #, where # is the number of the sparse level. Similarly, "piecewise basis level #" refers to MWAO values estimated with the piecewise linear basis functions at sparse level #. The data for the MWAO are specified in Table 1.

The standard deviation of the series across the I valuations is denoted by σ. For a single evaluation with LSM, σ can be seen as a measure of how close the value is to the mean of the I valuations; thus σ does not measure the error relative to the true option value. The mean estimate will be biased lower than the true value due to the suboptimal estimate of the exercise strategy. At the optimal sparse level, both polynomial and piecewise linear basis functions deliver similar MWAO prices up to two decimal points after a sufficient number of sample paths, and at the highest sparse level the MWAO prices agree up to three decimal points in both cases. Based on these results, both types of sparse basis functions solve our high-dimensional least squares problem, and the prices converge. Since the polynomial sparse basis functions are easier to construct and easier to implement, we recommend them as the basis of choice for our MWAO pricing problem. The price of our moving window Asian option has three main sources of error: the number of simulation paths, the level of the function basis and the number of integration samples. The best possible approximation would have to reduce the errors originating from these limiting parameters.

Table 3. The option value for the MWAO with the data in Table 1, estimated by least squares Monte Carlo with piecewise linear sparse basis functions.