Projection methods and the curse of

We study the ability of three different projection methods to solve high-dimensional state space problems: Galerkin, collocation, and least squares projection. The curse of dimensionality can be reduced substantially for both Least Squares and Galerkin projection methods through the use of monomial formulas. Least Squares are shown to require a good initial value in order to give an accurate solution. Alternatively, we suggest a new ad hoc collocation method for complete polynomials that is fast and easy to implement.


Introduction
In this paper, we study projection methods which have become a standard tool in the analysis of business cycle models and, in particular, asset prices.These models are highly non-linear and cannot be solved with standard linearization methods.For example, [1], Section 6.3.4 and [2] use projection methods to compute equity premia in general equilibrium models of the business cycle, while [1], building on the idea of [3], also apply this method in order to solve models with occasionally binding constraints, e.g. in the form of non-negative investment. 1 More recently, [4] applies projection methods to a heterogeneous-firm model where firms face aggregate uncertainty and investments are lumpy.
A considerable difficulty in these models arises from the "curse of dimensionality".The curse of dimensionality 2 most generally describes problems 1 [3] use perturbation rather than projection methods. 2   The term was originally coined by [5] in the context of dynamic optimization.
in the analysis of data that only arise in high-dimensional space.In the present context, we examine the phenomenon more narrowly in the consideration of numerical analysis (with an application to the solution of a stochastic general equilibrium model), in which the computational time rises exponentially with the state space dimension and constitutes a binding constraint even with current computer technology.
Three different kinds of this method are considered: Galerkin, Collocation, and Least Squares. 3 The three methods all try to find a good fit to a policy function of the state variable n x ∈  that is characterized by a parameter m φ ∈  .The policy function may take the form of a polynomial, for example, and the i φ are simply the coefficients of the polynomial.Furthermore, the policy function is approximated over the bounded interval , , , 1, , i n =  .For this policy function, we are able to compute the so-called residual function that characterizes our problem.The residual function ( ) , R x φ may take the form of a first-order condition or an equilibrium condition and we try to choose ϕ so that ( ) , R x φ is close to zero over the domain D. The three different methods considered in the following differ with regard to the projection step i.e. how we choose the criterion of making ( ) , R x φ close to zero.The least squares method solves an optimization problem by minimizing the sum of the squared residuals over the domain D, while the collocation method finds the solution ϕ by setting the residual ( ) , R x φ to zero at exactly m points i x , 1, , i m =  .Galerkin projection, like collocation projection, solves a non-linear equations problem.With this method, the residual is multiplied by the basis of the policy function and integrated over the domain D. The values of ϕ are chosen so that the integrals for the m basis functions are equal to zero.
While Galerkin and collocation projection methods have been analyzed and applied extensively to the solution of stochastic dynamic general equilibrium models, least squares methods have only been given little emphasis in the solution of these problems.For example, Galerkin and collocation projection methods have been demonstrated to be a very useful tool in the solution of the standard stochastic growth model. 4In particular, Chebyshev polynomials have been shown to provide very accurate approximation of the policy function in many examples. 5On the other hand, the literature has only paid little attention to the solution of the stochastic growth model or any other dynamic stochastic general equilibrium model with the help of Least Squares projection. 6This observation is somehow puzzling as a priori we would assume that it is easier to solve a minimization problem than a non-linear equations problem.In the former case, we are at least certain to find a solution, even if it may turn out to be only a local, but not a global minimum.
In the following, we analyze if projection methods can successfully be applied to higher-dimensional state space problems that arise naturally in the context of heterogeneous-agent economies.In order to solve problems that are characterized by a state space of dimension N, we need to reduce the number of coefficients m in the approximating policy function.Assume that we approximate the policy function by a polynomial function of degree 2. If we use the tensor product, the number of coefficients for a 4-dimensional state space is equal to 4 3 81 = .If we increase the dimension of the state space to 8, the number of coefficients is already equal to 8 3 6561 = .This exponential growth of the number of coefficients, of course, becomes a binding constraint on computational time.[10]   suggest to use complete polynomials instead.In this case, the number of coefficients only amounts to 15 and 45 in dimension 4 and 8, respectively.We, therefore, rather consider complete polynomials than tensor products.
If we choose complete polynomials for the approximation of the policy function, however, we run into problems with the standard collocation method, where we simply set the residual function equal to zero at a number of points that is equal to the number of coefficients (and solve a system of non-linear equations).How should we choose the points, e.g. the 15 points in dimension 4?
One possible solution is the Smolyak's algorithm presented by [11].The Smolyak algorithm is a device how to pick the collocation points optimally.
However, this algorithm suffers from its lack of universal applicability as it only works for certain combinations of the state space dimension and the degree of the complete polynomial in the approximating function.
In this paper, we propose three different projection techniques to address this problem that are also universally applicable.First, we consider the standard Galerkin projection method in Section 3. Second, we suggest a rather ad hoc collocation procedure in Section 4 that is found to perform rather well.Finally, we solve a least squares problem instead, i.e. we simply minimize the sum of the squared residuals.This method is presented in Section 5.The rest of the paper is organized as follows.In Section 2, the model is presented, and Section 6 concludes.

The Model
We study a simple social planner's problem in a N-country model.
where δ is the depreciation rate of capital and n t i is gross investment in country n at time t, and where n τ is the Negishi weight for country n.The world budget constraint is given by ( ) The productivity shocks are generated by the law of motion ( ) which implies that ( ) is the elasticity of substitution between capital and labor.
We will examine three kinds of utility functions.The first will be the separable utility function ( ) where γ is the intertemporal elasticity of substitution and η is the elasticity of labor supply.A special case of this will be the inelastic labor supply case: ( ) The second utility function we will use is the Cobb-Douglas specification ( ) ( ) (  ) where e L is labor time endowment.The third utility function we will use is the CES specification ( ) ( ) where χ is the elasticity of substitution between consumption and leisure.

Specification of Negishi Weights and Steady States
Each problem is specified in terms of the Negishi weights n τ .These are tied down in general equilibrium models by endowments.However, we will specify Negishi weights instead of endowments since this allows us to focus on solving multidimensional dynamic models.If we did want to specify endowments instead and compute general equilibrium prices, the Negishi method would be the most natural way to proceed; that is, we would compute allocations (and the implied prices) conditional on the Negishi weights and then use a finite-dimensional nonlinear equation solver to find the weights that implied equilibrium.Therefore, we focus on the dynamic problem for fixed Negishi weights.
We use a simple rule to pin down the Negishi weights.For each problem, the Negishi weights are to be chosen so that the steady state consumption in each country would equal its net output if the productivity shocks were eliminated.This is a sensible choice since it implies that net foreign asset income is small, a rough approximation of reality.
We will also assume initially that all countries are the same size.Specifically, we will assume that the steady state capital stock and labor supply both equal unity for all countries.The parameter choices made below are intended to produce that result.

Common Parameter Values
Some parameters will be fixed at one value for all examples.We must choose a common β in order for the solution to be stationary.We choose 0.99 β = so that the period of time is about a quarter.We also fix the values for α and δ since these reflect standard choices and their variations will not present significant computational challenges.The adjustment cost parameter φ covers an empirically relevant range.The stochastic parameters will represent high and moderate persistence, and high and low productivity shocks.Table 1 summarizes our parameter choice.

Galerkin Projection
In this section, we will first introduce you to the projection method and to the Galerkin method in more detail.Specifically, we will describe how we implemented the method for the computation of the model in section 2. Second, we will present accuracy results for the Galerkin projection.

The Numerical Method
With the help of projection methods, we want to approximate an unknown function : f X Y → , where X and Y are subsets of n  and m  , respectively: This function is implicitly defined by the functional equation where C and 2 C are given spaces of functions, e.g., the set of all continuously differentiable functions on [ ] , a b .Examples of functional equations are the Bellman equation or the Euler equation of the stochastic growth model.The functions may also take the form of equilibrium conditions, for example, such as supply equals demand.In the model of section 2, the functional equations are represented by the first-order conditions with respect to current-period consumption and labor supply and the next-period capital stock, respectively: ( ) For the true solution f, the residual function is equal to zero.In addition, we also normalize the residual function to one so that deviation from zero can be interpreted as percentage deviations.For example, the residual function that represents the first-order condition with respect to current-period consumption is formulated as follows: Suppose there is a set of test functions On a function space, this inner product induces a norm on this space and we choose the vector of parameters ϕ such that The three different solutions considered in this paper are derived for special choices of i g and w.
• The Galerkin solution chooses i i g ψ ≡ and 1 w ≡ .
• The collocation method uses the Dirac delta function as weight function, ( ) We summarize the general procedure that underlies projection methods in the following algorithm that is adapted from [7].

Algorithm (Projection Method)
Purpose: Approximate the solution f: X Y → of a functional equation F(f) = 0. Steps: Step 1: Choose a bounded state space n X ⊂  and a family of functions ( ) Step 2: Choose a degree of approximation p and let Step 3: Define the residual function: Step 4: Choose a projection function i g , a weight function w and compute the inner product: with respect to ϕ.
Step 5: Verify the quality of the candidate solution ϕ.If necessary, return to step 2 and increase the degree of approximation n or even return to step 1 and choose a different family of basis functions.
In step 1, the boundaries are found with some trial and error.In the following, we will concentrate on the symmetric case ( 1 a , we assumed that it does not deviate from its steady state level 1.0 by more than six times its unconditional standard deviation, ψ from the family of the Chebyshev polynomials.We report the results for a degree of approximation 2 p = . 7In the symmetric case, the residual function is one-dimensional.We obtain the residual function from the intertemporal first-order condition (11c) after having inserted the Equation (2)   for the accumulation of the capital stock and the world budget constraint (3).
Again, we normalized the residual function to one by dividing (11c) through the marginal utility of current-period consumption.
Step 4 deserves some discussions.In particular, we applied various devices that are non-standard in the literature in order 1) to make it faster and 2) to keep it from breaking down.The first adjustment is necessary in the presence of high-dimensional state spaces, while the second adjustment is found to be useful in dynamic stochastic equilibrium models in general.3) A third adjustment was necessary for the Galerkin and least squares projection due to the numerical accuracy of the hardware.1) Consider the projection step 4 where we compute a high-dimensional integral in the case of both Galerkin and least squares projection. 8As one possible way, one can apply Gaussian formulas to compute this integral, for example Gauss-Chebyshev integration.However, we find this method to be too time-consuming in the present application once we have a state space of dimension 8 or higher (even if we only pick three nodes in each dimension).For this reason, we rather compute the integral (13) with the help of a monomial formula as suggested by [7]. 9In particular, we use the following monomial formula for the cube which is taken from [12]: 7 For a linear approximation ( 1 p = ), the projection methods did not provide accurate solution for the case with 6 N ≥ countries.In particular, the dynamic behavior of the capital stock k became unstable when we simulated over many periods.8   With collocation, we only compute a finite number of points.9 [9] also applies monomial formula in his study of Galerkin projection methods.His integration step, however, differs from ours in the choice of the grid points and the integration formula.In addition, we also provide a new algorithm for the collocation projection that is efficient and fast. where By using this rule, 10 we reduce the number of computations of ( ) in the case of the Chebyshev integration with y nodes to 2 2 1 n n + + .For our problem, we find the Chebyshev integral and the monomial rule to coincide for the first four digits.
We also applied monomial rules to the computation of the expectation on the right-hand side of the first-order condition (11c).In order to compute the residual, we, again use a monomial formula from [12].If where ( ) 0, , 0,1, 0, , 0 i e =   denotes the i-th unit vector.In order to apply this rule to the general case of a random normal vector z with mean μ and covariance matrix Σ , we use the change of variable technique. 11The linear transformation ( ) E f z can be expressed as an integral function of x: This integral can be approximated by formula (15).Again, we find this formula to perform rather well compared to standard Gauss-Hermite integration.For the expectation in the first-order condition of (11c) in the symmetric benchmark case, the first four digits of the monomial formula integral (15) and the Gauss-Hermite integral with 4 nodes coincide.
2) In order to apply our integration routine ( 14), we, however, need to be careful.Assume we approximate a one-dimensional function ( ) As you can see from the formula ( 14), we also evaluate the function ( ) f x at the boundaries x a = and x b = . 12Now consider our problem introduced in section 2. In order to compute the integral over the squared residuals, we have to evaluate the residual for a capital stock k in period t at the boundaries of the approximation interval [ ] min max , k k .In the likely case that our initial guess is not 10 In order to apply the monomial formula (17), we need to transform the variables into the canonical form.This procedure is described in [7] Ch. 7.5.very close to the true solution, we might choose a next-period capital stock k′ in 1 t + that is outside the approximation interval [ ] min max , k k . We, however, also need to compute k′′ in period 2 t + in order to compute with the help of the world budget constraint (3).Therefore, we might have to compute an approximation of the policy function that lies outside the approximation interval D. Outside D, however, the approximation accuracy might deteriorate quickly so that our computation might break down because of a variable overflow or a negative value for 1 t c + .Indeed, we encountered this problem in almost all our applications.In order to circumvent the problem, we suggest the following two procedures that we also apply for the other two projections methods below.a) First choose an interval D for the approximation of the policy function.Next choose a subset 1 D D ⊂ and compute the sum of projected residuals (squared residuals) over this smaller interval.In all our applications, this device worked rather well.Typically, we choose the length of the edges of 1 D to be approximately 50% -70% of the corresponding ones in the set D.
b) Routines that solve non-linear equations, as the modified Newton-Raphson method with line search, require a good starting value.In our applications, we started with the solution from the log-linear model for the computation of the Chebyshev approximation with a degree 1 p = .For a degree 1 p > , we simply used the solution for the approximation with the degree 1 p − .Even with this values, however, the Newton-Rhapson algorithm was trying values for the vector ϕ where it is impossible to evaluate the residual function.For this reason, the routine that evaluates the residual function must return an error flag that signalizes the calling program to stop.Otherwise, the program will crash because of overflows, underflows, or other run-time errors arising from undefined numerical operations.Yet, standard software usually assumes that it is possible to evaluate a given non-linear system everywhere and there is no way to tell the program to do otherwise.Therefore, we wrote our own non-linear equations solver where the step size in the Newton-Rhapson algorithm is reduced as long as the evaluation of the residual function returns an error flag. 13 In the cases of the Galerkin and the least squares projection, we also find a third device to be very helpful.Typically in the computation of stochastic dynamic general equilibrium models, we would like to find a solution with residuals that are small and below 10 −4 or even 10 −5 .Therefore, the value of the squared residual is of the typical order 10 −8 or even less.If we integrate the projections of the (squares) residuals over the n-dimensional interval

[ ] [
] [ ] − are small, we end up computing an integral value that is close to the accuracy of our numerical software (we used FORTRAN 95 with double precision data type).Indeed, this is the case for our state space where the typical integral (13) amounts to less than 10 −16 in the symmetric 4-country benchmark case, for example.We, therefore, adjusted the scale of our sum of squared residual in (13) and normalized it to one by dividing the integral by the volume of the interval 1 D .In this regard, our procedure is equivalent to the standard OLS approach where the total weight of all squared residuals is set equal to one.

Accuracy Check
In order to evaluate the performance of the projection methods, we use two different tests: • Accuracy test 1: We compute a sample of 100 points i x at radius r from the deterministic steady state.We compute the absolute value of the residual ( ) i R x for each point in the sample.The maximum of these values for each radius r are reported in Table 2. 14 • Accuracy test 2: We simulate our solution to produce a sequence of values for the state, { } 1  , we compute the maximum and the mean of the (absolute) residual. 15ble 2 presents the results for the Galerkin projection.Naturally, the accuracy declines with increasing radius from the steady state in the accuracy test 1 (please see the first two columns of the table).Remember that we only approximated the policy function inside the radius 0.10 r = . For this region (which is also the relevant region during the simulation), the maximum deviation of the residual function deviates amounts to only 0.20%.The good performance of the algorithm is also reflected in the values of the accuracy test 2.
Even after 100 simulation, the maximum absolute value of the residual did not This is the style of error used in [6].15   This style of error was used in [13].
exceed 0.0014% and the mean was only 0.00045%. 16e runtime for the modified Newton- ).If we increase the number of countries to 6 or 8, we run into the limits for the computational time on our Pentium III, 846 MHz computer.As can be seen from Table 3, the runtime increases quickly to multiple days.Therefore, for the computer technology at our current disposal, 16 state variables (corresponding to 8 N = ) has been found to be the limit.

Collocation Projection
In the collocation projection, we use the Dirac delta function as projection function with a weight identical to one.Therefore, we only compute the projection at single points and circumvent the computation of integrals (except for the computation of the expectation).Of course, this will save a lot of computer time if the state space is large.Instead, the task is to solve the non-linear equation system This particular projection method is called Chebyshev collocation.
The procedure is easy to apply in small dimensional space where we can use the tensor product of the basis functions in order to approximate the policy functions.Let d denote the dimension of the state space ( The accuracy decreases with increasing number of the state space dimension.For a higher number of shocks i t e , the likelihood for a larger deviation from the steady-state values increases and we have to pick a wider interval for the approximation of the policy functions.Naturally, the goodness of fit decreases.

4-country model with
4 N = ).In this case, we simply take the ( ) , where In the following, we suggest a very easy-to implement method for high-dimensional state spaces.Assume that the number of coefficients is equal to m.Just pick any m Chebyshev nodes and compute the solution.In very few cases, we find that this procedure does not converge and the modified Gauss-Newton algorithms fails.In this case, restart the program for another random pick of the Chebyshev nodes.The accuracy of this procedure is reported in Table 4 for the 4-country case. 18Notice that the accuracy, at least for the relevant range of the state space, is less than in the case of Galerkin projection.
However, the gain in speed is tremendous, especially if we consider the case for 8 countries with a corresponding number of 16 state variables.While Galerkin takes several days, our ad hoc collocation algorithm is able to compute a very accurate solution with a mean deviation of 0.016% within minutes. 19 summary, the proposed collocation procedure is less accurate than Galerkin.However, for the researcher who is time constrained, it may provide a valuable alternative.In addition, the solution from the Collocation projection can be used as an initial guess for the Galerkin method.By this procedure, the number of iterations over ϕ in the Newton Raphson algorithm can be reduced to  We have also found this result to hold in other problems.In [1], Section 6.3.4 for example, we consider an equity-premium model with 3 states.The proposed ad hoc collocation procedure is found to perform equally well.DOI: 10.4236/jmf.2018.82021330 Journal of Mathematical Finance one or two and the computational time declines significantly.This savings in time comes at almost no costs as the additional programming is very little.

Least Squares
The least squares method minimizes the (weighted) sum of squared residuals: ( ) One possible way to implement this method is to simply compute the sum of squared residuals over an n-dimensional grid of the state space.As a natural choice of the grid points in each dimension of the state space, we pick the Chebyshev nodes (Chebyshev uses constant weights).As, however, the state space dimension increases, we rather use monomial formulas to compute the integral (19).For this procedure, our results are discouraging if we use the log-linear solution as the initial guess.Typically, the mean residuals in our simulation of the economy are of the magnitude 10 −1 , which, of course, is not satisfactory.Acoordingly, as the first important result of our analysis, we find that least squares do not work well if the initial solution is not close to the true solution.As the nature of the model in Section 2 is rather complex and number of parameters for the policy coefficients is rather large, the minimization problem (19) displays many local minima. 20 could only conceive a least squares algorithm with reasonably accurate results when we used the solution from our ad hoc collocation procedure as an initial guess for ϕ.The performance of the algorithm in this case is summarized in Table 5.As you can see, the accuracy is much higher than in the case of collocation projection and is comparable to the one with Galerkin projection.
However, the computational time is much higher than with Galerkin as the We also applied a genetic search algorithm in order to provide a more sophisticated guess for ϕ.In particular, we used a simple and fast genetic search algorithm based on [14] that is described in detail in [1], Chapter 8.However, results did not improve much and computational time became a limit for N = 6 already.
Together with R they define an inner product given by

.
The labor supply is fixed, 1 l ≡ , and the steady state capital stock is normalized to one, 1 k = .Furthermore, we use a Cobb-Douglas production function ( 0 µ = ), and the utility function (benchmark case, we approximated the policy function for the capital stock on the sets [ ] 0.90,1.10for the individual capital stocks.For the technology level i t lower.If the parameters of the problem are chosen such that the lengths of the intervals i i b a

1 p
But at which set of points jx should the residual function equal zero?It is well known from the so called Chebyshev interpolation theorem17 that the Chebyshev zeros minimize the maximum interpolation error.For this reason, one should use the Chebyshev nodes of the Chebyshev polynomial of order + if we approximate the function by a Chebyshev polynomial of order p.

Table 2 .
Galerkin projection.Notes: The runtime for the modified Newton-Rhapson algorithm with line search on a Pentium III, 846 MHz, amounted to 4 minutes 49 seconds and the program needed 8 iterations over ϕ.
Rhapson algorithm with line search on a Pentium III, 846 MHz, amounted to 4 minutes 49 seconds and the program needed 8 iterations over ϕ.Our algorithm, therefore, is reasonably fast and
18Table 3 reports the maximum values from 10 runs of the collocation algorithm.

Table 5 .
Least Squares projection.Notes: The runtime for the Quasi-Newton algorithm with line search on a Pentium III, 846 MHz, amounted to 11 minutes 4 seconds and the program needed 40 iterations over ϕ.