
Predicting stock price movement is widely accepted to be challenging, and new approaches continue to be proposed. This paper addresses the problem of stock price movement using continuous time models. Specifically, it provides a comparative analysis of two continuous time models, Geometric Brownian Motion (GBM) and Variance Gamma (VG), in predicting the direction and level of stock prices using two Monte Carlo methods, Quasi Monte Carlo (QMC) and Least Squares Monte Carlo (LSMC). The hit ratio and mean-absolute percentage error (MAPE) were used to evaluate the models. The empirical tests suggest that either the GBM model or the VG model can be used with either Monte Carlo method to predict the direction of stock price movement. In terms of predicting stock price values, the empirical findings suggest that the GBM model performs well with the QMC method and the VG model performs well with the LSMC method.

In this paper we deal with the long-standing problem of predicting stock price movement (increase or decrease). Several methods have been proposed, and they have predicted stock price movement with varying degrees of accuracy, so the problem remains an active area of research. A multitude of economic, political, social and psychological factors interact in a complex way to influence stock price movement, and there is no doubt that predicting it is challenging. This paper attempts to predict stock price movement using continuous time models. We believe that continuous time models are suitable for capturing, to a certain extent, the unpredictable dynamics of stock prices.

Models for predicting stock price movement have several uses for researchers and practitioners alike, including optimal portfolio construction and the execution of well-informed buy and sell orders. Models of stock price movement are also widely used in derivatives pricing, to determine the fair values of derivatives, and in simulation models for risk management purposes.

Most studies have focused on the accurate estimation of the value of stock price. In most cases, the accuracy of the estimations is measured by the error between the estimates and observed values. However, different investors use diverse trading strategies and strategies based on minimizing the error between observed values and the estimates may not appeal to them [

In the literature, there exist a vast number of articles addressing the accurate estimation of the value of stock price. However, to the best of our knowledge, there are few studies which look at the predictability of the direction of stock price. In this respect, we cite studies by [

We follow the approach of [

[

The assumptions on which the continuous time models are based are consistent with the weak form of the Efficient Market Hypothesis. The weak form of the hypothesis guarantees transparency in the sense that everyone has the same information about a stock. According to the hypothesis, the only information about a stock that is relevant for determining future stock price movement is its current value.

The rest of the paper is organized as follows: the Monte Carlo techniques used for simulating stock price processes are discussed in the next section. In Section 3 we discuss the dynamics and parameter estimation of the models used in this work. In Section 4 we discuss the design of the experiments and report the performance of the models. Section 5 concludes with a summary of our findings.

We look at the techniques for simulating the stock price processes encountered in this work. Monte Carlo methods lend themselves naturally to this task, as they are useful for numerically estimating the values of integrals, especially in high dimensions. The simulation procedures used here are found in [

The integral of a Lebesgue-integrable function

The Monte Carlo quadrature formula is based on the probabilistic interpretation of the integral. Now, consider a sequence

According to the Strong Law of Large Numbers, the approximation converges to the true value of the integral:

This means that

In crude form, Monte Carlo simulations are computationally inefficient: a large number of simulations is generally required to achieve a high degree of accuracy. However, the efficiency can be improved by using other methods such as variance reduction, the quasi-Monte Carlo method or the least-squares regression Monte Carlo method [
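As a hedged illustration of the crude Monte Carlo estimate discussed above, the following Python sketch averages an integrand over uniformly distributed points in the unit hypercube and reports the standard error of the estimate, which decays at the rate $1/\sqrt{N}$. The function names and the test integrand are assumptions chosen for illustration; they are not taken from the paper.

```python
import math
import random


def crude_monte_carlo(f, dim, n_samples, seed=0):
    """Estimate the integral of f over the unit hypercube [0, 1)^dim.

    Returns the sample mean and its estimated standard error.
    """
    rng = random.Random(seed)
    total = 0.0
    total_sq = 0.0
    for _ in range(n_samples):
        x = [rng.random() for _ in range(dim)]  # one uniform random point
        value = f(x)
        total += value
        total_sq += value * value
    mean = total / n_samples
    variance = max(total_sq / n_samples - mean * mean, 0.0)
    return mean, math.sqrt(variance / n_samples)


def integrand(x):
    # Hypothetical test integrand: the product of the coordinates,
    # whose exact integral over the unit hypercube is 0.5 ** dim.
    return math.prod(x)


if __name__ == "__main__":
    for n in (100, 10_000, 100_000):
        estimate, std_error = crude_monte_carlo(integrand, dim=3, n_samples=n)
        print(n, estimate, std_error)
```

Increasing the number of samples by a factor of 100 reduces the standard error only by a factor of about 10, which is the inefficiency that motivates the alternative methods.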

The quasi-Monte Carlo (QMC) method, also called the low-discrepancy method, can be described in simple terms as a deterministic counterpart of the crude Monte Carlo method. The random samples of the Monte Carlo method are replaced by well-chosen deterministic points. Quasi-Monte Carlo thus makes no attempt to mimic randomness. Rather, it generates sample points that are too evenly distributed to be random, with the aim of increasing accuracy [

Suppose for the unit cube integration domain

which formally looks like the crude Monte Carlo estimate but is now used with the deterministic points

It is intuitively clear that:

and that the points x_{k} are chosen so as to fill the hyper-cube uniformly, and achieve a maximal degree of uniformity and a low degree of discrepancy. The discrepancy is a measure of the “level of uniformity” or more exactly the deviation from uniformity. It is defined as:

where

we define the star discrepancy

In this work, we use Halton sequences. Halton sequences are d-dimensional sequences with values in the unit hypercube, where the k^{th} coordinate is generated using the k^{th} prime number as the base.
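The construction of a Halton sequence can be sketched as follows. This is a minimal illustration under the standard definition (the radical inverse of the point index in successive prime bases); the function names are assumptions and the paper's own implementation may differ, for example by skipping initial points or scrambling digits.

```python
def radical_inverse(index, base):
    """Van der Corput radical inverse of a positive integer in the given base."""
    result = 0.0
    fraction = 1.0 / base
    while index > 0:
        index, digit = divmod(index, base)
        result += digit * fraction  # mirror the digit about the radix point
        fraction /= base
    return result


def first_primes(count):
    """Return the first `count` prime numbers by trial division."""
    primes = []
    candidate = 2
    while len(primes) < count:
        if all(candidate % p for p in primes):
            primes.append(candidate)
        candidate += 1
    return primes


def halton(n_points, dim):
    """First n_points of the dim-dimensional Halton sequence (index starts at 1)."""
    bases = first_primes(dim)
    return [[radical_inverse(i, b) for b in bases] for i in range(1, n_points + 1)]


def qmc_estimate(f, dim, n_points):
    """Quasi-Monte Carlo estimate of the integral of f over the unit hypercube."""
    return sum(f(x) for x in halton(n_points, dim)) / n_points


if __name__ == "__main__":
    print(halton(4, 2))
    print(qmc_estimate(lambda x: x[0] * x[1] * x[2], 3, 4096))
```

The estimator has the same form as the crude Monte Carlo average; only the source of the points changes.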

Consider a reward function that depends on both

where r is the discount factor. Suppose

where we let

where

This is a typical optimal stopping problem whereby an investor has to decide the right time or optimal time to sell a stock in order to maximize the expected reward. In particular, we are interested in the optimal expected reward at an optimal stopping time. The findings from a study by [

To tackle this stochastic control problem, we use Dynamic Programming. The main idea originated from [

It is optimal to stop at time n if and only if

To continue with calculations, we need to estimate the conditional expectation term, that is, the continuation value

We generate N independent sample paths of the process X, denoted by

Ordinary Least Squares (OLS) regression is performed to find the parameter vector

Once we have determined the vector

For the basis regression functions, we choose the weighted Laguerre polynomials suggested by [

where
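The backward induction with regression described in this section can be sketched as below. This is a hedged, simplified illustration in the style of the Longstaff and Schwartz algorithm, not the authors' implementation: the path generator, the put-style reward used in the usage example, the restriction of the regression to paths with positive reward, and the use of only the first three weighted Laguerre polynomials (without a constant term) are all assumptions made to keep the example short and self-contained.

```python
import math
import random


def weighted_laguerre(x):
    """First three weighted Laguerre polynomials evaluated at x."""
    w = math.exp(-0.5 * x)
    return [w, w * (1.0 - x), w * (1.0 - 2.0 * x + 0.5 * x * x)]


def solve_linear(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda i: abs(m[i][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for i in range(col + 1, n):
            factor = m[i][col] / m[col][col]
            for c in range(col, n + 1):
                m[i][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def ols(features, targets):
    """Ordinary least squares coefficients via the normal equations."""
    k = len(features[0])
    xtx = [[sum(f[i] * f[j] for f in features) for j in range(k)] for i in range(k)]
    xty = [sum(f[i] * y for f, y in zip(features, targets)) for i in range(k)]
    return solve_linear(xtx, xty)


def simulate_paths(s0, r, sigma, dt, n_steps, n_paths, rng):
    """Hypothetical GBM path generator, used only to feed the sketch."""
    drift, vol = (r - 0.5 * sigma ** 2) * dt, sigma * math.sqrt(dt)
    paths = []
    for _ in range(n_paths):
        path = [s0]
        for _ in range(n_steps):
            path.append(path[-1] * math.exp(drift + vol * rng.gauss(0.0, 1.0)))
        paths.append(path)
    return paths


def lsmc_value(paths, reward, r, dt, scale):
    """Estimate the time-0 value of the stopping problem by backward induction."""
    n_steps = len(paths[0]) - 1
    discount = math.exp(-r * dt)
    cash = [reward(p[-1]) for p in paths]  # reward if stopped at the final date
    for n in range(n_steps - 1, 0, -1):
        cash = [discount * c for c in cash]  # value of continuing, seen from time n
        idx = [i for i, p in enumerate(paths) if reward(p[n]) > 0.0]
        if len(idx) < 3:
            continue
        feats = [weighted_laguerre(paths[i][n] / scale) for i in idx]
        beta = ols(feats, [cash[i] for i in idx])
        for i, f in zip(idx, feats):
            continuation = sum(b * v for b, v in zip(beta, f))
            if reward(paths[i][n]) >= continuation:  # stopping beats continuing
                cash[i] = reward(paths[i][n])
    return max(reward(paths[0][0]), discount * sum(cash) / len(cash))


if __name__ == "__main__":
    rng = random.Random(1)
    paths = simulate_paths(36.0, 0.06, 0.2, 1.0 / 20, 20, 2000, rng)
    print(lsmc_value(paths, lambda s: max(40.0 - s, 0.0), 0.06, 1.0 / 20, 40.0))
```

Dividing the state by `scale` before evaluating the basis functions keeps the exponential weight from underflowing for large price levels; this rescaling is a design choice of the sketch.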

In this section, we discuss the dynamics of the models used in this work. The price of a particular stock at a future time t is usually unknown at the present time. Thus, we think of the stock price as a random variable. The stock price process is denoted by

The Geometric Brownian Motion (GBM) stock price dynamics under the risk neutral measure are given by:

where r is the risk-free rate and

Integrating the above stochastic differential equation over the time interval

This gives us the recursive expression for

Assuming an equidistant grid, let
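A recursive simulation of the GBM price on an equidistant grid can be sketched as follows. The sketch assumes the standard exact solution of the GBM stochastic differential equation, in which each step multiplies the price by $\exp\left((r - \sigma^2/2)\Delta t + \sigma\sqrt{\Delta t}\,Z\right)$ with $Z$ standard normal; the function name and the parameter values in the usage example are hypothetical.

```python
import math
import random


def simulate_gbm_path(s0, r, sigma, horizon, n_steps, rng):
    """Simulate one GBM price path on an equidistant grid of n_steps intervals."""
    dt = horizon / n_steps
    drift = (r - 0.5 * sigma ** 2) * dt
    vol = sigma * math.sqrt(dt)
    path = [s0]
    for _ in range(n_steps):
        z = rng.gauss(0.0, 1.0)  # standard normal increment
        path.append(path[-1] * math.exp(drift + vol * z))
    return path


if __name__ == "__main__":
    rng = random.Random(42)
    # Hypothetical example: one week ahead in five daily steps.
    print(simulate_gbm_path(100.0, 0.05, 0.2, 1.0 / 52, 5, rng))
```

Because the scheme uses the exact solution rather than an Euler discretization, the step size does not introduce bias in the simulated prices.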

The Variance Gamma (VG) process falls under the class of infinite activity pure jump models. In pure jump models, one does not have to introduce a diffusion component since the dynamics of the jumps are already rich enough to generate non-trivial small time behavior. Models of this type can be constructed by Brownian subordination. Subordinating Brownian motion with drift

The subordinating process under the Variance Gamma model is the Gamma process. The Gamma process, like the Poisson process, is a pure jump process with no diffusion component. Jumps of negligible size arrive infinitely often in the Variance Gamma model, and the infinite activity allows the model to behave like a diffusion process for small jumps. Jumps of non-negligible size occur with a finite frequency, and the arrival rate of these jumps decreases monotonically with the jump size.

So, the Variance Gamma process uses a Gamma process to time change a Brownian motion. Instead of evaluating a Brownian motion at time t, rather it is evaluated at time

The dynamics of the stock price under the risk-neutral measure are:

where

We can write the dynamics in Equation (25) as:

Consider the dynamics over the time interval
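The time-changed Brownian motion construction can be sketched in code as follows. The sketch assumes the common parameterization of the VG process with drift `theta`, volatility `sigma` and variance rate `nu` of the Gamma time change, together with the usual martingale correction `omega`; these symbol names are assumptions and may not match the paper's notation.

```python
import math
import random


def simulate_vg_path(s0, r, sigma, theta, nu, horizon, n_steps, rng):
    """Simulate one VG price path by evaluating Brownian motion at Gamma time."""
    dt = horizon / n_steps
    # Martingale correction; requires 1 - theta*nu - 0.5*sigma^2*nu > 0.
    omega = math.log(1.0 - theta * nu - 0.5 * sigma ** 2 * nu) / nu
    path = [s0]
    for _ in range(n_steps):
        # Gamma time increment with mean dt and variance nu * dt.
        g = rng.gammavariate(dt / nu, nu)
        z = rng.gauss(0.0, 1.0)
        x = theta * g + sigma * math.sqrt(g) * z  # VG increment
        path.append(path[-1] * math.exp((r + omega) * dt + x))
    return path


if __name__ == "__main__":
    rng = random.Random(3)
    # Hypothetical parameter values, one week ahead in five steps.
    print(simulate_vg_path(100.0, 0.05, 0.2, -0.1, 0.2, 1.0 / 52, 5, rng))
```

With the correction term, the expected simulated price after a horizon $T$ is $S_0 e^{rT}$, as in the GBM case.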

The parameters of the Geometric Brownian Motion model (with parameter

where

It is often simpler to maximize the log-likelihood function:

[

However, solving the likelihood equations is often non-trivial. We had to rely on global optimization algorithms to obtain the maximum likelihood estimates. It should be noted that the solutions yielded by these algorithms were sensitive to the choice of initial values because the log-likelihood function may have several local optima.
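For the GBM model the maximum likelihood estimates are available in closed form, which gives a simple illustration of the procedure. The sketch below assumes that log-returns over a step $\Delta t$ are independent and normally distributed with mean $(\mu - \sigma^2/2)\Delta t$ and variance $\sigma^2 \Delta t$; the drift symbol `mu`, the function names and the synthetic data generator are assumptions made for the example. The VG likelihood has no such closed-form maximizer and is not covered here.

```python
import math
import random


def gbm_log_likelihood(log_returns, mu, sigma, dt):
    """Gaussian log-likelihood of the log-returns under the GBM model."""
    mean = (mu - 0.5 * sigma ** 2) * dt
    var = sigma ** 2 * dt
    n = len(log_returns)
    sq = sum((x - mean) ** 2 for x in log_returns)
    return -0.5 * n * math.log(2.0 * math.pi * var) - sq / (2.0 * var)


def gbm_mle(prices, dt):
    """Closed-form maximum likelihood estimates (mu, sigma) from a price series."""
    log_returns = [math.log(b / a) for a, b in zip(prices, prices[1:])]
    n = len(log_returns)
    mean = sum(log_returns) / n
    var = sum((x - mean) ** 2 for x in log_returns) / n  # MLE divides by n
    sigma = math.sqrt(var / dt)
    mu = mean / dt + 0.5 * sigma ** 2
    return mu, sigma


def synthetic_prices(n, mu, sigma, dt, seed):
    """Hypothetical data generator used only to exercise the estimator."""
    rng = random.Random(seed)
    prices = [100.0]
    for _ in range(n):
        step = (mu - 0.5 * sigma ** 2) * dt + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        prices.append(prices[-1] * math.exp(step))
    return prices


if __name__ == "__main__":
    prices = synthetic_prices(5000, 0.1, 0.3, 1.0 / 52, seed=7)
    print(gbm_mle(prices, 1.0 / 52))
```

The volatility is typically recovered far more precisely than the drift, since the drift estimate depends only on the first and last prices of the sample.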

We selected 19 stocks from the ALSI40 (JSE Top 40 Index) for the purpose of this work, which are representative of the different industry sector categories. The share code, share name and industry sector categories are given in

Continuous time models provide better predictions when used to model stock prices over longer periods of time rather than short periods of time. As a result, we try to predict the stock price movement on a weekly basis.

The predictions were made at the start of the first business day of the week.

The models were tested for their predictive capabilities both when the market was generally trending downwards and when it was generally trending upwards. For the former, the sample data set runs from May 2008 to April 2009; for the latter, it runs from September 2010 to August 2011. Each sample data set spans approximately one year. About two thirds of the observations were used for In-Sample predictions and the remainder were used for one-step-ahead forecasts, which we consider as our Out-Sample predictions. The statistical tests were performed at the 5% significance level.

Share Code | Share Name | Trading Sector |
---|---|---|
AGL | Anglo American Plc | General Mining |
AMS | Anglo American Platinum Plc | Platinum and Precious Minerals |
ANG | Anglo Gold Ashanti Ltd | Gold Mining |
APN | Aspen Pharmacare Holding Ltd | Pharmaceuticals |
BGA | Barclays Africa Group Ltd | Banking |
BIL | BHP Billiton Plc | General Mining |
BVT | Bidvest Group Ltd | Business Support Services |
GRT | Growthpoint Property Ltd | Real Estate Investment and Holdings |
IMP | Impala Platinum Holdings Ltd | Platinum and Precious Minerals |
INL | Investec Ltd | Investment Services |
IPL | Imperial Holdings Ltd | Transportation Services |
MTN | MTN Group Ltd | Mobile Telecommunications |
NPN | Naspers Ltd | Media |
OML | Old Mutual Plc | Life Insurance |
SAB | SABMiller Plc | Beverages |
SBK | Standard Bank Group Ltd | Banking |
SHP | Shoprite Holdings Ltd | Food Retailers and Wholesalers |
SOL | Sasol Ltd | Oil and Gas Producers |
WHL | Woolworths Holdings Ltd | Clothes Retailers |

Now, we evaluate the performance of the models discussed in the previous sections. From the numerous choices of performance evaluation metrics, we choose to use the mean-absolute percentage error (MAPE) and hit ratio to evaluate the performance of the models. The mean-absolute percentage error (MAPE) measures the magnitude of error from the observed prices in percentage terms. The formula for calculating MAPE is:

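$$\mathrm{MAPE} = \frac{100\%}{n} \sum_{i=1}^{n} \left| \frac{A_i - P_i}{A_i} \right|$$

where $A_i$ is the $i^{th}$ actual observation (price), $P_i$ is the corresponding predicted price and $n$ is the number of predictions.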
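The MAPE computation can be sketched in a few lines of Python. The function name and the sample prices are hypothetical; the calculation follows the verbal definition above, averaging the absolute errors expressed as a percentage of the observed prices.

```python
def mape(actual, predicted):
    """Mean-absolute percentage error between observed and predicted prices."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(errors) / len(errors)


if __name__ == "__main__":
    # Hypothetical prices: each prediction is off by 10% of the observed value.
    print(mape([100.0, 200.0], [110.0, 180.0]))
```

Because the error is scaled by the observed price, the measure is comparable across stocks that trade at very different price levels.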

The hit ratio is the percentage of predictions for which the predicted direction of the price change is correct. It is determined by the number of correct signs (correct price direction) divided by the total number of predictions. The hit ratio is given as:
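A minimal sketch of the hit ratio computation follows. It assumes that the direction of each prediction is measured relative to the last observed price before the forecast, and that a zero change in either series counts as a miss; both conventions are assumptions of this example, as are the function name and the sample data.

```python
def hit_ratio(previous, actual, predicted):
    """Percentage of predictions whose direction of change matches the observed one."""
    if not (len(previous) == len(actual) == len(predicted)) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(
        1
        for prev, act, pred in zip(previous, actual, predicted)
        if (act - prev) * (pred - prev) > 0  # same sign of change
    )
    return 100.0 * hits / len(actual)


if __name__ == "__main__":
    previous = [100.0, 100.0, 100.0, 100.0]
    actual = [101.0, 99.0, 102.0, 98.0]
    predicted = [102.0, 101.0, 101.0, 97.0]
    print(hit_ratio(previous, actual, predicted))
```

In the usage example three of the four predicted directions agree with the observed ones.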

For each performance evaluation metric, we make a comparative analysis of the models (GBM and VG) under each simulation method (QMC or LSMC). Furthermore, for each performance evaluation metric, we make a comparative analysis of the simulation methods (QMC and LSMC) for each type of model (GBM or VG). The following are the findings of the evaluations for the two time periods selected.

For the Out-Sample, the average hit ratio for the GBM model is 51.64% and for the VG model it is 51.97%. The All Periods average hit ratios for the GBM model and VG model are 54.25% and 54.55%, respectively. The t-tests showed no significant differences in the hit ratios between the two models for any of the testing periods (In-Sample, Out-Sample, All Periods).
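The kind of comparison reported in the tables can be illustrated with a two-sample t-statistic. The text does not state which variant of the t-test was used, so the pooled-variance (equal variance) form below is an assumption, and the sample values in the usage example are made up for illustration.

```python
import math
from statistics import mean, variance


def pooled_t_statistic(sample_a, sample_b):
    """Two-sample t-statistic with pooled variance, and its degrees of freedom."""
    na, nb = len(sample_a), len(sample_b)
    dof = na + nb - 2
    pooled_var = ((na - 1) * variance(sample_a) + (nb - 1) * variance(sample_b)) / dof
    std_error = math.sqrt(pooled_var * (1.0 / na + 1.0 / nb))
    return (mean(sample_a) - mean(sample_b)) / std_error, dof


if __name__ == "__main__":
    # Hypothetical per-stock hit ratios (in percent) for two models.
    model_a = [55.0, 52.0, 57.0, 54.0, 51.0]
    model_b = [54.0, 53.0, 56.0, 55.0, 50.0]
    print(pooled_t_statistic(model_a, model_b))
```

Turning the statistic into a p-value requires the cumulative distribution function of Student's t distribution, which the Python standard library does not provide; a statistics package such as SciPy (`scipy.stats.ttest_ind`) could be used for that step.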

In addition, to assess which simulation method is better for hit ratios we made a comparative analysis of each type of model (GBM model or VG model) under each type of simulation method.

Period | Mean QMC | Mean LSMC | t-Value | p-Value |
---|---|---|---|---|
In-Sample | 55.70% | 52.05% | 1.45 | 0.16 |
Out-Sample | 51.97% | 49.01% | 0.80 | 0.43 |
All Periods | 54.25% | 50.81% | 1.90 | 0.06 |

Period | Mean GBM | Mean VG | t-Value | p-Value |
---|---|---|---|---|
In-Sample | 55.41% | 55.70% | −0.13 | 0.90 |
Out-Sample | 51.64% | 51.97% | −0.11 | 0.92 |
All Periods | 54.25% | 54.55% | −0.17 | 0.87 |

Period | Mean GBM | Mean VG | t-Value | p-Value |
---|---|---|---|---|
In-Sample | 51.02% | 52.05% | −0.43 | 0.67 |
Out-Sample | 50.33% | 49.01% | 0.33 | 0.74 |
All Periods | 50.81% | 51.11% | −0.15 | 0.88 |

Period | Mean QMC | Mean LSMC | t-Value | p-Value |
---|---|---|---|---|
In-Sample | 55.41% | 51.02% | 2.00 | 0.05 |
Out-Sample | 51.64% | 50.33% | 0.38 | 0.71 |
All Periods | 54.25% | 50.81% | 1.90 | 0.06 |

Next, we give a comparative analysis for the downward trend using the MAPE as the performance measure.

Under the quasi-Monte Carlo method, the MAPEs for the VG model are greater than those of the GBM model for all the testing periods. The statistical tests confirm no significant differences between the two models in predicting stock price values.

Again, we test to assess which simulation method is better at predicting stock price values.

Period | Mean QMC | Mean LSMC | t-Value | p-Value |
---|---|---|---|---|
In-Sample | 13.70% | 26.41% | −14.03 | 3.87E−12 |
Out-Sample | 14.61% | 25.72% | −7.40 | 3.78E−07 |
All Periods | 13.98% | 26.29% | −16.73 | 3.17E−13 |

Period | Mean GBM | Mean VG | t-Value | p-Value |
---|---|---|---|---|
In-Sample | 13.70% | 20.59% | −16.59 | 7.23E−18 |
Out-Sample | 14.61% | 21.15% | −8.89 | 8.79E−10 |
All Periods | 13.98% | 20.75% | −18.64 | 6.05E−17 |

Period | Mean GBM | Mean VG | t-Value | p-Value |
---|---|---|---|---|
In-Sample | 26.41% | 7.55% | 18.49 | 5.98E−18 |
Out-Sample | 25.72% | 8.00% | 10.84 | 2.43E−11 |
All Periods | 26.29% | 7.68% | 20.74 | 7.13E−21 |

Period | Mean QMC | Mean LSMC | t-Value | p-Value |
---|---|---|---|---|
In-Sample | 20.59% | 7.55% | 20.81 | 2.21E−19 |
Out-Sample | 21.15% | 8.00% | 13.43 | 2.27E−15 |
All Periods | 20.75% | 7.68% | 20.77 | 5.87E−19 |

Period | Mean GBM | Mean VG | t-Value | p-Value |
---|---|---|---|---|
In-Sample | 46.35% | 47.66% | −0.61 | 0.55 |
Out-Sample | 48.30% | 46.44% | 0.57 | 0.57 |
All Periods | 46.97% | 47.27% | −0.17 | 0.86 |

Period | Mean GBM | Mean VG | t-Value | p-Value |
---|---|---|---|---|
In-Sample | 52.63% | 46.78% | 2.87 | 0.01 |
Out-Sample | 51.70% | 53.25% | −0.42 | 0.68 |
All Periods | 52.33% | 48.86% | 2.00 | 0.05 |

Period | Mean QMC | Mean LSMC | t-Value | p-Value |
---|---|---|---|---|
In-Sample | 46.35% | 52.63% | −3.26 | 0.00 |
Out-Sample | 48.30% | 51.70% | −0.92 | 0.37 |
All Periods | 46.97% | 52.33% | −3.35 | 0.00 |

Period | Mean QMC | Mean LSMC | t-Value | p-Value |
---|---|---|---|---|
In-Sample | 47.66% | 46.78% | 0.39 | 0.70 |
Out-Sample | 46.44% | 53.25% | −2.11 | 0.04 |
All Periods | 47.27% | 48.86% | −0.86 | 0.40 |

Monte Carlo method for the Out-Sample and All Periods combined, except for the In-Sample. We find a statistically significant difference between the simulation methods only for the Out-Sample period.

We move on to give analysis of the models and simulation methods using MAPE. In

In

Next, we give a comparative analysis of the simulation methods in estimating the stock prices.

Period | Mean GBM | Mean VG | t-Value | p-Value |
---|---|---|---|---|
In-Sample | 16.16% | 22.75% | −4.12 | 5.90E−04 |
Out-Sample | 14.99% | 21.82% | −4.16 | 3.30E−04 |
All Periods | 15.78% | 22.45% | −4.25 | 4.30E−04 |

Period | Mean GBM | Mean VG | t-Value | p-Value |
---|---|---|---|---|
In-Sample | 33.79% | 3.32% | 11.07 | 1.83E−09 |
Out-Sample | 30.94% | 3.57% | 9.65 | 1.53E−08 |
All Periods | 32.90% | 3.40% | 10.89 | 2.37E−09 |

Period | Mean QMC | Mean LSMC | t-Value | p-Value |
---|---|---|---|---|
In-Sample | 16.16% | 33.79% | −5.57 | 5.27E−06 |
Out-Sample | 14.99% | 30.94% | −4.99 | 3.16E−05 |
All Periods | 15.78% | 32.90% | −5.50 | 6.37E−06 |

Period | Mean QMC | Mean LSMC | t-Value | p-Value |
---|---|---|---|---|
In-Sample | 22.75% | 3.32% | 60.96 | 2.57E−31 |
Out-Sample | 21.82% | 3.57% | 25.82 | 2.15E−17 |
All Periods | 22.45% | 3.40% | 60.31 | 1.80E−29 |

This paper addressed the problem of stock price movement using continuous time models. Specifically, the paper provides a comparative analysis of two continuous time models, GBM and VG, in predicting the direction and price levels of stocks using two Monte Carlo methods, QMC and LSMC. The performance evaluation metrics used in this paper were the hit ratio and MAPE, and the t-test was used to assess significance. For each performance evaluation metric, we made a comparative analysis of the models (GBM and VG) under each simulation method (QMC or LSMC). Furthermore, we made a comparative analysis of the simulation methods (QMC and LSMC) for each model (GBM or VG). The models were tested for their predictive capabilities when the market was generally trending downwards as well as when the market was generally trending upwards.

For the downtrend analysis, we found no significant difference between the GBM model and VG model in terms of the hit ratio (the proportion of predictions for which the predicted direction is correct). We also found no significant difference between the Monte Carlo methods (QMC and LSMC) in terms of hit ratios for the downward trend period.

In terms of the MAPEs for the downward trend, there were no significant differences between the GBM model and the VG model under the QMC method. However, there were significant differences between the GBM model and the VG model under the LSMC method. The VG model performs better than the GBM model under the LSMC method in predicting stock price values.

The comparison of the Monte Carlo methods for the downtrend showed significant differences in the MAPEs for both the GBM model and the VG model. The findings suggest that the GBM model works well with the QMC method whereas the VG model works well with the LSMC method.

For the uptrend analysis, we found no significant difference between the GBM model and VG model under the QMC method in terms of the hit ratios. Under the LSMC method there were no significant differences between the GBM model and VG model except for the In-Sample period. In that case the GBM model predicted the direction correctly more often than the VG model.

In the comparison of the Monte Carlo methods, there were significant differences for the GBM model for the In-Sample and All Periods. The GBM model achieved higher hit ratios with the LSMC method than with the QMC method. For the VG model, the comparison showed a significant difference for the Out-Sample only, with the VG model faring better with the LSMC method than with the QMC method.

In terms of the MAPEs for the uptrend, we found significant differences between the GBM model and VG model. The GBM model fares better under the QMC method whereas the VG model fares better under the LSMC method in predicting the stock price values. The findings also show significant differences between the Monte Carlo methods in predicting the stock price values. Again the GBM model performs better with the QMC method and the VG model performs better with the LSMC method.

We summarize the findings as follows: for predicting the direction of stock price (as indicated by the hit ratios), either the GBM model or the VG model can be used with either Monte Carlo method, since in most cases the t-tests showed no significant differences. The hit ratios we obtained are close to the level that even a random predictor could produce, which is consistent with random walk behavior and indicates how challenging it is to predict stock price movement.

For predicting the stock price values (as indicated by the MAPEs), the GBM model performs well under the QMC method and the VG model performs well under the LSMC method. This finding has important implications for risk management simulations.

The authors would like to acknowledge support of this work from DAAD, the German Academic Exchange Service, in association with AIMS (African Institute for Mathematical Sciences) and Centre for Business Mathematics and Informatics. The work was done while Hopolang Mashele was a visiting researcher at AIMS.