
This paper develops a short-term stock exchange prediction model using the Box-Jenkins approach. Monthly data from the Ghana Stock Exchange market report, spanning March 2013 to February 2018, were used to develop the model. An ARIMA (0, 2, 1) model was fitted to the data, with the Bayesian Information Criterion (BIC) used for model selection. Diagnostic checks showed that the residuals of the fitted model were uncorrelated. The model was then used to forecast six months ahead. The trend of the forecast values showed a significant increase in Ghana Stock Exchange performance over the next six months.

A stock exchange market is the center of a network of transactions where buyers and sellers of securities meet to provide a clear indication of the market price for each investment. The exchange also plays a key role in the mobilization of capital from shareholders for companies in exchange for shares in ownership to investors in emerging and developed countries. This leads to growth of industry and commerce of the country; and this is a consequence of liberalized and globalized policies adopted by most emerging and developed governments [

Even though the stock exchange markets have been classified as the most volatile in the world and are full of anonymity and escapade performances [

In the stock exchange market, it is known that changes in the stock prices as well as the returns may be attributed to various prevailing risks and events such as economic crisis, natural disasters, movements in international oil prices, inflation effects, foreign exchange rates, changes in government policies, regulations and norms occurring within a country and across the world [

For years, the relationship between financial sector development and real economic activity has been a debatable issue in theoretical and empirical research [

Due to the importance of accurately forecasting stock exchange prices, various forecasting methods have been applied in the literature. These methods can be grouped into three categories: artificial intelligence, multivariate analysis and time series models. Artificial intelligence methods such as artificial neural networks are advanced computing tools that have recently been applied to time series forecasting. Although they give very good forecasting performance, their results depend on many factors, such as the need for large training data sets, the extensive training period required to reach convergence and the data partition technique used. In multivariate analysis, the forecasting results rely on the independent variable(s) employed in the modelling and on the avoidance of multicollinearity. In analytical time series models, a good forecasting result is achieved on condition that the data being analysed is stationary [

References [

An example of a stock market that requires attention is the Ghana Stock Exchange (GSE). The GSE plays an important role in the economic development of Ghana and in its corporate finance. It is well known that an organised and well-managed stock market stimulates investment opportunities by recognizing and financing productive projects that lead to real economic activities. Reference [

Since systemic risk in GSE performance strongly affects stock market investments and the country’s economic development, this study seeks to develop a time series model based on the Box-Jenkins approach to help capital investors identify the trend in the GSE and forecast it appropriately.

Related Works

Reference [

Reference [

In Thailand, [

Reference [

Reference [

The study used two main resources:

1) Monthly data that spans from March 2013 to February 2018 obtained from Ghana Stock Exchange Market Report (

2) R Statistical software.

Month | 2013 | 2014 | 2015 | 2016 | 2017 | 2018 |
---|---|---|---|---|---|---|
Jan | | 2255.52 | 2173.95 | 2004.12 | 1776.40 | 3076.98 |
Feb | | 2420.91 | 2177.95 | 1972.18 | 1854.53 | 3337.20 |
Mar | 1777.50 | 2386.34 | 2220.37 | 1912.02 | 1865.01 | |
Apr | 1800.70 | 2255.27 | 2272.77 | 1828.78 | 1896.13 | |
May | 1884.26 | 2319.12 | 2362.63 | 1758.35 | 1919.71 | |
Jun | 1986.29 | 2373.38 | 2352.23 | 1787.50 | 1964.55 | |
Jul | 1989.55 | 2300.35 | 2198.33 | 1796.29 | 2256.78 | |
Aug | 2030.96 | 2200.18 | 2154.77 | 1805.36 | 2389.01 | |
Sep | 2099.88 | 2239.68 | 2009.52 | 1774.90 | 2326.09 | |
Oct | 2123.75 | 2249.33 | 2013.22 | 1728.37 | 2361.48 | |
Nov | 2145.20 | 2266.92 | 1974.02 | 1575.71 | 2521.67 | |
Dec | 1777.50 | 2261.02 | 1994.91 | 1689.09 | 2579.72 | |

Source: Ghana Stock Exchange Monthly Market Report, 2018.

In this study, the Ordinary Least Squares (OLS) technique was used to fit a regression equation to the GSE time series data. The essence according to [

Knowledge of the linear trend projection enables the modeller and the user to:

1) Describe historical trend patterns;

2) Project past trend patterns into the future; and

3) Eliminate the trend component from the time series data.

Consider the Simple Linear Regression (SLR) given in Equation (1).

$$y_t = \beta_0 + \beta_1 t \tag{1}$$

where

$y_t$ = Ghana stock exchange value.

$\beta_0$ = fixed composite index at $t = 0$.

$\beta_1$ = unknown parameter to be determined from data.

$t$ = monthly duration (time in trend analysis).

From the OLS method, which minimises the sum of squared errors, Equations (2) and (3) are obtained as follows:

$$\beta_1 = \frac{n \sum t\, y_t - \sum y_t \sum t}{n \sum t^2 - \left( \sum t \right)^2} \tag{2}$$

$$\beta_0 = \frac{\sum y_t}{n} - \beta_1 \frac{\sum t}{n} \tag{3}$$

where

$n$ is the sample size.
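As a minimal sketch of Equations (2) and (3), the slope and intercept can be computed directly from the sums. The function name `fit_linear_trend` and the choice of time index t = 1, 2, ..., n are assumptions made for illustration; the toy series is not GSE data.

```python
def fit_linear_trend(y):
    """Fit y_t = b0 + b1 * t by ordinary least squares, with t = 1, 2, ..., n.

    The time index starting at 1 is an assumption of this sketch.
    """
    n = len(y)
    if n < 2:
        raise ValueError("need at least two observations")
    t = range(1, n + 1)
    sum_t = sum(t)
    sum_y = sum(y)
    sum_ty = sum(ti * yi for ti, yi in zip(t, y))
    sum_t2 = sum(ti * ti for ti in t)
    b1 = (n * sum_ty - sum_y * sum_t) / (n * sum_t2 - sum_t ** 2)  # Equation (2)
    b0 = sum_y / n - b1 * sum_t / n                                # Equation (3)
    return b0, b1


# Toy series lying exactly on y = 3 + 2t (illustrative values only).
toy = [5.0, 7.0, 9.0, 11.0]
b0, b1 = fit_linear_trend(toy)
print(b0, b1)  # 3.0 2.0
```

Passing the 60 monthly index values in chronological order would give the fitted trend for the GSE series under the same indexing assumption.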

Hypothesis Testing

The hypothesis for the study is formulated as follows:

H_{0}: $\beta_1 = 0$.

H_{1}: $\beta_1 \neq 0$.

According to [

ARMA processes form the core of time-series analysis. According to [

$$y_t = \phi_0 + \phi_1 \varepsilon_{t-1} + \varepsilon_t \tag{4}$$

where

$\phi_0$ and $\phi_1$ are unknown model coefficients whose actual values would be determined from data, and $\varepsilon_t$ is a white noise process.

The first-order autoregressive process, abbreviated AR (1), has the following dynamics (Equation (5)):

$$y_t = \theta_0 + \theta_1 y_{t-1} + \varepsilon_t \tag{5}$$

where

$\theta_0$ and $\theta_1$ are the unknown model coefficients whose actual values would be determined from data, and $\varepsilon_t$ is a white noise process. An Autoregressive Moving Average process with orders P and Q, ARMA (P, Q), has the following dynamics (Equation (6)):

$$y_t = \theta_0 + \sum_{p=1}^{P} \theta_p y_{t-p} + \sum_{q=1}^{Q} \phi_q \varepsilon_{t-q} + \varepsilon_t \tag{6}$$

Assumptions

1) The $\varepsilon_t$ are independent and identically distributed.

2) $\varepsilon_t \sim N(0, \sigma^2)$.
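The dynamics above can be illustrated with a short simulation. This is a sketch under stated assumptions: the function name `simulate_arma`, the coefficient values, the unit noise variance and the zero pre-sample values are all hypothetical, and the current shock is included in every process as in Equation (4).

```python
import random


def simulate_arma(theta0, ar_coefs, ma_coefs, noise):
    """Generate y_t = theta0 + sum_p ar[p]*y_{t-p} + sum_q ma[q]*e_{t-q} + e_t.

    Values of y and e before the start of the sample are taken as zero,
    which is a simplifying assumption of this sketch.
    """
    y = []
    for t, e_t in enumerate(noise):
        value = theta0 + e_t
        for p, coef in enumerate(ar_coefs, start=1):
            if t - p >= 0:
                value += coef * y[t - p]
        for q, coef in enumerate(ma_coefs, start=1):
            if t - q >= 0:
                value += coef * noise[t - q]
        y.append(value)
    return y


# Gaussian white noise, in line with the two assumptions (sigma = 1 is arbitrary).
rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(100)]

ma1 = simulate_arma(0.0, [], [0.5], noise)         # moving-average form, Equation (4)
ar1 = simulate_arma(0.0, [0.5], [], noise)         # autoregressive AR (1) form
arma11 = simulate_arma(0.0, [0.5], [0.5], noise)   # ARMA (1, 1)
```

Feeding a single unit shock, for example `simulate_arma(0.0, [0.5], [], [1.0, 0.0, 0.0])`, shows the geometric decay of an autoregressive response.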

Hypothesis Test

The hypothesis for the study is formulated as follows:

H_{0}: Series is not stationary

H_{1}: Series is stationary

In formulating the OLS model, a statistical description of the data (

$$s_n = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( y_i - \bar{y} \right)^2} \tag{7}$$

$$g = \frac{\sum_{i=1}^{n} \left( y_i - \bar{y} \right)^3}{(n - 1)\, s_n^3} \tag{8}$$

$$k = \frac{\sum_{i=1}^{n} \left( y_i - \bar{y} \right)^4}{(n - 1)\, s_n^4} \tag{9}$$

where

$y_i$ = Ghana Stock Exchange Value.

$\bar{y}$ is the mean value of the Ghana Stock Exchange Value.

$n$ is the sample data size.
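The three statistics can be sketched in a few lines. The function name `describe` is hypothetical, and the sketch assumes that s_n in Equation (7) is the standard deviation, that is, the square root of the mean squared deviation, since it enters Equations (8) and (9) as a cube and a fourth power.

```python
import math


def describe(y):
    """Return (s_n, g, k) following Equations (7), (8) and (9).

    Assumes s_n is the square root of the mean squared deviation.
    Requires at least two observations that are not all equal.
    """
    n = len(y)
    mean = sum(y) / n
    s_n = math.sqrt(sum((v - mean) ** 2 for v in y) / n)          # Equation (7)
    g = sum((v - mean) ** 3 for v in y) / ((n - 1) * s_n ** 3)    # Equation (8)
    k = sum((v - mean) ** 4 for v in y) / ((n - 1) * s_n ** 4)    # Equation (9)
    return s_n, g, k


# Symmetric toy data (not GSE values): the skewness g should be zero.
s_n, g, k = describe([1.0, 2.0, 3.0, 4.0, 5.0])
print(s_n, g, k)
```

A positive g indicates a longer right tail, which is what a series with a late sharp rise would be expected to show.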

Consequently, from the analysis of the GSE using Equations (1), (2) and (3), the linear model was developed (Equation (10)).

$$y_t = 2035.833 + 2.549\, t \tag{10}$$

Analysis of variance (ANOVA) test was then performed to find the significance of the developed model (Equation (10)) coefficients (see

Since F_{computed} = 1.204 is smaller than F_{critical} (approximately 4.01 for 1 and 58 degrees of freedom at the 5% level), the null hypothesis is not rejected, and it was concluded that the estimated β_{1} is not statistically significant at the 5% level of significance. Thus, at this level, there is no evidence of a linear relationship between the GSE composite index and time. Hence, a time series model was developed instead of a linear regression model.
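The F statistic behind this decision can be reproduced from the sums of squares reported in the ANOVA table. The sketch below is illustrative: the function name `f_statistic` is hypothetical, and the critical value of about 4.01 is an assumed tabulated figure for F(1, 58) at the 5% level, not a value taken from the paper.

```python
def f_statistic(ss_regression, df_regression, ss_residual, df_residual):
    """F = (SS_regression / df_regression) / (SS_residual / df_residual)."""
    ms_regression = ss_regression / df_regression
    ms_residual = ss_residual / df_residual
    return ms_regression / ms_residual


# Sums of squares and degrees of freedom as reported in the ANOVA table.
f_computed = f_statistic(116_933.298, 1, 5_631_404.371, 58)

# Assumption: approximate tabulated 5% critical value of F(1, 58).
F_CRITICAL_5PCT = 4.01

reject_null = f_computed > F_CRITICAL_5PCT
print(round(f_computed, 3), reject_null)  # 1.204 False
```

Because the computed statistic falls well below the critical value, the slope is not distinguishable from zero at the 5% level.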

A time series plot and the Augmented Dickey-Fuller (ADF) test were used to check the GSE data for nonstationarity, which, if left uncorrected, could have led to wrong model parameters.

The Augmented Dickey-Fuller (ADF) stationarity test was performed on the data. The test gave a p-value of 0.99, which is greater than the α = 0.05 level of significance. Hence, the null hypothesis that the series is not stationary is not rejected.

Graphical plots such as Autocorrelation Function (ACF) and Partial Autocorrelation Function (PACF) were further carried out to confirm the nonstationarity of the data. This can be seen in

Data Size | Mean | Standard Deviation | Kurtosis | Skewness | Minimum Value | Maximum Value |
---|---|---|---|---|---|---|
60 | 2113.58 | 312.14 | 3.75 | 1.36 | 1576.71 | 3337.2 |

Source of Variation | Degrees of Freedom | Sum of Squares | Mean Square | F statistic |
---|---|---|---|---|
Regression | 1 | 116,933.298 | 116,933.298 | 1.204 |
Residual | 58 | 5,631,404.371 | 97,093.179 | |
Total | 59 | 5,748,337.669 | | |

due to the presence of an upward movement. As a result, ADF test was performed to confirm the claim.

The ADF test shows that the differenced data was not stationary since the p-value = 0.6235 was still greater than α = 0.05 significance level. Therefore, the first differenced data was differenced again (see

significance bounds, but all other autocorrelations are below the significance bounds. The PACF on the other hand, shows that the partial autocorrelations at lags 1, 2 and 5 exceed the significance bounds and are slowly decreasing in magnitude with increasing number of lags. Clearly, from these plots, MA and AR terms are respectively identified. Since the ACF plot (

The ADF test shows that the second differenced data is stationary since it has a p-value of 0.01 which is less than α = 0.05 significance level and that confirms the claim of a stationary time series. Consequently, an ARIMA (p, 2, q) model is probably appropriate for the GSE data.
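The differencing step that leads to d = 2 can be sketched as follows. The function name `difference` and the toy series are illustrative assumptions; the series is not GSE data.

```python
def difference(series, d=1):
    """Apply first differencing (y_t - y_{t-1}) d times."""
    out = list(series)
    for _ in range(d):
        out = [b - a for a, b in zip(out, out[1:])]
    return out


# A toy series with a quadratic trend: one difference leaves a linear trend,
# and a second difference leaves a constant.
toy = [1, 4, 9, 16, 25]
print(difference(toy, 1))  # [3, 5, 7, 9]
print(difference(toy, 2))  # [2, 2, 2]
```

Each pass shortens the series by one observation, so a twice-differenced series of 60 monthly values has 58 points available for the ACF, the PACF and the ADF test.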

After model identification, the Bayesian Information Criterion (BIC) and the coefficient of determination, R^{2}, were used to select a reliable model. The candidate models are compared on their BIC and R^{2} values. R^{2} is a goodness-of-fit measure of prediction accuracy. The model with BIC and R^{2} values of 704.5556 and 0.9010, respectively, is ARIMA (0, 2, 1); it has the lowest BIC among the candidates and was selected as the model that best fits the GSE data. Thus, the autoregressive order p is the lag value after which the PACF plot crosses the upper confidence interval for the

ARIMA Model | BIC | R^{2} |
---|---|---|
ARIMA (0, 2, 1) | 704.5556 | 0.9010 |
ARIMA (0, 2, 2) | 706.9725 | 0.9040 |
ARIMA (5, 2, 0) | 708.5533 | 0.9220 |
ARIMA (5, 2, 1) | 712.5371 | 0.9220 |
ARIMA (5, 2, 2) | 716.5008 | 0.9220 |

first time. In our case, the PACF plot of the second differenced GSE graph (

ARIMA (0, 2, 1) model explained about 90% of the total variation in the composite index data set.
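The selection rule can be sketched by taking the candidate with the smallest BIC. The dictionary below simply restates the reported BIC and R^{2} values for the five candidate models; the variable names are illustrative.

```python
# Reported BIC and R^2 values for the candidate models.
candidates = {
    "ARIMA (0, 2, 1)": {"bic": 704.5556, "r2": 0.9010},
    "ARIMA (0, 2, 2)": {"bic": 706.9725, "r2": 0.9040},
    "ARIMA (5, 2, 0)": {"bic": 708.5533, "r2": 0.9220},
    "ARIMA (5, 2, 1)": {"bic": 712.5371, "r2": 0.9220},
    "ARIMA (5, 2, 2)": {"bic": 716.5008, "r2": 0.9220},
}

# A lower BIC is better: it rewards fit but penalises extra parameters.
best = min(candidates, key=lambda name: candidates[name]["bic"])
print(best)  # ARIMA (0, 2, 1)
```

Ranking by R^{2} alone would favour the five-lag models, whose extra parameters raise R^{2} only slightly. The BIC penalty is what tips the choice towards the more parsimonious model.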

Consequently, the ARIMA (0, 2, 1) model (Equation (11)) to be used for forecasting was formulated.

$$y_t = 2 y_{t-1} - y_{t-2} + \varepsilon_t + 0.7409\, \varepsilon_{t-1} \tag{11}$$

Equation (11) was used to produce the six-month forecast of the GSE.
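A minimal sketch of how point forecasts follow from the fitted ARIMA (0, 2, 1) recursion is given below. Future shocks are replaced by their expected value of zero, so only the first step uses the last in-sample residual. The function name `forecast_arima_021` and the input values in the example are hypothetical; the paper does not report the last residual.

```python
def forecast_arima_021(y_prev, y_last, last_residual, theta, steps):
    """Point forecasts from y_t = 2*y_{t-1} - y_{t-2} + e_t + theta*e_{t-1}.

    Future shocks are set to zero (their expected value), so the residual
    term only enters the one-step-ahead forecast.
    """
    history = [y_prev, y_last]
    forecasts = []
    for h in range(1, steps + 1):
        shock_term = theta * last_residual if h == 1 else 0.0
        nxt = 2 * history[-1] - history[-2] + shock_term
        forecasts.append(nxt)
        history.append(nxt)
    return forecasts


# Hypothetical inputs: last two observations 10 and 12, last residual 1.0.
print(forecast_arima_021(10.0, 12.0, 1.0, 0.5, 3))  # [14.5, 17.0, 19.5]
```

After the first step, the forecasts lie on a straight line with a constant monthly increment. This agrees with the reported forecasts, which rise by about 204.88 per month.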

Month | Forecast Values |
---|---|
March | 3542.077 |
April | 3746.954 |
May | 3951.832 |
June | 4156.709 |
July | 4361.586 |
August | 4566.463 |

additionally be confirmed from

In this paper, an ARIMA (0, 2, 1) model has been developed from GSE monthly market report data observed over five consecutive years to predict future stock exchange prices or returns. In developing the model, the nonstationarity present in the GSE sample data, which could have led to wrong statistical inferences, was resolved by differencing the data twice. The stationarity of the differenced data was then confirmed using the widely known Augmented Dickey-Fuller (ADF) test.

A diagnostic check was performed using the ACF plot of the residuals for the second differenced GSE data to confirm that there is no autocorrelation in the residuals. This suggests that all the information in the second differenced data was used in developing the model.

ACF and PACF plots were used to identify the appropriate ARIMA model. After model identification, the Bayesian Information Criterion (BIC) and the coefficient of determination, R^{2}, were used to select a reliable model. The R^{2} of the selected ARIMA model shows that it explained about 90% of the total variation in the composite index. The developed ARIMA (0, 2, 1) model was used to forecast six months ahead, and the trend of the forecast values showed a significant increase in the GSE. In conclusion, the ARIMA (0, 2, 1) is a good model that companies and investors can rely on for short-term prediction of future stock prices or returns.

The authors are thankful to Ghana Stock Exchange for providing us with the necessary data for this study to be a success.

The authors declare no conflicts of interest regarding the publication of this paper.

Boye, P. and Ziggah, Y.Y. (2020) A Short-Term Stock Exchange Prediction Model Using Box-Jenkins Approach. Journal of Applied Mathematics and Physics, 8, 766-779. https://doi.org/10.4236/jamp.2020.85059