1. Introduction
Recently, stock price/index forecasting has become an important and interesting topic among investors, professional analysts and researchers. Although their goals differ slightly, the core idea is to earn a high profit with minimum risk and to achieve the most accurate forecasts from the least information. This objective can be reached by analyzing the past and current behavior of stock prices together with background information such as fundamental and technical indicators.
Stock prices over a period of time belong to the category of time series, which can be forecasted with several techniques. Time series forecasting is a prominent area with a long history. For instance, [1] reviewed the past 25 years of research studies published in selected journals and showed that one third of the papers concerned time series forecasting. The related studies range from traditional to new approaches along the categories of subjective/objective and linear/nonlinear. Among them, exponential smoothing, autoregressive integrated moving average (ARIMA), generalized autoregressive conditional heteroscedasticity (GARCH) and artificial neural networks (ANNs) are significant objective methods.
Nonlinear methods are widely used in recent forecasting work because stock prices exhibit nonlinear rather than linear behavior. Reference [2] compared linear and nonlinear forecasts for stock returns and showed that the nonlinear forecasts are significantly more accurate than the linear ones. Reference [3] surveyed more than 100 published articles focusing on soft computing methods such as artificial neural networks and neuro-fuzzy techniques. The results showed that soft computing methods are more suitable for stock price forecasting and, in most cases, outperform conventional models.
Neural networks are popular in many areas, especially industrial applications, because of their ability to grasp the patterns of a system and to adapt easily to changes in it. According to the well documented literature, a large number of studies have been carried out with ANN models. Reference [4] provided a practical guide to designing a neural network for forecasting economic time series: an eight-step procedure was explained through three components, namely data preprocessing, training and the topology of the network. Moreover, at each step they discussed rules of thumb, common mistakes, parameter selection and other aspects of the design process. Reference [5] reviewed the theoretical background of neural networks and investigated the design and implementation of a trading alert system using a back propagation network. The results pointed out that the implemented system can predict short-term price movement directions with an overall accuracy of over 74%. Based on these results they argued that the back propagation network is useful for prediction without extensive market data or knowledge.
In the recent literature, a large number of research studies on stock price forecasting have used artificial neural network models ([6]-[9]). Some of them used ANNs in comparative studies rather than as a single model, and some used ANNs to demonstrate their superior forecasting performance over conventional methods. Meanwhile, a considerable number of studies have turned to ANNs in hybrid models ([10]-[19]). Although the structures differ from one another, most of these studies have shown the importance and superiority of ANNs in forecasting, while some of them pointed out their drawbacks.
This study presents an application of feed forward back propagation neural networks to determine the accuracy of one step ahead forecasting. Different models are proposed based on their performance on the test sample, obtained by changing the number of inputs, the learning rate, the number of hidden layer neurons and the number of training sessions. Finally, the best one step ahead forecast is selected on the basis of the error performances of the proposed models. Daily values of the All Share Price Index (ASPI), All Share Price Return Index (ASTRI), Price to Earnings Ratio (PER) and Price to Book Value (PBV) are used as the empirical data set to exemplify the application of the network. The inputs are chosen on the basis of their correlation with ASPI. All the data are obtained from the Colombo Stock Exchange (CSE).
The rest of the study is organized as follows. Section 2 briefly explains the feed forward back propagation neural network used in this study. The empirical results and discussions are presented in Section 3. The concluding remarks are made in Section 4.
2. Artificial Neural Network
An artificial neural network is a mathematical and information processing system inspired by the biological neural system. Artificial neurons are considered the building elements of the system; each is composed of one or more inputs, weights and an output. On the basis of these components, the network can map an input to a desired output through nonlinear functions. The basic structure of the artificial neuron model is depicted in Figure 1.
where $p$ is a scalar input, $w$ is a scalar weight and $b$ is a scalar bias; $n = wp + b$ represents the net input, which then passes through the transfer function $f$. Finally it gives the scalar output $a$ [20]. The neuron output $a$ can be calculated using Equation (1) as follows:

$a = f(wp + b)$    (1)
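As a minimal illustration of the scalar-neuron computation just described, the sketch below evaluates a single artificial neuron in Python; the particular transfer function and the numeric values are illustrative assumptions, not values from the paper.

```python
import math

def neuron_output(p, w, b, f=math.tanh):
    """Single artificial neuron: net input n = w*p + b passed through f."""
    n = w * p + b      # scalar net input
    return f(n)        # scalar output a = f(wp + b)

# With a pure linear transfer function f(n) = n:
a = neuron_output(2.0, 0.5, 0.1, f=lambda n: n)  # 0.5*2.0 + 0.1 = 1.1
```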
2.1. Feed Forward Back Propagation Neural Network
Network topologies can be divided into two distinct forms depending on their connections: feed forward and recurrent. In a feed forward neural network architecture, the artificial neurons are organized in layers and information strictly flows forward, from the input layer to the output layer. The errors of the network are propagated backwards, and hence it is known as a feed forward back propagation network (BPNN). The architecture of this network consists of an input layer, one or more hidden layers and an output layer.
When modeling time series data with nonlinear structures, the literature shows that the three-layer feed forward back propagation network, consisting of an input layer, one hidden layer and an output layer, is the most widely used network model. A supervised learning method is used in the back propagation process: the network outputs are compared with the actual outputs on the basis of the least mean square error, and the randomly initialized weights are adjusted to minimize the error ([4] [18]). The general structure of the three layer back propagation neural network is shown in Figure 2.
where $x_i$ $(i = 1, \ldots, m)$ represents the inputs; $y_j$ $(j = 1, \ldots, q)$ represents the outputs of the hidden layer; $z_k$ $(k = 1, \ldots, l)$ represents the outputs of the network; and $w_{ij}$ and $v_{jk}$ represent the connection weights.

The net input $net_j$, the output $y_j$ of the $j$-th node in the hidden layer and the network output $z_k$ can be calculated using Equation (2), Equation (3) and Equation (4) as follows:

$net_j = \sum_{i=1}^{m} w_{ij} x_i + \theta_j$    (2)

$y_j = f_1(net_j)$    (3)

$z_k = f_2\left(\sum_{j=1}^{q} v_{jk} y_j + \theta_k\right)$    (4)
$f_1$ and $f_2$ are the transfer functions of the hidden layer and the output layer, which can be log-sigmoid, tan-sigmoid, pure linear, etc. In many studies, a nonlinear sigmoid transfer function is used in the hidden layer and a pure linear transfer function is used in the output layer.
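The forward pass of such a three-layer network, with a tan-sigmoid hidden layer and a pure linear output layer, can be sketched as follows; the weight matrices and test values are illustrative assumptions for clarity, not parameters from the paper's trained model.

```python
import math

def tansig(n):
    # Hyperbolic tangent sigmoid (MATLAB's 'tansig' is mathematically tanh)
    return math.tanh(n)

def forward(x, W1, b1, W2, b2):
    """Three-layer forward pass.
    W1[j][i]: input-to-hidden weights, b1[j]: hidden biases,
    W2[k][j]: hidden-to-output weights, b2[k]: output biases."""
    # Hidden layer: net input per node, then tan-sigmoid transfer
    hidden = [tansig(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(W1, b1)]
    # Output layer: pure linear transfer (identity on the net input)
    return [sum(v * h for v, h in zip(row, hidden)) + b
            for row, b in zip(W2, b2)]

out = forward([0.5, -0.2], [[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0],
              [[1.0, 1.0]], [0.0])
```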
2.2. Evaluation Criteria
The best neural network model is selected using three error measures, mean absolute error (MAE), root mean squared error (RMSE) and mean absolute percentage error (MAPE), computed on the testing sample. Moreover, the same error values are used to compare forecasting performances. The corresponding criteria are given in Equation (5), Equation (6) and Equation (7).
$\mathrm{MAE} = \frac{1}{n}\sum_{t=1}^{n} \left| y_t - \hat{y}_t \right|$    (5)

$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{t=1}^{n} \left( y_t - \hat{y}_t \right)^2}$    (6)

$\mathrm{MAPE} = \frac{100}{n}\sum_{t=1}^{n} \left| \frac{y_t - \hat{y}_t}{y_t} \right|$    (7)

where $y_t$ represents the actual value of ASPI, $\hat{y}_t$ represents the predicted value of ASPI and $n$ is the number of observations in the sample.

Figure 2. A three layer feed forward back propagation neural network.
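These three evaluation criteria are straightforward to compute; a minimal sketch, with MAPE expressed as a percentage:

```python
import math

def mae(actual, pred):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, pred)) / len(actual)

def rmse(actual, pred):
    """Root mean squared error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, pred)) / len(actual))

def mape(actual, pred):
    """Mean absolute percentage error (in percent)."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, pred)) / len(actual)
```

For example, with actual values [100, 200] and predictions [90, 210], MAE and RMSE are both 10 and MAPE is 7.5%.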
3. Experimentation Design
3.1. Data Sets
In this study we use daily data including ASPI, ASTRI, PER and PBV collected from CSE over the period from January 2nd 2012 to March 20th 2014. The total number of observations is 535 trading days and 80% of the data is selected as the training sample. The remaining 20% is used as the testing sample.
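The 80/20 division described above can be sketched as a simple in-order split; whether the paper splits chronologically is an assumption here (it later resamples training data randomly across runs), so this is only one plausible reading.

```python
def chronological_split(series, train_fraction=0.8):
    """Split a series into training and testing samples, preserving order."""
    cut = int(len(series) * train_fraction)
    return series[:cut], series[cut:]

# 535 trading days -> 428 training observations, 107 testing observations
train, test = chronological_split(list(range(535)))
```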
3.2. Proposed Feed Forward Back Propagation Network Model
The neural network toolbox of MATLAB is used to implement the proposed model. The original data are used as inputs and targets, since the network automatically normalizes the data into the range [−1, 1]. The proposed network architecture consists of three layers, namely an input layer, a hidden layer and an output layer. The hyperbolic tangent sigmoid nonlinear transfer function is used for the hidden layer and the linear transfer function is used for the output layer. The proposed network is implemented in two steps, dividing the daily information into ASPI data alone and combinations of ASPI, ASTRI, PER and PBV data. In each stage, the experiment is tested at the learning rates 0.01, 0.02, 0.03, 0.04 and 0.05.
When the same network is initialized and trained again and again, the results differ slightly due to the different random network parameters, which affects the accuracy of the solution. To overcome this, we trained each network 10, 20, 50 and 100 times with randomly selected training samples and considered the average values of the training and testing errors. The same procedure is applied while changing the number of hidden layer neurons from one to twelve. The feed forward back propagation network was trained with 1000 iterations as the convergence criterion. At the end of the experiment, the optimal model is chosen in each category. Since the purpose of the study is to forecast the next trading day's ASPI value, the eight selected models are used for one step ahead forecasting. The forecasting process is also repeated 10, 20, 50 and 100 times for each model, and the average value is taken as the forecast. The optimal model and forecast are chosen on the basis of the error performances of the testing and forecast samples.
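The repeat-and-average procedure can be sketched as follows. Here `train_once` is a hypothetical callable standing in for one MATLAB train-and-forecast cycle (it receives a random draw to mimic random initialization and resampling); it is not part of the paper's code.

```python
import random

def averaged_forecast(train_once, n_runs=50, seed=0):
    """Run the same train/forecast cycle n_runs times from different
    random draws and return the average one-step-ahead forecast."""
    rng = random.Random(seed)
    forecasts = [train_once(rng.random()) for _ in range(n_runs)]
    return sum(forecasts) / len(forecasts)
```

Averaging over repeated runs smooths out the run-to-run variation caused by random weight initialization, at the cost of extra training time.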
4. Results and Discussions
4.1. Model Selection
A feed forward back propagation neural network was trained 10 times on different random training samples with the learning rate set to 0.01 and the number of hidden layer neurons set to 1. The average performances were obtained using MAE, RMSE and MAPE. Next, the same procedure was followed with learning rates of 0.02, 0.03, 0.04 and 0.05, and then repeated while increasing the number of hidden layer neurons up to 12. Likewise, further average performances were obtained by increasing the number of training runs to 20, 50 and 100. Based on the number of inputs, number of training times, number of hidden layer neurons and learning rate, eight models were selected, and the corresponding information is summarized in Table 1.
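The model-selection sweep just described is essentially a grid search over learning rates and hidden-layer sizes. A minimal sketch, where `evaluate(lr, nn)` is a placeholder for training the network repeatedly at that configuration and returning its average test error:

```python
def select_model(evaluate,
                 learning_rates=(0.01, 0.02, 0.03, 0.04, 0.05),
                 hidden_neurons=range(1, 13)):
    """Return the (learning_rate, hidden_neurons) pair with the
    lowest average test error reported by `evaluate`."""
    return min(((lr, nn) for lr in learning_rates for nn in hidden_neurons),
               key=lambda cfg: evaluate(*cfg))
```

With a toy error surface minimized at a learning rate of 0.05 and 8 hidden neurons, the search recovers that configuration.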
From Table 1 we can see that the BPNN topology 4-8-1 gives the minimum values of MAE, RMSE and MAPE at 50 training times and a learning rate of 0.05. The corresponding experimental results at 50 trainings, including the testing error performances for each number of hidden neurons (NN) and learning rate (LR), are shown in Table 2.
4.2. Forecasting Results
The eight models displayed in Table 1 were used for one step ahead forecasting of the next day's ASPI value. The average forecast value was taken at each stage, and the actual value of the next trading day was used to calculate MAE, RMSE and MAPE. The results are displayed in Table 3.
According to the results in Table 3, the models with four inputs always give better forecasts than the models with one input. Among them, BPNN(4-8-1) gives the optimal forecast after 50 training runs: a next day value of 5920.4 together with the minimum error performances of 17.47 (MAE), 17.47 (RMSE) and 0.0029 (MAPE).
5. Conclusions
In this study we attempted to find the best forecast of the next trading day's ASPI on the CSE. For this, a feed forward back propagation neural network was used, with the inputs divided into two categories. The experiment was implemented by changing the learning rate, the number of hidden layer neurons and the number of training times. The best network topology and forecast value were determined by employing MAE, RMSE and MAPE.
The results showed that the best network topology is 4-8-1, obtained after repeating the experiment 50 times; it consists of four input data series and eight hidden layer neurons at a 0.05 learning rate. The results also show that the accuracy of the network model improved as the number of inputs and hidden layer neurons increased. Therefore, the number of inputs and the number of hidden layer neurons are important parameters for obtaining more accurate results.
As a shortcoming, the network model gives different results at every training stage due to the different random network parameters, so the number of training times is important for sharpening the result; overfitting or underfitting can also occur because of this. It is therefore advisable to experiment with different numbers of training runs. To overcome this situation and obtain a feasible solution, we repeated each experiment 10, 20, 50 and 100 times and observed the average value of the output. According to the results, a slight difference in the accuracy of the model could be gained by changing the number of trainings. For both types of inputs, the experimental results were better at 50 training times than at the others, and the optimal result was given by BPNN(4-8-1) at 50 trainings. No notable change in the forecast could be obtained when the number of trainings was below or above 50, since we considered the average value of every output.

Table 1. Summarized information of BPNN models at four different training times. (^{*}Denotes the selected model with the minimum error values.)

Table 2. Training and testing error performances at 50 trainings.

Table 3. Forecasting results of the eight models. (The actual value of the next day is 5937.87.)
In conclusion, we can determine that the number of highly correlated inputs and the number of hidden layer neurons are the major factors in obtaining a more accurate solution. The number of training times and averaging over a number of trials can be taken as minor factors for improving the results. Apart from that, the training function could be changed to test whether it gives better results. Moreover, in this experiment we could not observe that the considered range of learning rates was an important factor in improving the results.
Acknowledgements
This research was financially supported by the National Natural Science Foundation of China (Grant NO. 71172043).