Based on Multiple Scales Forecasting Stock Price with a Hybrid Forecasting System

This paper presents an integrated prediction method, a hybrid forecasting system based on multiple scales. In this method, the original data are decomposed into multiple layers by the wavelet transform, and the layers are divided into low-frequency, intermediate-frequency and high-frequency signal layers. Autoregressive moving average models, Kalman filters and back-propagation neural network models are then employed to predict the future values of the low-frequency, intermediate-frequency and high-frequency signal layers, respectively. An effective algorithm for predicting stock prices is developed. Price data for the Shandong Gold Group on the Shanghai Stock Exchange from 28 June 2011 to 24 June 2012 are used to illustrate the application of the hybrid forecasting system to stock price prediction. The results show that a time series forecast can be produced by forecasting the low-frequency, intermediate-frequency and high-frequency signal layers separately. In the simulation experiments, the forecasts match the actual values closely.


Introduction
Forecasting is the process of making projections about future performance based on existing historical data. Stock market prediction is regarded as a challenging task in financial time-series forecasting, primarily because of the uncertainties involved in the movement of the market. Many factors, both economic and non-economic, influence the behavior of the stock market. As a result, stock price time series are characterized by nonlinearities, discontinuities, and high-frequency multi-polynomial components, and predicting market price movements is quite difficult [1].
Methods of forecasting stock prices can be classified into two categories: statistical models and artificial intelligence (AI) models. The statistical methods include the autoregressive (AR) model [2], the autoregressive moving average (ARMA) model [2], and the autoregressive integrated moving average (ARIMA) model [2]. These are linear models which are, more often than not, inadequate for stock market forecasting, since stock time series are inherently noisy and non-stationary. Some more recent nonlinear approaches include the autoregressive conditional heteroskedasticity (ARCH) model, the generalized autoregressive conditional heteroskedasticity (GARCH) model [3], and the smooth transition autoregressive (STAR) model [4]; yet they fall short in forecasting high-frequency, non-stationary stock time series. AI models such as artificial neural networks (ANNs), fuzzy logic, and genetic algorithms (GAs) do not have this restriction and have been shown empirically to outperform the statistical models, since they can deal with complex engineering problems that are difficult to solve by classical methods [5]. Each AI-based technique has advantages and disadvantages. Using hybrid models or combining several models has become a common practice to improve forecasting accuracy [6]. As a result, AI approaches can be utilized in predicting stock prices [5].

Hybrid Forecasting System Based on Multiple Scales
The stock market is made up of short-term, medium-term and long-term dealers, among others. Short-term dealers pay close attention only to short-term price changes in the market, so the price fluctuations caused by their behavior have only a short-term memory. By contrast, long-term dealers pay close attention to market price changes over a long horizon, so the price fluctuations caused by their behavior have a long-term memory. Because dealers' investment behaviors are shaped by the external environment and by their chosen investment tactics, they generate stock price fluctuations with quite different characteristics, which are dispersed across and reflected in different time scales [10]. For this reason, this paper adopts a multiscale forecasting system to predict the stock market price.
The multiple-scale forecasting system consists of five main parts: scale decomposition, high-frequency data forecasting, intermediate-frequency data forecasting, low-frequency data forecasting, and data composition. The input is the real stock price and the output is the predicted stock price; the flow diagram is shown in Figure 1.

Wavelet Transform [11] [12]
Wavelet analysis is based on the wavelet, a waveform that tends to be irregular and asymmetric. It is capable of separating a signal into shifted and scaled versions of the original (or mother) wavelet. The wavelet function $\psi(t)$, called the mother wavelet, has finite energy and is mathematically defined by:

$$\int_{-\infty}^{\infty} \psi(t)\,dt = 0$$

The family $\psi_{a,b}(t)$ can be obtained as:

$$\psi_{a,b}(t) = |a|^{-1/2}\,\psi\!\left(\frac{t-b}{a}\right)$$

where $a$ and $b$ are real numbers with $a \neq 0$. For a time series $f(t) \in L^2(\mathbb{R})$, or finite-energy signal, the continuous wavelet transform (CWT) of $f(t)$ is defined as:

$$W_f(a,b) = |a|^{-1/2} \int_{-\infty}^{\infty} f(t)\,\psi^{*}\!\left(\frac{t-b}{a}\right)dt$$

If the scales and translations are chosen based on powers of two (dyadic scales and translations), the amount of data can be reduced considerably, resulting in more efficient data analysis. This transform is called the discrete wavelet transform (DWT) and can be defined as [12]:

$$\psi_{m,n}(t) = a_0^{-m/2}\,\psi\!\left(\frac{t - n b_0 a_0^{m}}{a_0^{m}}\right)$$

where $m$ and $n$ are integers that control the wavelet scale (dilation) and translation, respectively; $a_0$ is a specified fine scale step greater than 1; and $b_0$ is the location parameter, which must be greater than zero. The most common and simplest choice of parameters is $a_0 = 2$ and $b_0 = 1$.
This power-of-two logarithmic scaling of the dilations and translations is known as the dyadic grid arrangement and is the simplest and most efficient method for practical purposes. For a discrete time series $f(t)$ observed at integer time steps $t = 0, 1, \dots, N-1$, the discrete wavelet transform becomes:

$$W_{m,n} = 2^{-m/2} \sum_{t=0}^{N-1} f(t)\,\psi\!\left(2^{-m}t - n\right)$$

where $W_{m,n}$ is the wavelet coefficient of the discrete wavelet at scale $a = 2^{m}$ and location $b = 2^{m}n$.
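As an illustration of the dyadic decomposition, the following minimal Python sketch implements a multi-level DWT with the Haar wavelet. The Haar wavelet is used here only because it fits in a few lines; it is not the D-Meyer wavelet used in the experiments of this paper, and the function names and sample values are illustrative assumptions.

```python
import math

def haar_step(signal):
    """One level of the orthonormal Haar DWT: returns (approximation, detail)."""
    if len(signal) % 2:
        raise ValueError("signal length must be even at every level")
    s = 1.0 / math.sqrt(2.0)
    approx = [s * (signal[i] + signal[i + 1]) for i in range(0, len(signal), 2)]
    detail = [s * (signal[i] - signal[i + 1]) for i in range(0, len(signal), 2)]
    return approx, detail

def haar_decompose(signal, levels):
    """Multi-level decomposition, ordered [A_levels, D_levels, ..., D_1]."""
    approx, details = list(signal), []
    for _ in range(levels):
        approx, detail = haar_step(approx)
        details.append(detail)
    return [approx] + details[::-1]

def haar_reconstruct(coeffs):
    """Invert haar_decompose, starting from the coarsest level."""
    s = 1.0 / math.sqrt(2.0)
    approx = coeffs[0]
    for detail in coeffs[1:]:
        merged = []
        for a, d in zip(approx, detail):
            merged.extend([s * (a + d), s * (a - d)])
        approx = merged
    return approx

# Hypothetical closing prices (not the Shandong Gold data).
prices = [30.1, 30.4, 29.8, 29.5, 30.9, 31.2, 31.0, 30.6]
coeffs = haar_decompose(prices, levels=2)   # [A2, D2, D1]
restored = haar_reconstruct(coeffs)
```

Each level halves the number of coefficients, which is the data reduction that the dyadic grid provides, and the transform is invertible, so the original series can be recovered from the approximation and detail layers.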

Forecasting of Low-Frequency Signal Layers
The low-frequency data obtained by wavelet decomposition change slowly, so they can be regarded as a stationary time series, and the ARMA model is adopted to forecast them.
The ARMA model is usually applied to autocorrelated time series data. It is a useful tool for understanding and predicting the future values of a time series. ARMA consists of two parts: an autoregressive (AR) part and a moving average (MA) part [13]. The model is usually referred to as ARMA(p, q), where p and q are the orders of the AR and MA parts, respectively. A time series $\{L_t;\ t = 0, \pm 1, \pm 2, \dots\}$ is ARMA(p, q) if it is stationary and:

$$L_t = \sum_{i=1}^{p} \gamma_i L_{t-i} + e_t + \sum_{i=1}^{q} \theta_i e_{t-i}$$

The parameters p and q are called the autoregressive and the moving average orders, respectively; $\{e_t;\ t = 0, \pm 1, \pm 2, \dots\}$ is a Gaussian white noise sequence; and $\gamma_i$ and $\theta_i$ are constants.
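To make the recursion concrete, here is a minimal Python sketch of multi-step forecasting with a zero-mean ARMA(p, q) model whose coefficients are already known. It assumes the common sign convention $L_t = \sum_i \gamma_i L_{t-i} + e_t + \sum_i \theta_i e_{t-i}$, treats pre-sample values and shocks as zero, and does not estimate the coefficients; the function name and the sample numbers are hypothetical.

```python
def arma_forecast(series, gamma, theta, steps):
    """Forecast `steps` values ahead of a zero-mean ARMA(p, q) series.

    gamma: AR coefficients [gamma_1, ..., gamma_p]
    theta: MA coefficients [theta_1, ..., theta_q]
    """
    p, q = len(gamma), len(theta)

    # Recover the in-sample shocks e_t recursively from the observations.
    resid = []
    for t in range(len(series)):
        ar = sum(gamma[i] * series[t - 1 - i] for i in range(p) if t - 1 - i >= 0)
        ma = sum(theta[j] * resid[t - 1 - j] for j in range(q) if t - 1 - j >= 0)
        resid.append(series[t] - ar - ma)

    # Iterate the model forward; future shocks are replaced by their mean, zero.
    values, shocks = list(series), list(resid)
    for _ in range(steps):
        t = len(values)
        ar = sum(gamma[i] * values[t - 1 - i] for i in range(p) if t - 1 - i >= 0)
        ma = sum(theta[j] * shocks[t - 1 - j] for j in range(q) if t - 1 - j >= 0)
        values.append(ar + ma)
        shocks.append(0.0)
    return values[len(series):]

# Hypothetical low-frequency layer values and coefficients.
layer = [0.8, 0.5, 0.9, 0.4, 0.6]
forecast = arma_forecast(layer, gamma=[0.6], theta=[0.3], steps=3)
```

Beyond q steps ahead the moving average part no longer contributes, so the forecast is driven by the autoregressive part alone.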
The Akaike information criterion (AIC) can be applied to decide the order of the ARMA model. AIC is a measure of the goodness of fit of an estimated model. It is grounded in the concept of entropy, offering a relative measure of the information lost when a mathematical model is used to describe the actual data. AIC is a widely used tool for model selection: among the candidate models, the one with the lowest AIC is preferred. The AIC is computed from $\hat{\sigma}_e^2$, the estimated value of $\sigma_e^2$, which is the variance of the white noise $e_t$ of the ARMA model.
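The order selection step can be sketched as follows. This is a simplified illustration under stated assumptions: it searches pure AR(p) models only (not full ARMA), estimates the noise variance with the Levinson-Durbin recursion, and uses one common form of the criterion, AIC(p) = N ln(estimated noise variance) + 2p, which may differ from the exact form used in this paper. The simulated series and all function names are hypothetical.

```python
import math
import random

def autocovariance(series, max_lag):
    """Biased sample autocovariances r[0..max_lag]."""
    n = len(series)
    mean = sum(series) / n
    centered = [x - mean for x in series]
    return [sum(centered[t] * centered[t - k] for t in range(k, n)) / n
            for k in range(max_lag + 1)]

def ar_noise_variances(series, max_order):
    """Levinson-Durbin recursion: noise variance estimate for AR(0..max_order)."""
    r = autocovariance(series, max_order)
    variances = [r[0]]
    phi = []
    for k in range(1, max_order + 1):
        acc = r[k] - sum(phi[j] * r[k - 1 - j] for j in range(k - 1))
        reflection = acc / variances[-1]
        phi = [phi[j] - reflection * phi[k - 2 - j] for j in range(k - 1)] + [reflection]
        variances.append(variances[-1] * (1.0 - reflection ** 2))
    return variances

def select_ar_order(series, max_order):
    """Return the order with the lowest AIC and the list of AIC values."""
    n = len(series)
    variances = ar_noise_variances(series, max_order)
    aic = [n * math.log(v) + 2 * p for p, v in enumerate(variances)]
    best = min(range(len(aic)), key=aic.__getitem__)
    return best, aic

# Simulated AR(1) series standing in for a low-frequency layer.
random.seed(0)
series = [0.0]
for _ in range(499):
    series.append(0.8 * series[-1] + random.gauss(0.0, 1.0))
best_order, aic_values = select_ar_order(series, max_order=5)
```

The penalty term 2p grows with the order, so a higher-order model is preferred only when it reduces the estimated noise variance enough to compensate.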

Forecasting of Intermediate-Frequency Signal Layers
The intermediate-frequency data obtained by wavelet decomposition are non-stationary.
Figure 3. Back-propagation neural network.
output layer, the forward pattern is then compared with the correct (or observed) output pattern to calculate an error signal. The error signal for each such target output pattern is then back-propagated from the output layer to the input layer in order to tune the weights in each layer of the network. After a B-P network has learned the correct classification for a set of inputs, it can be tested on a second set of inputs to see how well it classifies untrained patterns. Thus, an important consideration in applying B-P learning is how well the network generalizes. The detailed algorithm can be found elsewhere [16] and is therefore omitted here.
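The forward pass, error signal and backward weight update described above can be sketched in a few lines of Python. This is a minimal illustration and not the network used in this paper: the architecture (one tanh hidden layer, one linear output), the sliding-window inputs, the learning rate and the sine-wave training series are all assumptions made for the example.

```python
import math
import random

def make_windows(series, window):
    """Build (inputs, target) pairs for one-step-ahead prediction."""
    return [(series[i:i + window], series[i + window])
            for i in range(len(series) - window)]

class BPNetwork:
    """One tanh hidden layer and a linear output unit, trained by back-propagation."""

    def __init__(self, n_in, n_hidden, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                   for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b2 = 0.0

    def forward(self, x):
        hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
                  for row, b in zip(self.w1, self.b1)]
        output = sum(w * h for w, h in zip(self.w2, hidden)) + self.b2
        return hidden, output

    def train_step(self, x, target, lr):
        hidden, output = self.forward(x)
        error = output - target                 # error signal at the output layer
        for j, h in enumerate(hidden):
            # Error propagated back to hidden unit j (uses w2 before its update).
            delta = error * self.w2[j] * (1.0 - h * h)
            self.w2[j] -= lr * error * h
            self.b1[j] -= lr * delta
            for i, xi in enumerate(x):
                self.w1[j][i] -= lr * delta * xi
        self.b2 -= lr * error
        return 0.5 * error * error

def train(net, samples, epochs, lr):
    """Run stochastic gradient descent and return the mean loss per epoch."""
    history = []
    for _ in range(epochs):
        total = sum(net.train_step(x, y, lr) for x, y in samples)
        history.append(total / len(samples))
    return history

# Synthetic stand-in for a high-frequency layer.
series = [math.sin(0.3 * t) for t in range(120)]
samples = make_windows(series, window=4)
net = BPNetwork(n_in=4, n_hidden=6)
losses = train(net, samples, epochs=200, lr=0.02)
```

To assess generalization as discussed above, the trained network would be evaluated on windows that were not used for training.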

Application and Results
The data for our experiments are Shandong Gold Group closing prices collected from the Shanghai Stock Exchange (SSE). The series contains 230 closing prices, from 28 June 2011 to 24 June 2012.
The first 200 observations were used to fit the models, and the last 30 observations were reserved for comparison with the predicted values. Figure 4 shows the original data series.
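The 200/30 split and the comparison between predicted and actual values can be sketched as follows. The error measures shown (MAE, RMSE and MAPE) are common choices and are an assumption here, since the text does not name the measures it uses; the price series is a synthetic placeholder, and the naive forecast only stands in for the output of the hybrid system.

```python
import math

def split_series(prices, n_train):
    """Split a series into a fitting part and a hold-out part."""
    return prices[:n_train], prices[n_train:]

def forecast_errors(actual, predicted):
    """Return MAE, RMSE and MAPE (in percent); actual values must be nonzero."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    n = len(actual)
    diffs = [p - a for a, p in zip(actual, predicted)]
    mae = sum(abs(d) for d in diffs) / n
    rmse = math.sqrt(sum(d * d for d in diffs) / n)
    mape = 100.0 * sum(abs(d / a) for d, a in zip(diffs, actual)) / n
    return {"mae": mae, "rmse": rmse, "mape": mape}

# Synthetic placeholder for the 230 closing prices.
prices = [30.0 + 0.01 * t for t in range(230)]
train_part, test_part = split_series(prices, n_train=200)

# Naive baseline: repeat the last fitted value over the hold-out period.
naive = [train_part[-1]] * len(test_part)
errors = forecast_errors(test_part, naive)
```

Reporting such measures alongside the comparison charts makes the agreement between forecasts and actual values quantitative.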
There are two criteria for the selection of the mother wavelet. Firstly, the shape and the mathematical expression of the wavelet must be chosen so that the physical interpretation of the wavelet coefficients is easy. Secondly, the chosen wavelet must allow a fast computation of the required wavelet coefficients. In this paper, the discrete approximation of the Meyer wavelet (D-Meyer) is therefore selected, as it admits a fast algorithm and supports the discrete transform [17]. The results at different scales are shown in Figure 5.
Figure 5 illustrates the five-level decomposition using D-Meyer. With the multi-scale forecasting system, the predicted results match the real data closely. Therefore, the multi-scale forecasting system was very effective.

Conclusion
The stock market data are highly random and non-stationary, and thus contain much noise [7] [8] [9]. Different forecasting models can complement each other in capturing the different patterns appearing in one time series, as a combination of forecasts outperforms individual forecasting models. In this paper, we construct a hybrid forecasting system on multiple scales. In this system, the original data are first decomposed into multiple layers by the wavelet transform, and those layers are divided into low-frequency, intermediate-frequency and high-frequency signal layers. Autoregressive moving average (ARMA) models are then employed to predict the future values of the low-frequency layers; Kalman filters are designed to predict the future values of the intermediate-frequency layers; and back-propagation (BP) neural network models are built on each high-frequency layer to predict its future values. Finally, the predictions are reconstructed and corrected. The empirical data set of Shandong Gold Group closing prices on the Shanghai Stock Exchange (SSE) from 28 June 2011 to 24 June 2012 is used to illustrate the application of the forecasting system.

Figure 1. The flow chart of the hybrid forecasting system based on multiple scales.
In the CWT, $W_f(a,b)$ is the wavelet coefficient and "*" denotes the complex conjugate. The wavelet transform measures the similarity between the time series and the wavelet function at different scales and translations, and generates a contour map of the wavelet coefficients $W_f(a,b)$, also known as a scalogram. The CWT generates a large amount of data for all $a$ and $b$, which is why the scales and translations are chosen on a dyadic grid.
Figure 2, have the highest number of vanishing moments, this family has been chosen for carrying out the wavelet-based multi-resolution analysis of the proposed sequences of data points.

Figure 2. Multi-resolution analysis leading to the 2-level decomposition of a signal S.

We can see from Figure 5(a) and Figure 5(b) that these layers mainly contain the trend component. Figure 5(c) and Figure 5(d) represent most of the weekly periodic components and stochastic components, and Figure 5(e) and Figure 5(f) represent most of the strongly periodic components and stochastic components. So A5 and D5 are low-frequency layers, D4 and D3 are intermediate-frequency layers, and D2 and D1 are high-frequency layers. A time series forecast can therefore be produced by forecasting the low-frequency, intermediate-frequency and high-frequency signal layers separately.
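The layer assignment above, and the final composition step, can be sketched as follows. The sketch assumes that each layer has been reconstructed to the length of the original series, so that the price forecast is the elementwise sum of the layer forecasts; the correction step mentioned in this paper is not modeled, and the numbers and names are hypothetical.

```python
# Frequency band of each wavelet layer, following the assignment in the text.
BANDS = {
    "low": ["A5", "D5"],
    "intermediate": ["D4", "D3"],
    "high": ["D2", "D1"],
}

# Forecasting model used for each band in the hybrid system.
MODEL_FOR_BAND = {
    "low": "ARMA",
    "intermediate": "Kalman filter",
    "high": "BP neural network",
}

def band_of(layer):
    """Return the frequency band that a layer name belongs to."""
    for band, layers in BANDS.items():
        if layer in layers:
            return band
    raise KeyError(f"unknown layer: {layer}")

def recombine(layer_forecasts):
    """Sum the per-layer forecasts elementwise (assumes additive layers)."""
    horizon = {len(values) for values in layer_forecasts.values()}
    if len(horizon) != 1:
        raise ValueError("all layer forecasts must cover the same horizon")
    return [sum(step) for step in zip(*layer_forecasts.values())]

# Hypothetical two-step-ahead forecasts for each layer.
layer_forecasts = {
    "A5": [30.0, 30.5], "D5": [0.5, -0.5],
    "D4": [0.25, 0.0],  "D3": [0.0, 0.25],
    "D2": [-0.25, 0.5], "D1": [0.5, -0.25],
}
price_forecast = recombine(layer_forecasts)
models_used = {layer: MODEL_FOR_BAND[band_of(layer)] for layer in layer_forecasts}
```

Keeping the band assignment in one table makes it easy to change which model handles which layer without touching the composition step.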

Figure 7. The comparison chart of the actual value and the forecasting results of intermediate-frequency layers: (a) the comparison chart of D4; (b) the comparison chart of D3.

Figure 8. The comparison chart of the actual value and the forecasting results of high-frequency layers: (a) the comparison chart of D2; (b) the comparison chart of D1.

Figure 9. The comparison chart of the actual value and the forecasting results.
Stock market data contain much noise, and the lack of a good forecasting model motivates us to find an improved forecasting method, called a hybrid forecasting system based on multiple scales. In this method, the original data are decomposed into multiple layers by the wavelet transform, and the layers are divided into low-frequency, intermediate-frequency and high-frequency signal layers. Autoregressive moving average (ARMA) models are then employed to predict the future values of the low-frequency layers; Kalman filters are designed to predict the future values of the intermediate-frequency layers; and back-propagation (BP) neural network models are built on each high-frequency layer to predict its future values. Real data are used to illustrate the application of the hybrid forecasting system based on multiple scales, and the simulation experiments give good results.