Hybrid ARIMA / RBF Framework for Prediction BUX Index

In this paper, we construct and implement a new architecture and learning method for a customized hybrid RBF neural network for forecasting high frequency time series data. The hybridization combines two modeling approaches. The first applies the ARCH (Autoregressive Conditionally Heteroscedastic)-GARCH (Generalized ARCH) methodology. The second is based on an RBF (Radial Basis Function) neural network that uses a Gaussian activation function with the cloud concept. Using both methods is useful because there is no prior knowledge about the relationship between the inputs to the system and its output. Both approaches are merged into one framework to produce the final forecast values. The question arises whether non-linear methods such as neural networks can help to model non-linearities inherent in the estimated statistical model. We also compare the customized RBF network with a machine learning method based on the SVM learning system. The proposed approach is applied to high frequency data of the BUX stock index time series. Our results show that the proposed approach achieves better forecast accuracy on the validation dataset than the ARIMA/EGARCH and SVM models.


Introduction
One of the reasons computers came to be applied in time series modeling was the study of Bollerslev [1], which demonstrated the existence of nonlinearity in high frequency financial data. Over the past ten years, statisticians and computer scientists have developed new forecasting techniques that help to predict future values of high frequency financial data. Some are based on probability theory, such as the Kalman filter, threshold autoregressive models, and the ARCH/GARCH family of models; others draw on the latest information technologies, such as probabilistic or belief networks and soft, neural and granular computing. At the same time, financial econometrics and statistics have undergone various new developments, especially in financial models and stochastic volatility, such as models for managing financial risk [2]-[4], methods based on extreme value theory [5], Lévy models [6], methods to assess and control financial risk, methods based on time intensity models, the use of copulas, and the implementation of risk systems [7].
The first machine learning techniques applied to time series forecasting were artificial neural networks (ANNs). Because an ANN is a universal approximator, it was believed that these models could perform tasks such as pattern recognition, classification or prediction [8]. Today, according to some studies [9], ANNs are the models with the greatest potential for predicting financial time series. The reason for the attractiveness of ANNs for financial prediction can be found in the work of Hill et al. [10], who showed that ANNs work best with high-frequency financial data. Lately, time series prediction has become one of the most important aspects of time series data mining and has received growing attention. While the first applications of ANNs to financial forecasting used the perceptron network, the simplest feed-forward neural network [11], nowadays it is mainly the RBF network [12] that is used, because RBF networks have been shown to be better approximators than perceptron-type networks [13].
First, in this article we analyse, discuss and compare the forecast accuracy of models derived from competing statistical and neural network specifications. Second, hybrid ARCH-GARCH and RBF NN (Radial Basis Function Neural Network) architectures for time series prediction are proposed, and their forecasting performance is evaluated and compared with the SVM (Support Vector Machine) approach. The aim of the paper is to explain the relevant aspects of both statistical and soft computing approaches for quantifying forecast accuracy, applied to the daily BUX index time series, and to assess the prediction performance of novel models based on the hybridization of these separate approaches.
The paper is organized as follows. In Section 2 we briefly describe the basics of ARCH-GARCH models and their variants, the EGARCH and PGARCH models. Section 3 presents the data, conducts some preliminary analysis of the time series and demonstrates the forecasting abilities of ARMA (AutoRegressive Moving Average)-ARCH/GARCH models. Section 4 introduces RBF neural networks and proposes a novel Evolutionary RBF Network (ERBFN) that is based on the RBF network and ARCH/GARCH models. Section 5 shows the forecasting performance of the SVM system. Section 6 presents an empirical comparison. Section 7 concludes the paper and proposes future work.

Some ARCH/GARCH Models for Financial Data
ARCH-GARCH models are designed to capture certain characteristics that are commonly associated with financial time series, among others fat tails, volatility clustering, persistence, mean reversion and the leverage effect. As far as fat tails are concerned, it is well known that the distributions of many high frequency financial series have fatter tails than the normal distribution.
The first model that provides a systematic framework for volatility modelling is the ARCH model, proposed by Engle [14]. Bollerslev [1] proposed a useful extension of Engle's ARCH model, known as the generalized ARCH (GARCH) model, for a time sequence $\{y_t\}$ in the following form

$$y_t = v_t \sqrt{h_t}, \qquad h_t = \alpha_0 + \sum_{i=1}^{p} \alpha_i y_{t-i}^2 + \sum_{i=1}^{q} \beta_i h_{t-i} \qquad (1)$$

where $\{v_t\}$ is a sequence of IID (independent and identically distributed) random variables with zero mean and unit variance, $\alpha_i$ and $\beta_i$ are the ARCH and GARCH coefficients, $p$ and $q$ denote the ARCH and GARCH orders, and $h_t$ represents the conditional variance of the time series, conditional on all the information available up to time $t-1$, $I_{t-1}$.
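The GARCH recursion can be illustrated with a short simulation. The following is a minimal sketch in plain Python, assuming a GARCH(1,1) specification $h_t = \alpha_0 + \alpha_1 y_{t-1}^2 + \beta_1 h_{t-1}$ with Gaussian $v_t$; the function names and parameter values are illustrative assumptions and are not taken from the paper.

```python
import math
import random


def simulate_garch11(n, alpha0, alpha1, beta1, seed=0):
    """Simulate y_t = v_t * sqrt(h_t) with a GARCH(1,1) conditional variance.

    Assumes Gaussian v_t and alpha1 + beta1 < 1 (illustrative choices).
    """
    rng = random.Random(seed)
    # Start the recursion at the unconditional variance.
    h_prev = alpha0 / (1.0 - alpha1 - beta1)
    y_prev_sq = h_prev
    ys, hs = [], []
    for _ in range(n):
        h = alpha0 + alpha1 * y_prev_sq + beta1 * h_prev
        y = rng.gauss(0.0, 1.0) * math.sqrt(h)
        ys.append(y)
        hs.append(h)
        h_prev, y_prev_sq = h, y * y
    return ys, hs


def garch11_variance(ys, alpha0, alpha1, beta1):
    """Recover the conditional variances h_t from an observed series."""
    h_prev = alpha0 / (1.0 - alpha1 - beta1)
    y_prev_sq = h_prev
    hs = []
    for y in ys:
        h = alpha0 + alpha1 * y_prev_sq + beta1 * h_prev
        hs.append(h)
        h_prev, y_prev_sq = h, y * y
    return hs


# Hypothetical parameter values, chosen only for illustration.
ys, hs = simulate_garch11(1000, alpha0=0.1, alpha1=0.1, beta1=0.8)
filtered = garch11_variance(ys, 0.1, 0.1, 0.8)
```

Because `garch11_variance` applies the same recursion to the simulated series, `filtered` reproduces `hs`. In an estimation setting the coefficients would instead be chosen by maximum likelihood.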
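In the literature several variants of the basic GARCH model (1) have been derived. Nelson [15] proposed the following exponential GARCH model, abbreviated as EGARCH, to allow for leverage effects in the form

$$\ln h_t = \alpha_0 + \sum_{i=1}^{p} \alpha_i \frac{|y_{t-i}| + \gamma_i y_{t-i}}{\sqrt{h_{t-i}}} + \sum_{i=1}^{q} \beta_i \ln h_{t-i} \qquad (2)$$

The basic GARCH model can also be extended to allow for leverage effects by treating it as a special case of the power GARCH (PGARCH) model proposed by Ding, Granger and Engle [16]

$$h_t^{d/2} = \alpha_0 + \sum_{i=1}^{p} \alpha_i \left( |y_{t-i}| + \gamma_i y_{t-i} \right)^d + \sum_{i=1}^{q} \beta_i h_{t-i}^{d/2} \qquad (3)$$

where $d$ is a positive exponent, $\gamma_i$ denotes the coefficient of leverage effects, and $p$ and $q$ are the numbers of ARCH and GARCH terms.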
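The leverage effect can be made concrete with a one-step PGARCH(1,1) update. This is only a sketch: it assumes the parametrization $\sigma_t^d = \alpha_0 + \alpha_1 (|y_{t-1}| + \gamma_1 y_{t-1})^d + \beta_1 \sigma_{t-1}^d$ with $|\gamma_1| < 1$, where $\sigma_t = \sqrt{h_t}$. The sign convention for $\gamma_1$ and all numerical values are assumptions made for illustration.

```python
def pgarch11_step(y_prev, sigma_prev, alpha0, alpha1, beta1, gamma1, d):
    """One PGARCH(1,1) update; returns the new conditional standard deviation.

    Assumed form: sigma_t^d = alpha0 + alpha1*(|y_{t-1}| + gamma1*y_{t-1})^d
                              + beta1*sigma_{t-1}^d, with |gamma1| < 1.
    """
    sigma_d = (alpha0
               + alpha1 * (abs(y_prev) + gamma1 * y_prev) ** d
               + beta1 * sigma_prev ** d)
    return sigma_d ** (1.0 / d)


# Hypothetical coefficients, chosen only for illustration.
params = dict(alpha0=0.05, alpha1=0.1, beta1=0.85)

# With a negative gamma1, a negative shock raises volatility more than a
# positive shock of the same size (the leverage effect).
after_drop = pgarch11_step(-2.0, 1.0, gamma1=-0.3, d=1.5, **params)
after_rise = pgarch11_step(+2.0, 1.0, gamma1=-0.3, d=1.5, **params)

# With d = 2 and gamma1 = 0 the update is the basic GARCH(1,1) recursion:
# h_t = 0.05 + 0.1 * 4 + 0.85 * 1 = 1.3.
garch_like = pgarch11_step(-2.0, 1.0, gamma1=0.0, d=2.0, **params) ** 2
```

Under this convention the asymmetry is carried entirely by $\gamma_1$, while $d$ controls which power of the volatility is modelled.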
Other ARCH-GARCH models, such as the ARCH-GARCH regression model and the ARCH-GARCH mean model, can be found in [17].

An Application of ARCH-GARCH Models
We illustrate the statistical ARCH-GARCH methodology on the daily BUX stock index¹, taken as a proxy for the Hungarian stock market, to study the development of a forecasting model. The sample period runs from January 7, 2004 to December 31, 2012 and contains 2255 observations (see Figure 1). This period was chosen deliberately to investigate forecasting accuracy over time, with special emphasis on behavior during the global financial crisis of 2008-09 and in the post-crisis period of 2010-12. To build a forecast model, we defined the sample period for analysis (January 2004 to June 2007, the so-called training data set), i.e. the period over which the forecasting model is developed, and the ex post forecast period (July 2007 to December 2012), the so-called validation or ex post data set. The accuracy of the model is calculated using only the actual and forecast values within the ex post forecasting period. It is clear from the time plots of both data sets that the series are not stationary, since their graphs show a trend, but after differencing they become stationary.
The main purpose of time series analysis is to understand the underlying mechanism that generates the observed data and, in turn, to forecast future values. Typically, these processes are described by a class of linear models called autoregressive integrated moving average (ARIMA) models. Tentative identification of an ARIMA time series model is done through analysis of actual historical data. The primary tools used in the identification process are the autocorrelation function (ACF) and the partial autocorrelation function (PACF). The theoretical ACF and PACF are unknown and must be estimated by the sample ACF and PACF.
The relevant lag structure of potential inputs was analyzed using traditional statistical tools, i.e. the ACF, the PACF and the Akaike information criterion (AIC): we looked for the maximum lag at which the PACF coefficient was statistically significant and for the lag giving the minimum AIC. According to these criteria, the model was specified as an ARIMA(1,1,0) process.
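The identification step can be sketched as follows: difference the series, then compute the sample ACF and the sample PACF (here through the Durbin-Levinson recursion) and compare them with the approximate 95% bound $1.96/\sqrt{n}$. This is a plain-Python illustration on simulated data, not the authors' code. The simulated series, whose first differences follow an AR(1) process with coefficient 0.6, is an assumption that stands in for the BUX data.

```python
import math
import random


def difference(y):
    """First differences y_t - y_{t-1}."""
    return [y[t] - y[t - 1] for t in range(1, len(y))]


def sample_acf(x, max_lag):
    """Sample autocorrelations r_0, ..., r_max_lag."""
    n = len(x)
    mean = sum(x) / n
    c0 = sum((v - mean) ** 2 for v in x) / n
    acf = [1.0]
    for k in range(1, max_lag + 1):
        ck = sum((x[t] - mean) * (x[t - k] - mean) for t in range(k, n)) / n
        acf.append(ck / c0)
    return acf


def sample_pacf(x, max_lag):
    """Sample partial autocorrelations via the Durbin-Levinson recursion."""
    r = sample_acf(x, max_lag)
    pacf, phi_prev = [1.0], []
    for k in range(1, max_lag + 1):
        num = r[k] - sum(phi_prev[j] * r[k - 1 - j] for j in range(k - 1))
        den = 1.0 - sum(phi_prev[j] * r[j + 1] for j in range(k - 1))
        phi_kk = num / den
        phi_prev = [phi_prev[j] - phi_kk * phi_prev[k - 2 - j]
                    for j in range(k - 1)] + [phi_kk]
        pacf.append(phi_kk)
    return pacf


# Simulated stand-in for the index: an integrated AR(1) process.
rng = random.Random(1)
level, step, y = 0.0, 0.0, []
for _ in range(3000):
    step = 0.6 * step + rng.gauss(0.0, 1.0)
    level += step
    y.append(level)

dy = difference(y)
pacf = sample_pacf(dy, 3)
bound = 1.96 / math.sqrt(len(dy))   # approximate 95% significance bound
significant_lags = [k for k in range(1, 4) if abs(pacf[k]) > bound]
```

For a series of this kind the PACF of the differences is expected to cut off after lag 1, which is the pattern that points to an ARIMA(1,1,0) specification.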
High frequency financial data, like our BUX index, reflect the stylized fact of variance changing over time. An appropriate model that accounts for conditional heteroscedasticity should be able to remove possible nonlinear patterns in the data. Various procedures are available to test for the existence of ARCH or GARCH effects. A commonly used test is the LM (Lagrange Multiplier) test [14]. The LM test performed on the BUX time series indicates the presence of autoregressive conditional heteroscedasticity. The coefficients of the ARCH or GARCH type model were estimated by the maximum likelihood procedure. The model was quantified by means of the R 2.6.0 software, which resulted in the ARIMA(1,1,0)/EGARCH(1,1,1) model, where the ARIMA(1,1,0) mean equation (4) is a first-order autoregression in $\Delta y_t$, the first differences of the BUX time series, with residuals $\varepsilon_t$, i.e. independent random variables drawn from a stable probability distribution with mean zero and variance $h_t$. The estimation results of the ARIMA(1,1,0)/EGARCH(1,1,1) process for the BUX index time series are given in Table 1 and Table 2.
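The LM test for ARCH effects can be sketched in a few lines. The version below assumes a single lag: the squared residuals are regressed on their first lag, and the statistic $T \cdot R^2$ is compared with the 5% critical value of the $\chi^2$ distribution with one degree of freedom (3.841). The simulated ARCH(1) residuals and all names are illustrative assumptions; the paper does not state how many lags were used.

```python
import math
import random

CHI2_1DF_5PCT = 3.841  # 5% critical value, chi-square with 1 degree of freedom


def arch_lm_statistic(resid):
    """Engle's LM statistic with one lag: T * R^2 of e_t^2 on e_{t-1}^2."""
    sq = [e * e for e in resid]
    x, z = sq[:-1], sq[1:]
    n = len(z)
    mx, mz = sum(x) / n, sum(z) / n
    sxx = sum((a - mx) ** 2 for a in x)
    szz = sum((b - mz) ** 2 for b in z)
    sxz = sum((a - mx) * (b - mz) for a, b in zip(x, z))
    r_squared = sxz * sxz / (sxx * szz)
    return n * r_squared


# Simulated residuals with ARCH(1) variance h_t = 0.5 + 0.5 * e_{t-1}^2.
rng = random.Random(42)
resid, prev = [], 0.0
for _ in range(2000):
    h = 0.5 + 0.5 * prev * prev
    prev = rng.gauss(0.0, 1.0) * math.sqrt(h)
    resid.append(prev)

lm = arch_lm_statistic(resid)
reject_homoscedasticity = lm > CHI2_1DF_5PCT
```

A statistic above the critical value rejects the hypothesis of constant conditional variance, which is the outcome reported for the BUX series.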
Graphs of the fitted and forecast values for the estimation and ex post periods are presented in Figure 2.

Evolutionary RBF Network
The structure of a neural network is defined by its architecture. Figure 3 depicts the architecture of the classic RBF NN, where each circle or node represents a neuron. This neural network consists of an input layer with input vector $x$, a hidden layer of RBF neurons, and an output layer with the output value $\hat{y}_t$. The output signals of the hidden layer are calculated as

$$o_j = \psi_2\left(\lVert x - w_j \rVert\right), \quad j = 1, 2, \ldots, s \qquad (5)$$

where $x$ is a $k$-dimensional neural input vector, $w_j$ represents the hidden layer weights, and $\psi_2$ are radial basis (Gaussian) activation functions. Note that for an RBF network, the hidden layer weights $w_j$ represent the centres $c_j$ of the activation functions $\psi_2$. To find the weights $w_j$, or centres of the activation functions, we used the adaptive (learning) version of the K-means clustering algorithm for $s$ clusters [17]. The RBF network computes the output data set as

$$\hat{y}_t = \sum_{j=1}^{s} v_j \psi_2(x_t, c_j) = \sum_{j=1}^{s} v_j o_{j,t}, \quad t = 1, 2, \ldots, N \qquad (6)$$

where $N$ is the size of the training data sample, $s$ denotes the number of hidden layer neurons (RBF neurons), $v_j$ are the output layer weights, and $\hat{y}_t$ corresponds to the estimated value of the BUX index. To improve the abstraction ability of classic RBF neural networks with the architecture depicted in Figure 3, we replaced the standard Gaussian activation (membership) function of the RBF neurons with functions based on the normal cloud concept (see [18]). Then, in the case of the RBF network, the Gaussian membership function $\psi_2(\cdot/\cdot)$ takes the form

$$\psi_2(x_t / c_j) = \exp\left[-\frac{(x_t - E(x_j))^2}{2 (E_n')^2}\right] = \exp\left[-\frac{(x_t - c_j)^2}{2 (E_n')^2}\right] \qquad (7)$$

where $E_n'$ is a normally distributed random number with mean $E_n$ (entropy) and standard deviation $H_e$ (hyper entropy), and $E$ is the expectation operator.
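The difference between the classic Gaussian neuron and the cloud-based neuron can be sketched in code. The example assumes that the hidden-layer signal is $\exp(-\lVert x - c_j \rVert^2 / 2\sigma^2)$ and that the cloud variant replaces the fixed width $\sigma$ by a random width drawn from a normal distribution with mean En and standard deviation He. The centres, weights and widths below are hypothetical values, not estimates from the paper.

```python
import math
import random


def squared_distance(x, c):
    return sum((xi - ci) ** 2 for xi, ci in zip(x, c))


def gaussian_activation(x, c, sigma):
    """Classic Gaussian RBF neuron centred at c with width sigma."""
    return math.exp(-squared_distance(x, c) / (2.0 * sigma ** 2))


def cloud_activation(x, c, en, he, rng):
    """Normal-cloud RBF neuron: the width En' is drawn from N(En, He^2)."""
    en_prime = rng.gauss(en, he)
    if en_prime == 0.0:  # guard against a degenerate zero width
        en_prime = en
    return math.exp(-squared_distance(x, c) / (2.0 * en_prime ** 2))


def rbf_output(x, centres, v, activation):
    """Network output: weighted sum of the hidden-layer signals o_j."""
    return sum(vj * activation(x, cj) for vj, cj in zip(v, centres))


rng = random.Random(0)
centres = [[0.0, 0.0], [1.0, 1.0]]   # hypothetical K-means centres
v = [0.4, 0.6]                       # hypothetical output weights
x = [0.5, 0.2]                       # one input vector

y_classic = rbf_output(
    x, centres, v, lambda a, c: gaussian_activation(a, c, 0.8))
y_cloud = rbf_output(
    x, centres, v, lambda a, c: cloud_activation(a, c, 0.8, 0.05, rng))
# With He = 0 the cloud neuron reduces to the classic Gaussian neuron.
y_no_noise = rbf_output(
    x, centres, v, lambda a, c: cloud_activation(a, c, 0.8, 0.0, rng))
```

The hyper entropy He therefore controls how far the cloud network departs from the classic RBF network: the larger He is, the more the neuron widths vary from one evaluation to the next.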
Recently, many scholars have developed hybrid forecasting systems []. Examples include models for financial data that focus on parametric structural models, including logit or probit models of economic warning indicators to predict crises [19] [20], and models that utilize techniques of computational intelligence such as ANNs, fuzzy logic systems and genetic algorithms, artificial intelligence and machine learning [21].
Next, we combine the RBF neural network with the architecture depicted in Figure 3 and the statistical ARIMA(1,1,0)/EGARCH(1,1,1) model, with the estimated coefficients given in Table 1 and Table 2 respectively, in one unified framework. The scheme of the proposed hybrid model is depicted in Figure 4. The idea behind this proposal comes from the economic theory of co-integrated variables, which are related by an error correction model [22]. The simple mean Equation (4) can be interpreted as the long-run relationship, and thus it entails a systematic co-movement between the variables $y_t$ and $y_{t-3}$. A long-run relationship will often hold "on average" over time [23]. If a stable long-run relationship exists, then the error (residual) $\varepsilon_t$ from Equation (4) should be a useful additional explanatory variable for the next direction of movement of $y_t$. According to [23], this mechanism is called the error correction mechanism.
The hybrid model consists of the following components. The external inputs represent the input data, in our case the historical values of the BUX index. The input data enter the ARIMA forecasting model, which produces one output: the ex post residuals. These residuals, together with the external inputs, enter the hybrid ERBF NN forecasting model, which generates the ERBF NN ex post forecasts.
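The data flow of the hybrid scheme can be sketched as follows: a linear model is fitted to the differenced series, and its residuals are passed to the network together with the lagged observations. The example is a simplification under stated assumptions. It fits the AR(1) mean equation by ordinary least squares rather than by the maximum likelihood ARIMA/EGARCH estimation used in the paper, and the choice of one lagged difference and one lagged residual as network inputs is hypothetical.

```python
def fit_ar1(dy):
    """Least-squares fit of dy_t = phi0 + phi1 * dy_{t-1} + e_t."""
    x, z = dy[:-1], dy[1:]
    n = len(x)
    mx, mz = sum(x) / n, sum(z) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxz = sum((a - mx) * (b - mz) for a, b in zip(x, z))
    phi1 = sxz / sxx
    phi0 = mz - phi1 * mx
    # resid[k] is the residual of the equation for dy[k + 1].
    resid = [b - phi0 - phi1 * a for a, b in zip(x, z)]
    return phi0, phi1, resid


def hybrid_inputs(dy, resid):
    """Pair each target dy_t with (dy_{t-1}, e_{t-1}) as network inputs."""
    rows, targets = [], []
    for t in range(2, len(dy)):
        # resid[k] belongs to dy[k + 1], so e_{t-1} is resid[t - 2].
        rows.append((dy[t - 1], resid[t - 2]))
        targets.append(dy[t])
    return rows, targets


dy = [1.0, -0.5, 0.8, 0.3, -1.2, 0.4, 0.9, -0.7]   # made-up differences
phi0, phi1, resid = fit_ar1(dy)
rows, targets = hybrid_inputs(dy, resid)
```

Each row in `rows` would then be fed to the RBF network as its input vector, with the corresponding entry of `targets` as the desired output.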
As can be seen in Figure 4, the proposed hybrid forecasting system combines two or more prediction models. Several types of hybrid forecasting systems can be proposed, depending on the combination and utilization of the components included in the system. For example, we also investigated another hybrid system in which the ERBF NN was replaced by the classic perceptron-type neural network with one hidden layer. The architecture of this neural network is similar to the classic RBF NN architecture. The difference is that all processing neurons in the hidden layer have tanh activation functions and all the weights $w_{rj}$, $v_j$ are adapted using the classical back-propagation algorithm.

Support Vector Machines Learning
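Despite the fact that RBF neural networks possess a number of attractive properties, such as the universal approximation ability and a parallel structure, they still suffer from problems such as the existence of many local minima and the lack of a clear rule for choosing the number of hidden units. Support Vector Machines (SVMs) are learning systems that use a hypothesis space of linear functions in a high dimensional feature space, trained with a learning algorithm from optimization theory that implements a learning bias derived from statistical learning theory. SVMs were introduced by Vapnik [24]. Support Vector Regression (SVR) is an extension of the support vector machine algorithm for numeric prediction. Its decision boundary can be expressed with a few support vectors. When used with kernel functions, it can create complex nonlinear decision boundaries while reducing the computational complexity. Nonlinear SVR is frequently interpreted by using the training data set $\{x_i, y_i\}_{i=1}^{N}$ to estimate a function of the form

$$f(x) = w^T \varphi(x) + b$$

where $\varphi(\cdot)$ maps the inputs to a higher dimensional feature space, $w$ is the weights vector and $b$ is a bias term. The estimate minimizes the empirical risk, measured with the $\varepsilon$-insensitive loss function, and the weights vector norm $\lVert w \rVert$ simultaneously. This leads to the optimization problem

$$\min_{w, b, \xi, \xi^*} \; \frac{1}{2} w^T w + C \sum_{i=1}^{N} (\xi_i + \xi_i^*)$$

subject to

$$y_i - w^T \varphi(x_i) - b \le \varepsilon + \xi_i, \qquad w^T \varphi(x_i) + b - y_i \le \varepsilon + \xi_i^*, \qquad \xi_i, \xi_i^* \ge 0, \quad i = 1, \ldots, N$$

where $\xi_i$, $\xi_i^*$ are positive slack variables and $C$ is a regularization parameter which controls the trade-off between the approximation error and the weights vector norm.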
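Finally, the SVR nonlinear function estimation takes the form

$$f(x) = \sum_{i=1}^{N} (\alpha_i - \alpha_i^*) K(x, x_i) + b$$

where $\alpha_i$, $\alpha_i^*$ are the Lagrange multipliers of the dual problem and the so-called kernel trick $K(x_i, x_j) = \varphi(x_i)^T \varphi(x_j)$, with $\varphi(\cdot)$ the mapping to the feature space, was applied within the formulation of this quadratic programming problem. Note that in the case of RBF kernels, the parameters $C$, $\sigma$ and $\varepsilon$ are to be considered as additional tuning parameters.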
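The three ingredients of SVR with an RBF kernel can be sketched in code: the $\varepsilon$-insensitive loss, the kernel $K(x, z) = \exp(-\lVert x - z \rVert^2 / 2\sigma^2)$, and the prediction as a kernel expansion over the support vectors. The example does not solve the quadratic programming problem. The support vectors, dual coefficients and tuning parameters are hypothetical values set by hand, standing in for the result of a fitted model.

```python
import math


def eps_insensitive_loss(y, f, eps):
    """Zero inside the eps-tube around the target, linear outside it."""
    return max(0.0, abs(y - f) - eps)


def rbf_kernel(x, z, sigma):
    """Gaussian (RBF) kernel with width sigma."""
    d2 = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-d2 / (2.0 * sigma ** 2))


def svr_predict(x, support_vectors, dual_coefs, b, sigma):
    """f(x) = sum_i (alpha_i - alpha_i*) K(x, x_i) + b."""
    return b + sum(a * rbf_kernel(x, sv, sigma)
                   for a, sv in zip(dual_coefs, support_vectors))


# Hypothetical fitted quantities, chosen by hand for illustration only.
support_vectors = [[0.0], [1.0]]
dual_coefs = [0.5, -0.25]          # alpha_i - alpha_i*, each bounded by C
b, sigma, eps = 0.1, 1.0, 0.05

f0 = svr_predict([0.0], support_vectors, dual_coefs, b, sigma)
loss_inside = eps_insensitive_loss(1.00, 1.03, eps)    # error within the tube
loss_outside = eps_insensitive_loss(1.00, 1.20, eps)   # error beyond the tube
```

Only training points whose errors reach the edge of the $\varepsilon$-tube receive non-zero dual coefficients, which is why the fitted function depends on a subset of the data.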

Results and Empirical Comparison
The ERBFN and SVM models were trained using the same variables and data sets as the ARIMA(1,1,0)/EGARCH(1,1,1) model above. In the ERBFN framework, the non-linear forecasting function $f(x)$ was estimated according to expression (6) with the RB function $\psi_2(\cdot/\cdot)$ given by (7). Graphs of the forecast values for the validation data sets are presented in Figure 5 (ERBFN model) and Figure 6 (SVM model) respectively. Table 3 presents the accuracy results of the three prediction methods. As can be seen from Table 3, all models follow the pattern of the actual values very closely. The MAPE was approximately 1.6 percent for the ARIMA/EGARCH model, 1.2 percent for the ERBF network and 1.4 percent for the SVM model.
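For reference, the MAPE used to compare the models is the mean of the absolute forecast errors expressed as a percentage of the actual values. A minimal sketch with made-up index levels (not BUX data) is:

```python
def mape(actual, forecast):
    """Mean absolute percentage error in percent; actual values must be non-zero."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("series must be non-empty and of equal length")
    total = sum(abs((a - f) / a) for a, f in zip(actual, forecast))
    return 100.0 * total / len(actual)


actual = [20000.0, 20500.0, 19800.0]      # made-up index levels
forecast = [20200.0, 20295.0, 19998.0]    # each forecast is off by 1 percent
error = mape(actual, forecast)
```

Here every forecast deviates from the actual value by exactly 1 percent, so `error` is 1.0 up to rounding. On this scale, the reported values of 1.2 to 1.6 percent mean that the forecasts deviate from the index by well under 2 percent on average.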
Table 3 shows that all the forecasting models used are very accurate. The development of the error rates on the validation data sets showed a high inherent deterministic relationship between the underlying variables. Though promising results have been achieved with both approaches, for chaotic financial markets a purely linear (statistical) approach to modeling relationships does not reflect reality. The hybrid system based on the ERBFN detected not only the functional relationship between the underlying variables and the BUX index but also the short-run dynamics.

Conclusions
In the present paper we proposed two approaches for predicting the BUX time series. The first was based on the latest statistical ARIMA/ARCH methodologies; the second was based on a neural version of the statistical model and on SVR.
The demonstration established that the forecasting model based on SVR predicts high frequency financial data better than the ARIMA/ARCH model for the BUX index time series. In the direct comparison of forecast accuracy between the statistical ARCH-GARCH forecasting models and their neural representation, the experiment with high frequency financial data indicates that all investigated methodologies yield very small MAPE (Mean Absolute Percentage Error) values. Moreover, our experiments show that neural forecasting systems are economically and computationally very efficient and well suited for high frequency forecasting. They are therefore suitable for financial institutions, companies, and medium and small enterprises. In future research we plan to extend the presented methodologies by applying fuzzy logic systems to incorporate structured human knowledge into workable learning algorithms.

Figure 1. Graph of the real BUX index.

¹ You can obtain these data from the web pages at http://www.global-view.com/forex-trading-tools/forex-history.

Figure 2. The actual and fitted values for the BUX index, statistical approach (validation data set Jun 2007-Dec 2012).

Figure 4. The scheme of the proposed hybrid forecasting model (see text for details).
features (the input data are projected to a higher dimensional feature space). In order to perform SVR one optimizes the cost function

Figure 5. The actual and fitted values for the BUX index, ERBFN model (validation data set Jun 2007-Dec 2012).

Figure 6. The actual and fitted values for the BUX index, SVM model (validation data set Jun 2007-Dec 2012).