Application of SVR Models in Stock Index Forecast Based on Different Parameter Search Methods

Stock index forecasting is regarded as a challenging task in financial time-series prediction. In this paper, the non-linear support vector regression (SVR) method was optimized for application in stock index prediction. The parameters (C, σ) of the SVR models were selected by three different methods: grid search (GRID), particle swarm optimization (PSO) and genetic algorithm (GA). The optimized parameters were used to predict the opening price of the test samples. The predictive results showed that the SVR model with GRID (GRID-SVR), the SVR model with PSO (PSO-SVR) and the SVR model with GA (GA-SVR) were able to capture the time-dependent trend of the stock index with high prediction accuracy. The GA-SVR model gave the minimum root mean square error (RMSE) of 15.630 and the minimum mean absolute percentage error (MAPE) of 0.39%, with the corresponding optimal parameters (C, σ) identified as (45.422, 0.012). The modeling results provide a theoretical and technical reference for investors in devising better trading strategies.


Introduction
Stock index forecasting deals with a non-linear dynamic system. Many factors affect the stock index, which therefore fluctuates in a complex manner [1]. Calculating the stock index in order to avoid investment risk has become a popular research issue [2]. K-curve analysis was successfully applied to predict the trend of stock prices [3], but it could not provide quantitative calculations. To forecast the stock index price quantitatively, traditional time-series models were introduced, such as the autoregressive moving average model, which still failed in non-linear and non-stationary prediction [4]. At present, the research methods for stock index prediction range from time-series models to artificial intelligence.
A variety of machine learning methods have been applied to stock index forecasting. They perform well owing to their merits of self-organization, self-learning and nonlinear approximation [5]. Support vector regression (SVR) is a simple, globally optimized machine learning method; it uses a nonlinear mapping to transform low-dimensional data into a high-dimensional space, so that the data can be explained by a set of linear functions [6]. Improvement of SVR models depends on parameter optimization (the regularization parameter C) and the selection of the kernel function [7] [8]. In this paper, the SVR models were trained by optimizing C and the kernel function.
The radial basis function (RBF) was selected as the kernel. An overly large value of C may reduce the prediction ability of SVR models, and the RBF kernel width (σ) commonly influences the model complexity [9] [10].
The CSI 300 index is a capitalization-weighted stock market index designed to replicate the performance of 300 stocks traded on the Shanghai and Shenzhen stock exchanges. We established SVR calibration models to predict the opening price of the CSI 300 index. To find the optimal combination of the modeling parameters (C, σ), we applied the grid search (GRID) method, particle swarm optimization (PSO) and genetic algorithm (GA) in turn for parameter selection. The opening price of the CSI 300 index was then predicted using the best parameters of the SVR model with GRID (GRID-SVR), the SVR model with PSO (PSO-SVR) and the SVR model with GA (GA-SVR). The rest of the paper is organized as follows. Section 1 describes the methodology, Section 2 presents the SVR modeling process and prediction results, and Section 3 contains the conclusions.

Data Acquisition and Pretreatment
Daily trading data of the CSI 300 index were scraped from the Wind Financial Terminal One-Stop Platform. The daily trading data from January 4, 2013 to November 30, 2016 were selected as the sample set (949 days in total). The data originally include eight variables: opening price, highest price, lowest price, closing price, charge rate, volume, turnover and the margin balance of the China stock markets. We further calculated the 5-day average charge rate, the 20-day average charge rate, the 5-day average volume and the 20-day average

Model Evaluation Indices
Four fifths of the total samples were selected for training, and the remaining one fifth for testing. The calibration models were trained on the training samples, and their parameters were optimized. The trained models with their optimized parameters were then applied to predict the opening price of the test samples. The prediction performance was evaluated using the root mean square error (RMSE) and the mean absolute percentage error (MAPE), defined as follows:
$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{t=1}^{n}\left(y_t - \hat{y}_t\right)^2}$$
$$\mathrm{MAPE} = \frac{1}{n}\sum_{t=1}^{n}\left|\frac{y_t - \hat{y}_t}{y_t}\right| \times 100\%$$
where $y_t$ represents the real opening price, $\hat{y}_t$ represents the predicted opening price and $n$ is the total number of samples.
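The split and the two evaluation indices can be sketched in a few lines of Python. This is a minimal illustration, not the code used in the study: it assumes the split is chronological (the first four fifths for training), which the text does not state, and the function names and sample values are hypothetical.

```python
import math

def chronological_split(samples, train_fraction=0.8):
    """Split a time-ordered list into a leading training part and a trailing test part.
    Assumption: the split is chronological; the source only gives the 4:1 ratio."""
    cut = int(len(samples) * train_fraction)
    return samples[:cut], samples[cut:]

def rmse(actual, predicted):
    """Root mean square error between two equal-length sequences."""
    n = len(actual)
    return math.sqrt(sum((y - p) ** 2 for y, p in zip(actual, predicted)) / n)

def mape(actual, predicted):
    """Mean absolute percentage error in percent; actual values must be non-zero."""
    n = len(actual)
    return 100.0 * sum(abs((y - p) / y) for y, p in zip(actual, predicted)) / n

# Illustrative numbers only, not data from the study.
actual = [3300.0, 3320.0, 3310.0, 3350.0]
predicted = [3290.0, 3330.0, 3310.0, 3340.0]

train, test = chronological_split(list(range(949)))
print(len(train), len(test))
print(rmse(actual, predicted), mape(actual, predicted))
```

Because MAPE divides by the real value, it is only defined when no real opening price is zero, which holds for an index level.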

SVR Modeling and Parameter Search Method
Stock index prediction requires establishing an optimal prediction function, based on historical stock data and other interfering factors, to calculate the stock index price and reveal the trend of the stock index. The function is defined as follows:
$$y_{t+1} = f(x_1, x_2, \ldots, x_t)$$
where $y_{t+1}$ represents the next-day opening price and $x_1, x_2, \ldots, x_t$ are the input samples.
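One simple way to turn this definition into training pairs is to align each day's input vector with the opening price of the following day. The sketch below assumes a single-day feature vector per sample; the helper name `make_next_day_pairs` and the toy values are hypothetical and not taken from the study.

```python
def make_next_day_pairs(features, opening_prices):
    """Pair the feature vector of day t with the opening price of day t+1.
    Assumption: each sample uses only the variables of one trading day."""
    if len(features) != len(opening_prices):
        raise ValueError("features and opening_prices must have equal length")
    inputs = features[:-1]        # days 1 .. T-1
    targets = opening_prices[1:]  # opening prices of days 2 .. T
    return inputs, targets

# Toy data: two variables per day, three trading days.
features = [[3300.0, 1.2], [3320.0, 0.9], [3310.0, 1.1]]
opening = [3300.0, 3320.0, 3310.0]

X, y = make_next_day_pairs(features, opening)
print(X, y)
```

The last day has no next-day target, so one sample is lost in the alignment.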
The SVR algorithm is used to estimate the function. The input data are mapped onto a high-dimensional feature space using the RBF kernel, and SVR is formulated as the minimization of a constrained optimization problem. In the GRID search, the optimal parameters (C, σ) are determined by searching for the minimum MSE over the two-dimensional network of candidate (C, σ) values.
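The grid search over (C, σ) with cross-validated MSE can be sketched as follows. This is a hedged illustration: the power-of-two grid, the contiguous fold layout, the hypothetical `fit_predict` callable standing in for SVR training, and the toy objective are all assumptions, since the tuning range and implementation are not given here.

```python
import itertools

def k_fold_indices(n_samples, k=5):
    """Yield (train_idx, val_idx) for k contiguous folds (fold layout is an assumption)."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        val = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, val
        start += size

def make_cv_mse(fit_predict, X, y, k=5):
    """Build cv_mse(C, sigma) from a hypothetical fit_predict(train_X, train_y, val_X, C, sigma)."""
    def cv_mse(C, sigma):
        squared_error = 0.0
        for train_idx, val_idx in k_fold_indices(len(y), k):
            preds = fit_predict([X[i] for i in train_idx], [y[i] for i in train_idx],
                                [X[i] for i in val_idx], C, sigma)
            squared_error += sum((y[i] - p) ** 2 for i, p in zip(val_idx, preds))
        return squared_error / len(y)  # MSE pooled over all held-out samples
    return cv_mse

def grid_search(cv_mse, c_values, sigma_values):
    """Return (best_mse, best_C, best_sigma) over the Cartesian grid."""
    best = None
    for C, sigma in itertools.product(c_values, sigma_values):
        score = cv_mse(C, sigma)
        if best is None or score < best[0]:
            best = (score, C, sigma)
    return best

# Hypothetical grid of powers of two.
c_grid = [2.0 ** e for e in range(-2, 11)]
sigma_grid = [2.0 ** e for e in range(-10, 3)]

# Stand-in objective with a known minimum, used only to exercise the search.
def toy_cv_mse(C, sigma):
    return (C - 16.0) ** 2 + (sigma - 0.25) ** 2

best_mse, best_C, best_sigma = grid_search(toy_cv_mse, c_grid, sigma_grid)
print(best_C, best_sigma)
```

In practice `toy_cv_mse` would be replaced by `make_cv_mse(...)` wrapped around an actual SVR fit, so that every grid node costs k model trainings.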
In the PSO search process, a group of particles (candidate solutions for C and σ) was randomly initialized [11] [12]. The optimal solution was found by iteration, with the training result (i.e., the minimum MSE) as the fitness value. In each iteration, the velocity and position of each particle were updated with reference to the individual best and global best solutions, i.e., those with the minimum MSE found so far. They were renewed by an iterative update equation, in which $x_d$ represents the position of the particle, $v_d$ represents the velocity of the particle, $t$ is the iteration number, $i$ indexes the particles, $\omega$ is the velocity weight, and $c_1$ and $c_2$ are learning factors. In this study, $c_1$ and $c_2$ were both set to 2 and $\omega$ was set to 0.5. The globally optimal and individually optimal positions of the particle swarm were found after 100 iterations, and the corresponding optimized parameters (C, σ) were thereby determined.
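A minimal sketch of a standard PSO update is given below to make the procedure concrete. The values ω = 0.5, c1 = c2 = 2 and 100 iterations follow the text; the swarm size, the search bounds, the uniform random factors r1 and r2, the clamping of positions to the bounds and the stand-in objective are assumptions for illustration, and the update rule is the textbook form, which may differ in detail from the one used in the study.

```python
import random

def pso_minimize(objective, bounds, n_particles=20, n_iter=100,
                 omega=0.5, c1=2.0, c2=2.0, seed=0):
    """Textbook PSO: v <- omega*v + c1*r1*(pbest - x) + c2*r2*(gbest - x); x <- x + v."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                 # individual best positions
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]  # global best position and value
    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (omega * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                lo, hi = bounds[d]
                # Assumption: positions are clamped to the search bounds.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val

# Stand-in for the cross-validated MSE as a function of (C, sigma).
def toy_objective(params):
    C, sigma = params
    return ((C - 40.0) / 100.0) ** 2 + (sigma - 0.1) ** 2

bounds = [(0.1, 100.0), (0.001, 1.0)]  # hypothetical search ranges for C and sigma
best_params, best_value = pso_minimize(toy_objective, bounds)
print(best_params, best_value)
```

In an SVR setting, `toy_objective` would be replaced by a function that trains the model with the given (C, σ) and returns its cross-validated MSE.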
In the GA search process, a population of chromosomes was randomly initialized [13] [14] [15]. A new population was then generated by selection, crossover and mutation. The candidates evolved toward better solutions, with the MSE used as the fitness value in the iterative calculation. The evolution terminated when the number of generations reached the maximum of 100. The best SVR training model was obtained with the optimal parameters (C, σ).
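The selection, crossover and mutation loop can be sketched as below. The limit of 100 generations follows the text; everything else is an assumption made for illustration: a real-valued encoding of (C, σ), tournament selection, arithmetic crossover, Gaussian mutation, elitism, the population size, the rates, the bounds and the stand-in objective. The study may well have used a different encoding or different operators.

```python
import random

def ga_minimize(objective, bounds, pop_size=20, n_generations=100,
                crossover_rate=0.8, mutation_rate=0.1, seed=0):
    """Real-coded GA sketch with tournament selection and elitism (all assumed choices)."""
    rng = random.Random(seed)

    def tournament(pop, fits):
        # Pick two random individuals and keep the one with the lower MSE.
        a, b = rng.randrange(len(pop)), rng.randrange(len(pop))
        return pop[a] if fits[a] <= fits[b] else pop[b]

    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    fits = [objective(ind) for ind in pop]
    for _ in range(n_generations):
        elite = min(range(pop_size), key=lambda i: fits[i])
        new_pop = [pop[elite][:]]  # elitism: carry the current best forward unchanged
        while len(new_pop) < pop_size:
            p1, p2 = tournament(pop, fits), tournament(pop, fits)
            if rng.random() < crossover_rate:
                w = rng.random()   # arithmetic crossover: weighted average of parents
                child = [w * a + (1.0 - w) * b for a, b in zip(p1, p2)]
            else:
                child = p1[:]
            for d, (lo, hi) in enumerate(bounds):
                if rng.random() < mutation_rate:
                    child[d] += rng.gauss(0.0, 0.1 * (hi - lo))  # Gaussian mutation
                child[d] = min(max(child[d], lo), hi)            # keep within bounds
            new_pop.append(child)
        pop = new_pop
        fits = [objective(ind) for ind in pop]
    best = min(range(pop_size), key=lambda i: fits[i])
    return pop[best], fits[best]

# Stand-in for the cross-validated MSE as a function of (C, sigma).
def toy_objective(params):
    C, sigma = params
    return ((C - 40.0) / 100.0) ** 2 + (sigma - 0.1) ** 2

bounds = [(0.1, 100.0), (0.001, 1.0)]  # hypothetical search ranges for C and sigma
best_params, best_value = ga_minimize(toy_objective, bounds)
print(best_params, best_value)
```

Because the best individual is copied unchanged into each new generation, the best fitness never gets worse from one generation to the next.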
The GRID, PSO and GA methods were each applied to SVR parameter optimization. The flow charts of the experimental algorithms are shown in Figure 2, which depicts the entire modeling process.

Results and Discussion
The trading data of CSI 300 index from January 4, 2013 to November 30, 2016 [...]
In the GA-SVR parametric tuning, a group of candidate solutions was randomly initialized and iteratively evolved into a more appropriate group of solutions by genetic selection, crossover and mutation. Figure 5 shows the MSE at each iteration. After 99 iterations, the minimum MSE was found as [...]
In summary, the optimally selected parameters (C, σ) of the SVR models were determined for GRID, PSO and GA optimization, respectively. The GRID-SVR, PSO-SVR and GA-SVR calibration models with their corresponding optimal values of (C, σ) were applied to predict the validation samples. The parameters and prediction results are both presented in Table 1. As is shown in Table 1, the [...]

Conclusion
In this study, SVR models with GRID, PSO and GA parametric optimization were applied to predict the opening price of the CSI 300 index. The optimal parameters (C, σ) were selected as (256, 0.008), (60.576, 0.010) and (45.422, 0.012) for the GRID-SVR, PSO-SVR and GA-SVR models, respectively. The optimized SVR models were applied to the validation samples, yielding predictive RMSEs ranging from 15.63 to 17.96 and MAPEs ranging from 0.39% to 0.47%. The results showed that the GRID-SVR, PSO-SVR and GA-SVR calibration models were able to predict the short-term trend of the opening price, and that GA-SVR gave the most accurate prediction. The modeling performance provides a theoretical and technical reference for investors in devising better trading strategies.
volume as four new indicators to identify the index-changing trend. SVR models were established from the 949 samples with 12 variables to predict the opening price of the next day. The CSI 300 index daily opening price is shown in Figure 1. The sample set was normalized to the range [0, 1] before the calibration models were established, and the SVR prediction results were converted back to the original scale.
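The derived indicators and the [0, 1] normalization can be sketched as follows. This is an illustrative reading of the pretreatment: the trailing-window definition of the 5-day and 20-day averages, the treatment of days with insufficient history, and all function names and values are assumptions rather than details confirmed by the study.

```python
def trailing_mean(values, window):
    """Mean of the most recent `window` values up to and including each day.
    Assumption: days without enough history are marked None."""
    out = []
    for i in range(len(values)):
        if i + 1 < window:
            out.append(None)
        else:
            out.append(sum(values[i + 1 - window:i + 1]) / window)
    return out

def min_max_fit(values):
    """Return the (min, max) pair used for scaling."""
    return min(values), max(values)

def min_max_scale(values, lo, hi):
    """Map values linearly to [0, 1]."""
    return [(v - lo) / (hi - lo) for v in values]

def min_max_inverse(scaled, lo, hi):
    """Map scaled values (e.g. model predictions) back to the original scale."""
    return [s * (hi - lo) + lo for s in scaled]

# Toy values only, not data from the study.
volume = [10.0, 20.0, 30.0, 40.0, 50.0, 60.0]
avg5 = trailing_mean(volume, 5)

opening = [2.0, 4.0, 6.0]
lo, hi = min_max_fit(opening)
scaled = min_max_scale(opening, lo, hi)
restored = min_max_inverse(scaled, lo, hi)
print(avg5, scaled, restored)
```

The same (min, max) pair must be kept and reused when converting predictions back, otherwise the recovered prices are on the wrong scale.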

Figure 1. The CSI 300 index daily opening price.
In the SVR formulation, the regression coefficients are in vector form, and $\xi_i$ and $\xi_i^*$ represent the relaxation factors, the penalty on which is controlled by C. The optimization formulation can be transformed into the dual problem, whose solution is expressed through the RBF kernel $K(x, x^*)$, where σ represents the kernel width. C and σ are tuned in the GRID-SVR, PSO-SVR and GA-SVR models, respectively, to test the predictive capabilities. The best parameter combination of (C, σ) is determined according to the minimum mean square error (MSE) under 5-fold cross-validation. The MSE is defined as follows,
$$\mathrm{MSE} = \frac{1}{n}\sum_{t=1}^{n}\left(y_t - \hat{y}_t\right)^2$$
where $y_t$ and $\hat{y}_t$ are the real and predicted opening prices and $n$ is the number of samples. In the GRID search process, C and σ were pre-set within a tuning range, and the candidate values constitute a two-dimensional dynamic network.
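For reference, a standard statement of the ε-insensitive SVR problem that is consistent with the symbols used here (C, $\xi_i$, $\xi_i^*$ and $K$) is sketched below. The symbols $w$, $b$, $\varepsilon$, $\varphi$, $\alpha_i$ and $\alpha_i^*$ are assumed notation, and the exact formulation used in this study may differ.

```latex
% Primal problem (standard epsilon-insensitive SVR; notation assumed)
\begin{aligned}
\min_{w,\,b,\,\xi,\,\xi^*}\quad & \frac{1}{2}\lVert w\rVert^{2} + C\sum_{i=1}^{n}\left(\xi_i + \xi_i^{*}\right) \\
\text{subject to}\quad & y_i - w^{\top}\varphi(x_i) - b \le \varepsilon + \xi_i, \\
& w^{\top}\varphi(x_i) + b - y_i \le \varepsilon + \xi_i^{*}, \\
& \xi_i \ge 0,\quad \xi_i^{*} \ge 0,\quad i = 1, \ldots, n.
\end{aligned}

% Regression function obtained from the dual problem
f(x) = \sum_{i=1}^{n}\left(\alpha_i - \alpha_i^{*}\right) K(x_i, x) + b,
\qquad 0 \le \alpha_i,\ \alpha_i^{*} \le C.
```

In this form, a larger C penalizes the relaxation factors more heavily, so the fit follows the training data more closely, while the kernel width σ enters only through $K$.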

Figure 2. The flow charts of the experimental process.

Figure 5. MSE of SVR optimization with GA evolution.

Figure 6. Comparison of the real opening price and the GA-SVR predicted opening price.

Table 1. Comparison of the predictive results for GRID-SVR, PSO-SVR and GA-SVR models.