Short-Term Financial Time Series Forecasting Integrating Principal Component Analysis and Independent Component Analysis with Support Vector Regression

Financial time series forecasting can benefit individual as well as institutional investors. However, the high noise and complexity of financial data make this task extremely challenging. Over the years, many researchers have used support vector regression (SVR) successfully to address this challenge. In this paper, an SVR-based forecasting model is proposed which first uses principal component analysis (PCA) to extract low-dimensional and efficient feature information, and then uses independent component analysis (ICA) to preprocess the extracted features to nullify the influence of noise in the features. Experiments were carried out on 16 years of historical data for three prominent stocks from three different sectors listed on the Dhaka Stock Exchange (DSE), Bangladesh. Predictions were made 1 to 4 days in advance, targeting short-term prediction. For comparison, the integration of PCA with SVR (PCA-SVR), ICA with SVR (ICA-SVR) and single SVR approaches were applied to evaluate the prediction accuracy of the proposed approach. Experimental results show that the proposed model (PCA-ICA-SVR) outperforms the PCA-SVR, ICA-SVR and single SVR methods.


Introduction
Financial time series forecasting has attracted considerable attention from both individual and institutional investors because accurate forecasts can influence investment decisions. This field is characterized by data intensity, noise, non-stationarity, unstructured nature, a high degree of uncertainty, and hidden relationships [1]. Capital market trends depend on many factors, including political events, general economic conditions, news related to the stocks and traders' expectations. Moreover, according to academic investigations, movements in market prices are not random. Rather, they behave in a highly non-linear, dynamic manner [2]. Therefore, predicting stock market prices is a challenging task.
Technical analysis is a popular approach to studying capital market patterns and movements. The results of technical analysis may be a short-term or long-term forecast based on recurring patterns; however, this approach assumes that stock prices move in trends, and that the information which affects prices enters the market over a finite period of time, not instantaneously [3]. The technical indicators used in this analysis are calculated from historical trading data. Researchers use various machine learning and artificial intelligence approaches to analyze these technical indicators to predict future trends or prices. The traditional statistical models include Box-Jenkins ARIMA [4]. Continuous research has introduced numerous approaches, including artificial neural networks (ANN), genetic algorithms, rough set (RS) theory, fuzzy logic and others [5] [6]. Most of these approaches suffer from problems such as over-fitting or under-fitting, the need to initialize a large number of control parameters, and difficulty in finding optimal solutions.
To resolve most of these shortcomings, support vector regression (SVR) has been widely used in various nonlinear regression tasks. This is largely because SVR uses the structural risk minimization principle for function estimation, whereas traditional methods implement the empirical risk minimization principle.
The first important step in developing an SVR-based forecasting model is feature extraction (transforming the original features into new ones) and feature selection (choosing the most influential set of features). Principal component analysis (PCA) is a widely applied feature extraction method in the framework of SVR [14] [15]. PCA transforms high-dimensional input vectors into uncorrelated principal components (PCs) by calculating the eigenvectors of the covariance matrix of the original inputs. In addition, the latent noise residing in financial time series data often leads to over-fitting or under-fitting and hence impairs the performance of the forecasting system. Lu et al. proposed the use of independent component analysis (ICA) (both linear and non-linear) with SVR to negate the influence of such noise in the data and improve forecasting accuracy [16] [17]. In both approaches, ICA was first used to extract the most influential components from the technical indicators, which were then fed to SVR for better prediction. ICA is a signal processing technique that was originally developed for blind source separation. It attempts to obtain statistically independent components (ICs) from the transformed vectors. Cao et al. showed that both PCA and ICA can improve the performance of SVR in time series forecasting [18], which motivated this work to adopt PCA and ICA with SVR for predicting future stock prices.
In this paper, an SVR-based forecasting model is developed that integrates both PCA and ICA to improve the prediction accuracy for stock prices, because even a small improvement in this performance can have a significant influence on investment decisions. Considering that technical analysis plays a vital role in forecasting, it is conducted to calculate technical indicators as the input features. Then PCA is used to extract the influential components from the input features, which are then filtered to transform the high-dimensional input into low-dimensional features. After that, ICA is applied to convert the reduced features into independent components. Finally, the SVR uses the filtered and transformed low-dimensional input variables to construct the forecasting model and predict stock prices 1 to 4 days in advance. The predictive performance of the proposed approach is compared with three traditional approaches: the integration of PCA with SVR (PCA-SVR), ICA with SVR (ICA-SVR) and single SVR.
The remainder of this paper is organized as follows. Section 2 provides a brief overview of the methodologies used in this study, namely PCA, ICA and SVR. Section 3 introduces the proposed method. Section 4 describes the research data. Section 5 reports the experimental results obtained from the study. Finally, Section 6 contains the concluding remarks.

Principal Component Analysis (PCA)
Principal component analysis (PCA), invented by Karl Pearson [19], is a well-known statistical procedure for feature extraction. It finds a smaller number of uncorrelated components from high-dimensional original inputs by calculating the eigenvectors of the covariance matrix. Given a set of m-dimensional input vectors $x_i$ ($i = 1, 2, \ldots, n$), centered so that $\sum_{i} x_i = 0$, PCA is a transformation of $x_i$ into a new vector $y_i$ by:

$$y_i = U^T x_i \qquad (1)$$

where U is the m × m orthogonal matrix whose jth column $u_j$ is the jth eigenvector of the sample covariance matrix $C = \frac{1}{n} \sum_{i=1}^{n} x_i x_i^T$. In other words, PCA solves the eigenvalue problem of Equation (2):

$$\lambda_j u_j = C u_j, \quad j = 1, 2, \ldots, m \qquad (2)$$

where $\lambda_j$ is one of the eigenvalues of C and $u_j$ is the corresponding eigenvector.
Based on the estimated $u_j$, the components of $y_i$ are calculated as the orthogonal transformation of $x_i$:

$$y_i(j) = u_j^T x_i, \quad j = 1, 2, \ldots, m \qquad (3)$$

The new components are called principal components. By using only the first several eigenvectors, sorted in descending order of the eigenvalues, the number of principal components in $y_i$ can be reduced [20]. Thus, PCA can be used to reduce dimensions where the principal components are uncorrelated and have sequentially maximum variances.
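The eigen-decomposition described above can be illustrated with a minimal NumPy sketch. It assumes the inputs are stored as rows of an n × m array and centers them before forming the covariance matrix; the function name `pca_transform` and the synthetic data are hypothetical and are not taken from the study.

```python
import numpy as np

def pca_transform(X, k):
    """Project the rows of X (n samples x m features) onto the top-k principal components."""
    Xc = X - X.mean(axis=0)                 # center each feature (zero-mean inputs)
    C = (Xc.T @ Xc) / Xc.shape[0]           # sample covariance matrix C (m x m)
    eigvals, eigvecs = np.linalg.eigh(C)    # eigh returns eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]       # re-sort in descending order of eigenvalue
    eigvals, U = eigvals[order], eigvecs[:, order]
    Y = Xc @ U[:, :k]                       # row i holds u_j^T x_i for j = 1..k
    return Y, eigvals

# Synthetic correlated data (illustration only, not the DSE data)
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5)) @ rng.normal(size=(5, 5))
Y, eigvals = pca_transform(X, k=2)
```

The retained components are uncorrelated, and their variances equal the two largest eigenvalues, which is the property the dimension reduction relies on.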

Independent Component Analysis (ICA)
ICA is a signal processing technique that recovers mutually independent but unknown source signals from their mixture without any prior knowledge of the mixing mechanism. Let $X = [x_1, x_2, \ldots, x_m]^T$ be an m × n matrix of observed mixture signals, where $x_i$ is the ith row. In the basic ICA model, X can be written as [21]:

$$X = AS = \sum_{i=1}^{m} a_i s_i \qquad (4)$$

where $a_i$ is the ith column of the m × m unknown mixing matrix A and $s_i$ is the ith row of the m × n source matrix S. ICA estimates an m × m un-mixing matrix W that transforms the observed mixture signals X into the independent signals Y:

$$Y = [y_i] = WX \qquad (5)$$

where $y_i$ is the ith row of the matrix Y, $i = 1, 2, \ldots, m$. The rows of Y are called the independent components (ICs) and are required to be as statistically independent as possible. Here, statistical independence means that the joint probability density of the components of Y is equal to the product of the marginal densities of the individual components. If the un-mixing matrix W is the inverse of the original mixing matrix A, i.e. $W = A^{-1}$, the latent source signals ($s_i$) can be estimated using the ICs ($y_i$). For the identification of Equation (5), one fundamental requirement is that all the ICs, with the possible exception of one component, must be non-Gaussian.
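The mixing and un-mixing relationship can be checked numerically with a small NumPy sketch. The mixing matrix A below is a made-up example, and W is simply taken as its exact inverse; in practice A is unknown and W has to be estimated, so this only illustrates the algebra, not an ICA algorithm.

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 3, 500

# Non-Gaussian source signals: row i of S is the source s_i (hypothetical data)
S = rng.uniform(-1.0, 1.0, size=(m, n))

# Hypothetical, well-conditioned mixing matrix A (unknown in a real application)
A = np.array([[1.0, 0.5, 0.2],
              [0.3, 1.0, 0.4],
              [0.2, 0.6, 1.0]])

X = A @ S                                                  # observed mixtures, X = AS
X_sum = sum(np.outer(A[:, i], S[i]) for i in range(m))    # same thing as sum_i a_i s_i

W = np.linalg.inv(A)                                       # ideal un-mixing matrix W = A^{-1}
Y = W @ X                                                  # recovered signals, Y = WX
```

With the exact inverse, Y reproduces S up to floating-point error; an estimated W generally recovers the sources only up to permutation and scaling.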
Several algorithms have been developed to perform ICA modeling [22] [23] [24] [25] [26]. The FastICA algorithm proposed by [27] is adopted in this research work, where mutual information is used as the criterion to estimate Y. Minimizing the mutual information between components implies maximizing their negentropy. The negentropy J(y) is always non-negative and is zero if and only if y has a Gaussian distribution. In the FastICA algorithm, the negentropy is approximated using the following contrast function:

$$J(y) \approx \left[ E\{G(y)\} - E\{G(v)\} \right]^2 \qquad (6)$$

where v is a standardized Gaussian variable and G is a non-quadratic function; a common choice is

$$G(y) = \exp(-y^2 / 2) \qquad (7)$$

Two preprocessing steps, centering and whitening, are applied to the input matrix X to simplify the FastICA algorithm [27]. First, X is made zero mean by subtracting its mean, i.e. $x_i \leftarrow x_i - E\{x_i\}$. The second step is to whiten X by passing it through a whitening matrix V, i.e. $Z = VX$. The rows of the whitened variable Z, denoted by z, are uncorrelated and have unit variance, i.e. $E\{zz^T\} = I$.
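The two preprocessing steps and the negentropy approximation can be sketched in NumPy as follows. The whitening matrix is built from the eigen-decomposition of the covariance matrix, which is one common construction and an assumption here; the contrast function G(y) = exp(−y²/2) is likewise assumed, and the function names and data are hypothetical.

```python
import numpy as np

def center_and_whiten(X):
    """X is m x n (rows are signals). Returns Z = V(X - mean) with E{zz^T} = I, and V."""
    Xc = X - X.mean(axis=1, keepdims=True)        # centering: subtract each row's mean
    cov = (Xc @ Xc.T) / Xc.shape[1]               # m x m covariance of the centered signals
    d, E = np.linalg.eigh(cov)                    # cov = E diag(d) E^T
    V = E @ np.diag(1.0 / np.sqrt(d)) @ E.T       # whitening matrix V = E D^{-1/2} E^T
    return V @ Xc, V

def negentropy_approx(y):
    """Approximate J(y) = (E{G(y)} - E{G(v)})^2 with the assumed G(u) = exp(-u^2 / 2).

    y must be standardized (zero mean, unit variance). For a standard Gaussian v,
    E{G(v)} equals 1 / sqrt(2) in closed form.
    """
    return (np.mean(np.exp(-y ** 2 / 2.0)) - 1.0 / np.sqrt(2.0)) ** 2

rng = np.random.default_rng(2)
X = rng.normal(size=(3, 3)) @ rng.uniform(-1.0, 1.0, size=(3, 1000))
Z, V = center_and_whiten(X)

u = rng.uniform(-1.0, 1.0, size=20000)
g = rng.normal(size=20000)
J_uniform = negentropy_approx((u - u.mean()) / u.std())   # clearly non-Gaussian signal
J_gauss = negentropy_approx((g - g.mean()) / g.std())     # close to zero
```

The uniform sample yields a visibly larger approximate negentropy than the Gaussian sample, which is the behavior the contrast function is meant to capture.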

Support Vector Regression (SVR)
The SVR extends the basic principles of Vapnik's support vector machines (SVM) [28] for classification by setting a margin of tolerance ε in the approximation: errors up to the threshold ε are counted as zero. Given a training set $\{(x_i, y_i)\}$, $i = 1, 2, \ldots, n$, where $x_i \in R^m$ is the m-dimensional input vector and $y_i \in R$ is the response variable, SVR generates the linear regression function in the form:

$$f(x, w) = w^T x + b \qquad (8)$$

Vapnik's linear ε-insensitive loss (error) function is:

$$|y - f(x, w)|_{\varepsilon} = \begin{cases} 0, & \text{if } |y - f(x, w)| \le \varepsilon \\ |y - f(x, w)| - \varepsilon, & \text{otherwise} \end{cases} \qquad (9)$$

Based on this, the linear regression $f(x, w)$ is estimated by simultaneously minimizing $\|w\|^2$ and the sum of the linear ε-insensitive losses, as shown in Equation (10). The constant C, which controls the trade-off between the approximation error and the weight vector norm $\|w\|$, is a design parameter chosen by the user.

$$R = \frac{1}{2} \|w\|^2 + C \sum_{i=1}^{n} |y_i - f(x_i, w)|_{\varepsilon} \qquad (10)$$

Minimizing the risk R is equivalent to minimizing the following risk under the constraints given in Equations (11)-(13):

$$R = \frac{1}{2} \|w\|^2 + C \sum_{i=1}^{n} (\xi_i + \xi_i^*)$$

$$y_i - w^T x_i - b \le \varepsilon + \xi_i \qquad (11)$$

$$w^T x_i + b - y_i \le \varepsilon + \xi_i^* \qquad (12)$$

$$\xi_i, \xi_i^* \ge 0, \quad i = 1, 2, \ldots, n \qquad (13)$$

Here, $\xi_i$ and $\xi_i^*$ are slack variables, one for exceeding the target value by more than ε and the other for falling more than ε below the target. As in SVM, the above constrained optimization problem is solved using Lagrangian theory and the Karush-Kuhn-Tucker conditions to obtain the desired weight vector of the regression function.
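As a small worked illustration of the ε-insensitive loss and the regularized risk described above, the following plain-Python sketch evaluates both for a fixed linear model. The function names and the toy numbers are hypothetical; the sketch only evaluates the objective and does not solve the optimization problem.

```python
def eps_insensitive_loss(y, f, eps):
    """Vapnik's linear eps-insensitive loss: zero inside the eps-tube, linear outside."""
    return max(0.0, abs(y - f) - eps)

def regularized_risk(w, b, X, y, C, eps):
    """R = 0.5 * ||w||^2 + C * sum of eps-insensitive losses for f(x) = w^T x + b."""
    norm_sq = sum(wj * wj for wj in w)
    total_loss = 0.0
    for xi, yi in zip(X, y):
        f = sum(wj * xj for wj, xj in zip(w, xi)) + b   # linear prediction
        total_loss += eps_insensitive_loss(yi, f, eps)
    return 0.5 * norm_sq + C * total_loss

# Toy example: the first point lies inside the tube, the second misses by 1.0
risk = regularized_risk(w=[2.0], b=0.0, X=[[1.0], [2.0]], y=[2.0, 5.0], C=10.0, eps=0.5)
```

Here the first point contributes no loss, the second contributes 1.0 − 0.5 = 0.5, and the risk is 0.5 · 4 + 10 · 0.5 = 7.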

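For non-linear regression, SVR maps the input vectors $x_i \in R^m$ into a high-dimensional feature space, where a kernel function $K(x_i, x_j)$ performs the mapping $\phi(\cdot)$ implicitly. The most popular kernel function, which is used in this study, is the radial basis function (RBF) shown in Equation (14):

$$K(x_i, x_j) = \exp\left(-\gamma \|x_i - x_j\|^2\right) \qquad (14)$$

where γ is the constant of the kernel function. The RBF kernel function parameter γ and the regularization constant C are the design parameters of SVR.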
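The RBF kernel can be written in a few lines of plain Python. This is a minimal sketch for illustration; the helper names and sample points are hypothetical.

```python
import math

def rbf_kernel(x, z, gamma):
    """K(x, z) = exp(-gamma * ||x - z||^2) for two equal-length vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)

def gram_matrix(points, gamma):
    """Kernel (Gram) matrix with entries K(x_i, x_j)."""
    return [[rbf_kernel(p, q, gamma) for q in points] for p in points]

points = [[0.0, 0.0], [1.0, 1.0], [3.0, 0.0]]
K = gram_matrix(points, gamma=0.5)
```

A larger γ makes the kernel decay faster with distance, so each training point influences a narrower neighborhood; this is why γ is tuned together with C.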

Proposed PCA-ICA-SVR Forecasting Model
The three-stage methodology, named PCA-ICA-SVR, proposed in this research is depicted in Figure 1. In the first stage, PCA was applied to the input data to extract features, which were then reduced to a low-dimensional feature space. Then ICA was applied to this reduced feature space to extract independent components. Finally, these independent components were used in the SVR to construct the forecasting model.
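The three stages can be sketched as a scikit-learn pipeline. This is an illustrative sketch under several assumptions, not the authors' implementation: the data are synthetic stand-ins for the technical indicators, and the values chosen for `n_components`, `C`, `gamma` and `epsilon` are placeholders rather than the tuned values from the study.

```python
import numpy as np
from sklearn.decomposition import PCA, FastICA
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVR

# Synthetic stand-in for 29 indicator features (NOT the DSE data)
rng = np.random.default_rng(0)
n_days, n_indicators, n_latent = 400, 29, 10
latent = rng.laplace(size=(n_days, n_latent))            # non-Gaussian hidden factors
X = latent @ rng.normal(size=(n_latent, n_indicators))
X += 0.05 * rng.normal(size=X.shape)                     # small observation noise
y = latent @ rng.normal(size=n_latent) + 0.1 * rng.normal(size=n_days)

# Chronological split: earlier 70% for training, later 30% for testing
split = int(0.7 * n_days)
X_train, X_test = X[:split], X[split:]
y_train, y_test = y[:split], y[split:]

model = make_pipeline(
    MinMaxScaler(),                           # scale every feature into [0, 1]
    PCA(n_components=10),                     # stage 1: keep the leading PCs
    FastICA(random_state=0, max_iter=1000),   # stage 2: PCs -> independent components
    SVR(kernel="rbf", C=10.0, gamma="scale", epsilon=0.1),   # stage 3: regression
)
model.fit(X_train, y_train)
pred = model.predict(X_test)
```

Wrapping the stages in one pipeline ensures that the scaler, PCA and ICA are fitted on the training portion only and then reused unchanged on the held-out portion.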
First, technical analysis is conducted on the dataset and 29 technical indicators (TIs) that are used by financial experts are calculated [3]. Some important technical indicators and their formulas are shown in Table 1. All values of these constructed features are scaled into the range [0, 1] to eliminate bias towards attributes with larger values. Then PCA is applied to the normalized data to extract the PCs containing the most influential information. These PCs are reduced to a low-dimensional feature set, ICA converts the reduced features into independent components, and the SVR uses them to construct the forecasting model. The prediction performance is evaluated using the error measures defined in Equations (15)-(18) [30], which include the mean absolute percentage error (MAPE) and the relative root mean squared error (rRMSE) reported in the experiments. These are measures of the deviation between actual and predicted prices. The smaller the values of these measures, the closer the predicted prices are to the actual prices. They can be used to evaluate the predictive performance of any forecasting model.
In Equations (15)-(18), $A_t$ is the actual value, $F_t$ represents the predicted value and n is the total number of data points.
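Two of the error measures used in the experiments, MAPE and rRMSE, can be computed as in the sketch below. The formulas are the commonly used definitions and are an assumption here, since the exact expressions of Equations (15)-(18) are not reproduced in this text; A_t and F_t follow the notation above, and the sample numbers are made up.

```python
import math

def mape(actual, forecast):
    """Mean absolute percentage error in percent (assumed definition)."""
    n = len(actual)
    return 100.0 / n * sum(abs(a - f) / abs(a) for a, f in zip(actual, forecast))

def rrmse(actual, forecast):
    """Relative root mean squared error in percent (assumed definition)."""
    n = len(actual)
    return 100.0 * math.sqrt(sum(((a - f) / a) ** 2 for a, f in zip(actual, forecast)) / n)

actual = [100.0, 200.0]      # hypothetical closing prices A_t
forecast = [110.0, 190.0]    # hypothetical predictions F_t
m = mape(actual, forecast)
r = rrmse(actual, forecast)
```

For these two points the relative errors are 10% and 5%, so the MAPE is 7.5%, while the rRMSE is slightly larger because it weights the bigger error more heavily.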

Research Data
To conduct the study and evaluate the performance of the proposed approach, 16 years of historical daily transaction data, covering January 2000 to December 2015, were collected from the Dhaka Stock Exchange, Bangladesh (http://www.dsebd.org/). The data cover 3600 trading days, and each record comprises five attributes: open price, high price, low price, close price and trade volume. We considered three companies from three different sectors, Square Pharmaceuticals Limited, AB Bank Limited and Bangladesh Lamps Limited, as these are the most prominent stocks in DSE. The daily closing prices of these companies are shown in Figures 2-4, respectively. The first is a leading company in the pharmaceuticals sector, the second leads the banking sector and the last belongs to the engineering sector. 70% of the sample points (around 2520 trading days) are used as the training sample and the remaining 30% (around 1080 trading days) are held out as the testing sample.

Experimental Results
The principal component analysis on the original data shows that the first 10 PCs [...]. In this study, the radial basis function (RBF) is used as the kernel function of SVR. To find the best (C, γ) pair, we considered the range $e^{-5}$ to $e^{10}$ for both parameters as the search space. For the data of Square Pharmaceuticals Limited, the coarse grid found the best (C, γ) to be ($e^{9}$, $e^{3}$), with a 5-fold cross-validation MAPE of 2.23%. A finer grid search in the neighborhood of ($e^{9}$, $e^{3}$) then produced a better cross-validation MAPE of 1.66% at ($e^{9}$, $e^{2.8}$). After the best (C, γ) is found, the whole training set is trained again to generate the final SVR model. The best (C, γ) pairs for every prediction task, i.e. those for which the grid search exhibits the minimum prediction error, are shown in Table 2.
Based on the findings in Table 15, the proposed PCA-ICA-SVR method outperforms the other three methods under all four relative ratios of training sample size for all three target stocks. We therefore conclude that the PCA-ICA-SVR approach produces less forecasting error than the other three approaches, which demonstrates the effectiveness of our proposal.
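The coarse-then-fine grid search over (C, γ) described in the experimental results can be sketched in plain Python. The coarse grid uses integer exponents from −5 to 10 as stated in the text; the fine step of 0.2 and the ±1 neighborhood are assumptions, and `toy_cv_error` is a hypothetical stand-in for the 5-fold cross-validation MAPE of an SVR, chosen only so that the example runs without data.

```python
import math

def exponent_grid(lo, hi, step):
    """Exponents lo, lo + step, ..., hi (inclusive)."""
    count = int(round((hi - lo) / step))
    return [round(lo + i * step, 10) for i in range(count + 1)]

def best_on_grid(cv_error, c_exps, g_exps):
    """Return (error, c_exp, g_exp) minimizing cv_error(C, gamma), C = e^c_exp, gamma = e^g_exp."""
    return min(
        (cv_error(math.exp(c), math.exp(g)), c, g)
        for c in c_exps
        for g in g_exps
    )

def coarse_to_fine(cv_error, lo=-5.0, hi=10.0, fine_step=0.2):
    """Coarse search on integer exponents, then a finer search around the coarse optimum."""
    coarse = exponent_grid(lo, hi, 1.0)
    _, c0, g0 = best_on_grid(cv_error, coarse, coarse)
    c_fine = [c for c in exponent_grid(c0 - 1.0, c0 + 1.0, fine_step) if lo <= c <= hi]
    g_fine = [g for g in exponent_grid(g0 - 1.0, g0 + 1.0, fine_step) if lo <= g <= hi]
    return best_on_grid(cv_error, c_fine, g_fine)

def toy_cv_error(C, gamma):
    """Hypothetical error surface with its minimum at (e^9, e^2.8); replace with real CV error."""
    return (math.log(C) - 9.0) ** 2 + (math.log(gamma) - 2.8) ** 2

err, c_exp, g_exp = coarse_to_fine(toy_cv_error)
```

On this toy surface the coarse pass lands on exponents (9, 3) and the fine pass refines γ to the exponent 2.8. With a real cross-validation error the surface is noisier, so the refinement may or may not improve on the coarse result.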

Figure 2. Closing prices of Square Pharmaceuticals Limited.

Figure 3. Closing prices of AB Bank Limited.

Figure 4. Closing prices of Bangladesh Lamps Limited.

Figure 5. Cumulative covariance of PCs for AB Bank Limited.
The goal of ICA is to estimate the un-mixing matrix W of size m × m that is used to transform the observed mixture signals X to yield the independent signals Y such that Y = WX.

Table 1. Important technical indicators and their formulas.
Stochastic %K: compares where a security's price closed relative to its price range over a given period.

Table 2. Grid search results for RBF kernel parameters.

Table 3. Prediction performance for 1 day ahead of Square Pharmaceuticals Limited.

Table 4. Prediction performance for 2 days ahead of Square Pharmaceuticals Limited.

Table 5. Prediction performance for 3 days ahead of Square Pharmaceuticals Limited.

Table 6. Prediction performance for 4 days ahead of Square Pharmaceuticals Limited.

Table 7. Prediction performance for 1 day ahead of AB Bank Limited.

Table 8. Prediction performance for 2 days ahead of AB Bank Limited.

Table 9. Prediction performance for 3 days ahead of AB Bank Limited.

Table 10. Prediction performance for 4 days ahead of AB Bank Limited.

Table 11. Prediction performance for 1 day ahead of Bangladesh Lamps Limited.

Table 12. Prediction performance for 2 days ahead of Bangladesh Lamps Limited.

Table 13. Prediction performance for 3 days ahead of Bangladesh Lamps Limited.

Table 14. Prediction performance for 4 days ahead of Bangladesh Lamps Limited.
To evaluate robustness, the four methods are compared in terms of MAPE (%) and rRMSE (%) with four relative ratios, 60%, 70%, 80% and 90%, of training sample size with respect to the complete dataset size. Predictions are made for the closing price of the target stock for the next trading day. Table 15 summarizes the prediction results for Square Pharmaceuticals Limited, AB Bank Limited and Bangladesh Lamps Limited.

Table 15. Robustness evaluation of PCA-ICA-SVR, ICA-SVR, PCA-SVR and single SVR with different relative ratios for Square Pharmaceuticals Limited, AB Bank Limited and Bangladesh Lamps Limited.

Conclusion
Experimental results show that the proposed PCA-ICA-SVR model outperforms all three other methods by generating less predictive error. The empirical results indicate that PCA and ICA, working together, can successfully unfold the influential information from the original data and uplift the performance of SVR in stock price forecasting. As the proposed model helps to predict stock prices with less error, investors can use it to increase profit or reduce loss in the stock market. The proposed approach can also be used in other domains, such as weather forecasting, energy consumption forecasting or GDP forecasting. Future research may integrate kernel PCA, non-linear ICA and other signal processing techniques, such as wavelet transformation, with SVR to further enhance forecasting performance. This study mainly focuses on short-term price prediction. Its applicability to long-term forecasting might be investigated in future, and appropriate methods could be integrated to enhance performance. Moreover, only price-related historical data is used here to predict future prices, although it is well known that various other aspects, such as general economic conditions, government policies, company performance and investors' interest, also play vital roles in the stock market. In future, these aspects can also be incorporated as input features, which may further improve prediction accuracy.