A Comparative Study of Support Vector Machine and Artificial Neural Network for Option Price Prediction

Option pricing has become an important part of the financial market. Because the market is always dynamic, it is difficult to predict option prices accurately. For this reason, various machine learning techniques have been developed to predict the future trend of option prices. In this paper, we compare the effectiveness of Support Vector Machine (SVM) and Artificial Neural Network (ANN) models for option price prediction. Both models are trained and tested on a publicly available benchmark dataset, namely SPY option price-2015. The data, transformed through Principal Component Analysis (PCA), is used in both models to achieve better prediction accuracy. The entire dataset is partitioned into a training set (70%) and a test set (30%) to guard against overfitting. The outcomes of the SVM model are compared with those of the ANN model based on the root mean square error (RMSE). The experimental results demonstrate that the ANN model performs better than the SVM model, and that the predicted option prices are in good agreement with the corresponding actual option prices.


Introduction
Journal of Computer and Communications
The financial market may be regarded as the propellant of any country's economy. However, the relationship between the currency market and the country's economy is complicated. Identifying this relationship is one of the most important parts of any investment decision-making framework [1] [2] [3]. In this context, derivatives such as options have become a very significant part of the financial market over the past few decades. An option is a financial contract between two parties that gives the buyer (the owner or holder) of the option the right, but not the obligation, to buy or sell the underlying asset at an agreed price (the strike price) on or before the expiry time (maturity time) of the contract, depending on the form of the option. There are two fundamental types of options, namely the call option (an option to buy) and the put option (an option to sell).
The seller and the buyer can protect themselves against financial risk with the help of an option contract. For this reason, the problem of option price prediction has received considerable attention from the scientific community. Predicting option prices is important for understanding the future trends of the financial market. However, accurately forecasting option prices is a major challenge because they follow a complex pattern and show stochastic behavior. In addition, it has been pointed out that option values are dynamic, sophisticated and chaotic in nature [4]. Thus, the study of option price prediction is worthwhile.
Several researchers have attempted to predict option values by adopting both traditional and innovative techniques. Examples of traditional techniques include moving-average (MA), regression (R), auto-regression (AR), AR moving-average (ARMA), and AR integrated moving-average (ARIMA) models. These techniques operate on correlated data and require different assumptions for different parametric specifications; consequently, the quality of the prediction results degrades [5]. In addition, these models are not capable of handling non-stationary time series data. Thus, it is essential to develop updated models with higher capacities for the forecasting task. Soft computing techniques, which mimic biological processes, can be used in this regard. These techniques include Artificial Neural Network (ANN), Numerical Rationale (NR), Support Vector Machine (SVM), Molecule Swarm Improvement (MSI), etc. Among these, the ANN and SVM models have been widely used in a variety of fields of science and technology, including prediction problems.
The objective of this paper is to carefully examine, compare and analyze the performance of two highly promising and frequently used soft computing techniques, SVM and ANN, for predicting option prices. To this end, both techniques are first evaluated individually and the predicted results are compared with the actual results. Then, the results of the two techniques are compared with each other. The experimental outcomes indicate that the ANN model performs better than the SVM model for option price prediction. The rest of this paper is organized as follows. The next section (Section 2) reviews some related work. A brief introduction to the NN, ANN and SVM models is presented in Section 3. Section 4 contains the details of the dataset and the methodology. The experimental results and their discussion, with comparison, are reported in Section 5. The final section (Section 6) offers the conclusion of the paper.
[22]. Its performance was compared with SVM and Deep Belief Network (DBN), and it was found to give a better forecast than both SVM and DBN.
Lekhani showed experimentally that the ANN is a more accurate model than the Support Vector Regression (SVR) model for stock prediction [23]. Recently, Madhu et al. adopted various kernels in the SVR to predict option prices [24]. Their experiments illustrated that the SVR with a Gaussian kernel performs well compared to other kernel functions.

Methods of Study
In this section, we briefly introduce the methods considered in this study. The Neural Network (NN), the ANN in relation to the biological NN, and the SVM model are discussed in turn in the following subsections.

Neural Network
The term Neural Network (NN) refers to a logical model designed on the basis of the human brain. The human brain contains interconnected nerve cells called neurons. In fact, the human brain holds about 10 billion neurons and 60 trillion connections, called synapses, between them. In the model, a neuron consists of three modules: the summing function, the activation function, and the output. The term "Neural" comes from "neuron", the nerve cell, which is the basic functional unit of the human (animal) nervous system and exists in the brain and other parts of the human (animal) body. A typical neuron of the human brain has three main parts: the dendrites, the cell body and the axon.
There is also another important part called the synapse. The definition of each part of a neuron is given below:
In general, the NN is a highly interconnected network of billions of neurons with trillions of interconnections between them, which together run the human body. A typical neuron with its different parts is shown in Figure 1.
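The three modules of the neuron model named above (summing function, activation function, output) can be illustrated with a minimal Python sketch. The function name `neuron_output`, the hard-threshold activation, and the example weights and bias are assumptions chosen for illustration; they are not taken from the paper.

```python
def neuron_output(inputs, weights, bias, threshold=0.0):
    """Return 1 if the weighted sum plus bias exceeds the threshold, else 0."""
    # Summing function: weighted sum of the inputs plus a bias term.
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    # Activation function: a hard threshold (step) applied to the sum.
    # The returned value is the output of the neuron.
    return 1 if total > threshold else 0


# Hypothetical weights and bias, chosen only to show both outcomes.
print(neuron_output([1.0, 0.0], [0.6, 0.6], bias=-0.5))  # sum is positive, neuron fires
print(neuron_output([0.0, 0.0], [0.6, 0.6], bias=-0.5))  # sum is negative, neuron stays off
```

A step activation is used here only because it is the simplest choice; smooth activations are common when the network is trained by gradient-based methods.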

Artificial Neural Network with Biological Neural Network
The dendrites of the biological NN are analogous to the weighted inputs in the ANN, which are based on their synaptic interconnections. The cell body is analogous to the artificial neuron unit in the ANN, which comprises the summation and threshold units. The axon, which carries the output, is analogous to the output unit of the ANN. Therefore, the ANN model is worked
The ANN is a biologically inspired network of artificial neurons, implemented on a computer to perform tasks such as clustering, classification, and pattern detection. In fact, the architecture of the ANN is modeled on the way the neurons of the human brain work. The ANN contains nonlinear and non-parametric units that process information, knowledge, and instructions. It is a computational method inspired by the study of the brain and nervous system. The ANN follows the structure and operations of the three-dimensional lattice of the network among brain cells. The network learns gradually by adjusting the connections between the electronic neurons in its system.
The learning process of the network can be likened to the way a child learns to identify patterns, shapes and sounds, and to discern among them. For example, the child has to be exposed to a number of examples of a particular type of animal to be able to recognize that type of animal later on. In addition, the child has to be exposed to different types of animals to be able to differentiate among them. There are many different kinds of ANN architectures and several algorithms for network training. The choice of the ANN model depends on prior knowledge of the system to be modeled. A feed-forward neural network with one hidden layer is adopted in this study to forecast the option price in the stock market.
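As a rough illustration of a feed-forward network with one hidden layer, the following Python sketch computes a single forward pass. The layer sizes, the tanh activation, the linear output unit, the weight range, and the helper names `init_layer` and `forward` are all assumptions made for this example; the paper's actual configuration may differ, and training is not shown.

```python
import math
import random


def init_layer(n_in, n_out, rng):
    """Create random weights (n_out rows, n_in columns) and zero biases."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def forward(x, hidden, output):
    """Pass one input vector through a tanh hidden layer and a linear output unit."""
    w_h, b_h = hidden
    w_o, b_o = output
    # Hidden layer: weighted sum followed by a tanh activation for each unit.
    h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
         for row, b in zip(w_h, b_h)]
    # Output layer: a single linear unit, suitable for predicting a real value.
    return sum(w * hi for w, hi in zip(w_o[0], h)) + b_o[0]


rng = random.Random(0)            # fixed seed so the sketch is repeatable
hidden = init_layer(3, 4, rng)    # 3 hypothetical inputs, 4 hidden units
output = init_layer(4, 1, rng)    # 1 output: the predicted price
price = forward([0.2, -0.1, 0.5], hidden, output)
print(price)
```

With untrained random weights the printed value has no meaning as a price; the sketch only shows how data flows from the inputs through the hidden layer to the output.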

Support Vector Machine
The SVM was first introduced by Vladimir N. Vapnik and A. Y. Chervonenkis in 1963 [24]. It is a supervised learning classifier, also known as a support vector network. The SVM was originally designed for classification, regression and outlier detection; however, it has since been extended in other directions.
Indeed, it is a classifier derived from the theory of statistical learning based on
In the linear SVM model, each input data item is plotted as a point in n-dimensional space, where n is the number of input dimensions. The classification task is then accomplished by finding the hyperplane that separates the data into two classes. Figure 3 represents the SVM margin and hyperplane with the trained sample classes. Let us consider a linear classifier (or hyperplane) [24]:
$$w^T x + b = 0 \qquad (1)$$
In the above equation, $x$ represents the input feature vector of the classifier, $w$ indicates the weight vector, $w^T$ is the transpose of the weight vector, and $b$ determines the position of the hyperplane. For the linear Equation (1), the region between the hyperplanes $w^T x + b = 1$ and $w^T x + b = -1$ is the margin of this hyperplane. By using the formula for the distance between two parallel straight lines, we get the following margin:
$$\text{margin} = \frac{2}{\|w\|} \qquad (2)$$
On the other hand, the SVM model performs classification for nonlinear problems by adopting a kernel function. In this case, the original input vector is projected into a higher-dimensional feature space in a nonlinear manner. After transforming the data into the new higher-dimensional space, that space is searched for a linear separating hyperplane. To get a nonlinear SVM regression model, the
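The two ideas discussed in this subsection, the margin of a linear SVM and the kernel used for nonlinear problems, can be sketched in a few lines of Python. The sketch assumes the standard result that the distance between the hyperplanes $w^T x + b = 1$ and $w^T x + b = -1$ is $2 / \|w\|$, and it uses a Gaussian kernel as the example kernel. The function names and the value of `gamma` are illustrative assumptions, not settings from the paper.

```python
import math


def margin(w):
    """Distance between the hyperplanes w.x + b = 1 and w.x + b = -1."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return 2.0 / norm


def gaussian_kernel(x, z, gamma=0.5):
    """Gaussian (RBF) kernel exp(-gamma * ||x - z||^2); gamma is a free parameter."""
    sq_dist = sum((xi - zi) ** 2 for xi, zi in zip(x, z))
    return math.exp(-gamma * sq_dist)


print(margin([3.0, 4.0]))                        # ||w|| = 5, so the margin is 0.4
print(gaussian_kernel([1.0, 2.0], [1.0, 2.0]))   # identical points give 1.0
print(gaussian_kernel([0.0, 0.0], [3.0, 4.0]))   # distant points give a value near 0
```

A smaller weight norm gives a wider margin, which is why training a linear SVM minimizes the norm of $w$. The kernel value acts as a similarity score that replaces the inner product in the higher-dimensional feature space.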

Data and Methodology
To conduct experiments, we collect data from the Yahoo Finance community
$$E_i = M_i - P_i \qquad (3)$$
In the above equation, the parameter $E_i$ denotes the prediction error of an option for input $i$, $M_i$ denotes the market value of that option, and $P_i$ represents the predicted option price. We apply Equation (3) to calculate the prediction error for each option. Since the errors can be either positive or negative, we use the squares of these prediction errors. Therefore, the formula for computing the RMSE over the whole data is presented in Equation (4), where $n$ is the number of options.
$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} E_i^2} \qquad (4)$$
The purpose of this study is to identify the better learning method between ANN and SVM for option price prediction. For both models, the input dataset is divided into two parts: the training dataset and the testing dataset. The training dataset contains 70% of the data and the remaining 30% is used for testing. The dataset is transformed to obtain relevant attributes according to the input format of the ANN and the SVM. We tune each model by trial and error to obtain the minimum RMSE. To get the output, we use some input variables, which are listed in Table 2. The learning rate is also selected based on the smallest RMSE. To minimize the prediction errors, we initialize the weight vectors randomly for the input variables. In addition, we try different types of activation functions in the ANN model to minimize the prediction errors. The architecture for comparing the performance of the ANN and SVM models for predicting option prices is illustrated in Figure 4.
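The evaluation procedure described in this section, a 70/30 split followed by an RMSE computation, can be sketched as follows. The sketch assumes that the per-option error is the market value minus the predicted value and that the rows are shuffled before splitting; neither detail is confirmed by the text, and the helper names `train_test_split` and `rmse` are hypothetical.

```python
import math
import random


def train_test_split(rows, train_fraction=0.7, seed=0):
    """Shuffle the rows (an assumption) and split them into training and test lists."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


def rmse(market, predicted):
    """Root mean square of the errors E_i = M_i - P_i over all options."""
    errors = [m - p for m, p in zip(market, predicted)]
    # Squaring removes the sign, so positive and negative errors do not cancel.
    return math.sqrt(sum(e * e for e in errors) / len(errors))


train, test = train_test_split(range(10))
print(len(train), len(test))                    # 7 3

# Toy market and predicted prices, invented only to exercise the function.
print(rmse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0]))
```

Because the errors are squared, the sign convention chosen for the error does not affect the RMSE. For time-ordered option data a chronological split may be preferable to shuffling; the text does not say which was used.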

Experimental Results and Discussion
For training the ANN and SVM models, we partition the data into two parts, training data and testing data, by using cross-validation. Cross-validation is used since it has become standard procedure in practice. Using the optimized parameters (displayed in Table 3 and Table 4), both models are tested in the testing phase for forecasting option prices in the stock market, and the results are reported in Table 5. From Table 5, it can be seen that both models produce results close to the actual results. It can also be seen that the ANN model predicts the option price more accurately than the SVM model. In fact, the computed RMSE of the ANN model in this testing phase is 0.274418, which is about 33% lower than the 0.409254 of the SVM model. To facilitate observation, the comparison of the predicted option price with the actual option price for both the ANN and SVM models is illustrated graphically in the form of a scatter plot in Figure 5. From the scatter plot, one can readily see the superior performance of the ANN model over the SVM model in this specific prediction problem. Therefore, it can be concluded that the ANN model might be considered as an alternative to the SVM model, one that shows promising performance in predicting the option price.
Figure 5. Scatter plot between actual and predicted price for both SVM and ANN models.