Handwriting Classification Based on Support Vector Machine with Cross Validation

The support vector machine (SVM) is applied to a classification problem in this paper. The paper first discusses the basic principle of the SVM, and then SVM classifiers with the polynomial kernel and the Gaussian radial basis function (RBF) kernel are chosen to identify pupils who have difficulties in writing. The 10-fold cross-validation method for training and validating is introduced. The aim of this paper is to compare the performance of the SVM with RBF and polynomial kernels when classifying pupils with or without handwriting difficulties. Experimental results show that the SVM with the RBF kernel performs better than the one with the polynomial kernel.


Introduction
The field of handwriting has been of interest from a variety of aspects: its entity, indications and aesthetics. In the beginning, the development of handwriting and the factors that affect handwriting performance were investigated [1,2], but later whole words were addressed. Most of the systems reported in the literature to date involve screening measures for identifying pupils who are at risk of handwriting difficulties, and also address the absence of an appropriate tool for monitoring beginning handwriting development. More importantly, automated handwriting analysis has been given more attention in the search for quantitative features and key indicators for monitoring beginning handwriting skill development. Such automated handwriting analysis includes recognizing the writer (e.g. [3]), the text written (e.g. [4]), movement and procedure (e.g. [5,6]), or even the semantic content of the text (e.g. [7]). Each of these issues can be, and has been, investigated either offline or online, depending on the available data.
Up to sixty percent of a child's typical school day is allocated to fine motor activities, with writing being the predominant task during these periods [8]. These tasks all require the foundational skill of basic handwriting proficiency to allow teachers to accurately assess students' understanding and comprehension of instructional material. If students do not possess basic handwriting proficiency, it can limit their ability to successfully complete a majority of classroom tasks. In addition, it has been suggested that students with handwriting problems need to focus more attention on the physical process of writing, thus limiting their use of higher order cognitive skills, planning and generation of content [9]. Thus, handwriting proficiency is an important foundation upon which success with later writing tasks depends. Because so many everyday school tasks involve writing, unsuccessful mastery of handwriting skill can negatively influence later success in school.

Support Vector Machine
The Support Vector Machine (SVM) is a classification technique based on the statistical learning theory proposed by Vapnik in 1995 [10]. It can successfully address over-fitting and local optima, and is especially suitable for small-sample, high-dimensional nonlinear cases. Besides, it has already shown good results in medical diagnostics, optical character recognition, electric load forecasting and other fields.

Kernel Function
In general, the radial basis function (RBF) is one of the most popular kernels and a reasonable first choice. The reason is that this kernel nonlinearly maps the samples into a higher dimensional space, so, unlike the linear kernel, it can handle the case when the relation between class labels and attributes is nonlinear [11].
Given a linearly separable sample set $(x_i, y_i)$, $i = 1, \ldots, n$, consider the simplest case of two-class classification, where $x_i \in R^n$ and $y_i \in \{+1, -1\}$ is the class label. The common form of the linear decision function is:
$$f(x) = w \cdot x + b$$
Sometimes linear classifiers are not complex enough; therefore the SVM maps the data into a higher dimensional space. Formally, pre-process the data with:
$$x \mapsto \varphi(x)$$
and then learn the map from $\varphi(x)$ to $y$:
$$f(x) = w \cdot \varphi(x) + b$$
However, the dimensionality of $\varphi(x)$ can be very large, making $w$ hard to represent explicitly in memory and the problem hard to solve. The Representer theorem (Kimeldorf & Wahba, 1971) shows that (for SVMs as a special case):
$$w = \sum_{i=1}^{n} \alpha_i \varphi(x_i)$$
for some variables $\alpha$. Instead of optimizing $w$ directly we can thus optimize $\alpha$. The decision rule is now:
$$f(x) = \sum_{i=1}^{n} \alpha_i \, \varphi(x_i) \cdot \varphi(x) + b$$
If the dot product $\varphi(x_i) \cdot \varphi(x)$ is replaced by the kernel function $K(x, x_i)$, the optimal decision function is as follows:
$$f(x) = \sum_{i=1}^{n} \alpha_i K(x, x_i) + b$$
In this project, two common kernel functions are used. The first is the Gaussian radial basis function (RBF), with kernel width $\sigma$:
$$K(x, x') = \exp\left(-\frac{\|x - x'\|^2}{\sigma^2}\right)$$
and the other is the polynomial kernel of degree $d$:
$$K(x, x') = (x \cdot x' + 1)^d$$
Classical techniques utilizing radial basis functions employ some method of determining a subset of centres. Typically a method of clustering is first employed to select a subset of centres. An attractive feature of the SVM is that this selection is implicit, with each support vector contributing one local Gaussian function, centred at that data point.
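To make the two kernels and the kernelized decision rule concrete, the following is a minimal sketch in plain Python. It assumes the common forms exp(-g * ||x - z||^2) for the RBF kernel and (x . z + 1)^d for the polynomial kernel; writing the RBF width as a multiplier g (so that g plays the role of 1/sigma^2 for a width sigma) is an assumption made here to match the parameter names used in the Model Parameter Selection section. The function names and the numbers in the usage lines are hypothetical, not taken from the paper.

```python
import math


def rbf_kernel(x, z, g):
    """Gaussian RBF kernel, assumed form: exp(-g * ||x - z||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-g * sq_dist)


def poly_kernel(x, z, d):
    """Polynomial kernel, assumed form: (x . z + 1)^d."""
    dot = sum(a * b for a, b in zip(x, z))
    return (dot + 1) ** d


def decision_value(x, support_vectors, alphas, b, kernel):
    """Kernel decision function f(x) = sum_i alpha_i * K(x, x_i) + b."""
    return sum(a * kernel(x, sv) for a, sv in zip(alphas, support_vectors)) + b


# Hypothetical usage: two made-up support vectors with made-up coefficients.
svs = [[0.0, 0.0], [1.0, 1.0]]
alphas = [1.0, -1.0]
f = decision_value([1.0, 1.0], svs, alphas, 0.0,
                   lambda u, v: rbf_kernel(u, v, 0.1))
label = 1 if f >= 0 else -1  # the sign of f(x) gives the predicted class
```

Passing the kernel as a callable keeps the decision function unchanged when switching between the RBF and polynomial kernels, which mirrors how the comparison in this paper only swaps the kernel.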

Cross Validation (CV)
Cross-validation is widely used for estimating the performance of neural networks and of other models such as the support vector machine and the k-nearest neighbor classifier. It is a statistical method for evaluating and comparing learning algorithms. The basic idea of cross-validation is to split the available training data into two sets: the first set is used to train the model, while the other is used to evaluate the performance of the trained model. In typical cross-validation, the training and validation sets must cross over in successive rounds such that each data point has a chance of being validated against. The basic form of cross-validation is k-fold cross-validation. Other forms of cross-validation are special cases of k-fold cross-validation or involve repeated rounds of k-fold cross-validation.
The advantages of this method are as follows: 1) the average classification accuracy of the k SVM classifiers is used to evaluate the performance of the SVM classifier parameters, which can improve the generalization ability of the SVM classifier with the optimized parameters; 2) k-fold cross-validation ensures that all of the sample data are involved in both training and validation of the SVM classifier, so it makes full use of the limited sample data; 3) no matter how the data are divided, every data point is in a test set exactly once and in a training set k − 1 times. The disadvantage of this method is that the training algorithm has to be rerun from scratch k times, which means it takes k times as much computation to make an evaluation.
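The property in point 3) can be checked with a short sketch of k-fold index splitting. This is an illustrative plain-Python version, not the implementation used in the paper; the function name `k_fold_indices` is hypothetical, and shuffling or stratification of the samples (often done in practice) is deliberately left out. The sizes 120 and 10 follow the sample count and fold count given in the Methodology section.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs for k contiguous folds."""
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        # The first `extra` folds get one additional sample if k does not divide n.
        size = base + (1 if fold < extra else 0)
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size


# Count how often each sample index lands in a test set and in a training set.
n_samples, k = 120, 10
test_counts = [0] * n_samples
train_counts = [0] * n_samples
for train_idx, test_idx in k_fold_indices(n_samples, k):
    for i in test_idx:
        test_counts[i] += 1
    for i in train_idx:
        train_counts[i] += 1

assert all(c == 1 for c in test_counts)       # tested exactly once
assert all(c == k - 1 for c in train_counts)  # trained on k - 1 times
```

The loop also makes the stated disadvantage visible: the body runs k times, so a classifier trained inside it is retrained from scratch once per fold.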

Methodology
The data were obtained from Khalid et al. [13]. The data set is composed of 120 samples described by two features (the standard deviation of pen pressure when drawing RU, with p-value < 0.0001 and z-value = −4.319, and the ratio of time taken to draw HR and HL, with p-value < 0.0001 and z-value = −5.205) and two groups of writers: below-average printers (test group) and above-average printers (control group).
Firstly, the data are partitioned into k equally sized segments, or folds. In this project, we used 10-fold cross-validation (k = 10), as it is the most commonly used setting in data mining and machine learning. As shown in Figure 1, the darker sections of the data are used for training, while the remaining lighter sections are used to validate the model. This process is repeated 10 times until all sections have been validated.
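As a worked illustration of this set-up, assuming the 120 samples are divided evenly across the folds, each round validates on 12 samples and trains on the other 108:
$$\frac{120}{10} = 12, \qquad 120 - 12 = 108$$
Writing $a_j$ for the classification accuracy obtained on fold $j$ (a symbol introduced here for illustration only), the cross-validated accuracy is the average over the 10 rounds:
$$\mathrm{Acc}_{\mathrm{CV}} = \frac{1}{10} \sum_{j=1}^{10} a_j$$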

Model Parameter Selection
Two models, the SVM with the polynomial kernel function and the SVM with the RBF kernel, are chosen for performance comparison. The performance of the SVM depends on the choice of parameters, and the optimal selection of these parameters is a nontrivial issue. According to previous studies, the key step for the RBF kernel is finding the parameters C and g, while the SVM with the polynomial kernel requires choosing the parameters C and d. The penalty factor C is used to improve the generalization capability as C increases, while g and d are the adjustable kernel parameters of the learning machine in the experiment and are used to adjust the empirical error. The parameter slightly influences the classification result when a smaller number of training samples is used [12].
After training the SVM, the best values of C and g can be used to classify children with handwriting problems. The SVM with the polynomial kernel has two parameters, C and d, and the SVM with the RBF kernel also has two, g and C. To see how each parameter affects the output, we select three values for each parameter, much like choosing the number of hidden nodes in a neural network.
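Trying three values per parameter amounts to a small grid search, which can be sketched as follows. This is an illustrative outline only: `grid_search` and `toy_evaluate` are hypothetical names, the candidate values for C are assumptions (the paper does not list them here), and `toy_evaluate` is a made-up stand-in for "train the SVM on k − 1 folds, validate on the remaining fold, and average". Its scores are not results from the paper.

```python
from itertools import product


def grid_search(param_grid, evaluate):
    """Return (best_params, best_score) over every parameter combination.

    `evaluate(params)` is a caller-supplied function that returns a score
    to maximise, e.g. a mean cross-validated accuracy.
    """
    names = list(param_grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = evaluate(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


# Three candidate values per parameter; the C values are assumptions.
rbf_grid = {"C": [1, 10, 100], "g": [0.01, 0.1, 1]}


def toy_evaluate(params):
    # Made-up score surface that peaks at C = 10, g = 0.1.
    # A real run would train and cross-validate an SVM here instead.
    return -abs(params["g"] - 0.1) - abs(params["C"] - 10) / 100


best_params, best_score = grid_search(rbf_grid, toy_evaluate)
```

For the polynomial kernel the same function applies with a grid over C and d in place of C and g.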

Results and Discussion
Table 1 and Table 2 present the recognition results using the SVM with the polynomial kernel and the RBF kernel, respectively. A classification was considered correct if the output of the model agreed with the judgement of the teachers (using the Handwriting Proficiency Screening Questionnaire (HPSQ)). In this paper, we used the classification error (rejection of the genuine category) as the metric. According to Table 1, the percentage of correct predictions for feature 1 decreases when g varies from 0.01 to 0.1, and moves in the reverse direction when g varies from 0.1 to 1. The results indicate that the best value of g is near 0.1. When the penalty coefficient C is increased, the prediction accuracy decreases. Unlike feature 1, the percentage of correct predictions for feature 2 decreases as g varies from 0.01 to 1 and as C increases. On the other hand, the results in Table 2 differ from those in Table 1. When g increases from 0.01 to 1, the percentage of correct predictions for both feature 1 and feature 2 tends to decrease, whereas when d varies from 3 to 10, the prediction accuracy increases. This demonstrates the good generalization performance of the SVM.
The results reported here show that the SVM with the RBF kernel performs better than the SVM with the polynomial kernel. We use the SVM (RBF kernel) with varying C and g to simulate and classify children with and without handwriting problems based on drawing tasks.

Conclusions
SVMs with RBF and polynomial kernels have been used in this study to identify those who are at risk of handwriting difficulty due to the improper use of graphic rules. Cross-validation is adopted to choose the parameters in order to obtain better classification results. In this paper, we have shown that the SVM with the RBF kernel performs better than the one with the polynomial kernel. The experimental results indicate that the average test classification accuracy of the SVM with the RBF kernel exceeds 93%, which is noticeably higher than that of the SVM with the polynomial kernel.