Research and Implementation of a License Plate Recognition Algorithm Based on Hierarchical Classification

This paper proposes an improved method for license plate recognition based on hierarchical classification. First, a method of feature extraction and dimension reduction is presented, which finds the optimal wavelet packet basis during wavelet packet decomposition and then applies the K-L transform. Then the recognition algorithm based on feature extraction and hierarchical classification is introduced. Finally, the principles and procedures of using support vector machines, the Harris corner detection algorithm and digital character classification are explained in detail. Simulation results indicate that the presented algorithm recognizes characters with higher speed and efficiency.


Introduction
The license plate recognition (LPR) system is an important component of modern intelligent transportation systems. With its growing popularity, LPR has become an active topic in image processing, computer vision and pattern recognition. License plate recognition consists of image acquisition, digital image processing, license plate location and extraction, character segmentation and character recognition. Among these steps, the character recognition algorithm is the most significant.
Many LPR methods have been proposed so far, such as template matching, feature matching, artificial neural networks and support vector machines (SVM). Among these techniques, SVM is favored for its generalization ability and simple structure. SVM is also highly effective on problems with small samples, nonlinearity and high-dimensional spaces [1]. However, limited by low decision-making speed and classification accuracy, the conventional SVM method is inefficient. In addition, the more feature dimensions there are, the harder recognition becomes. Researchers have therefore continually improved speed and accuracy to optimize recognition performance. However, these methods rely on more complicated mathematical and structural models, which increases complexity. References [2] and [3] use hidden Markov models for recognition, but the probability model requires a great deal of prior knowledge. Reference [4] presents an SVM decision tree based on the prior knowledge that certain characters are used frequently. That system is relatively limited, and its recognition efficiency is unsatisfactory when the one-against-one scheme requires too many classifiers.
Based on these considerations, this paper proposes an improved license plate recognition algorithm based on hierarchical classification. Through the appropriate application of wavelet packet decomposition and SVM, the algorithm analyzes and allocates the layers of the recognition decision rationally. Test results show that the improved algorithm offers high speed and accuracy, and performs with high efficiency and robustness.

Feature Extraction
The distribution of a character's structure over the four subspaces of the wavelet decomposition mainly reflects the structural characteristics of Chinese characters, digits and letters, and can serve as the basis for license plate character recognition. As Figure 1(a) shows, conventional orthogonal wavelet decomposition decomposes only the low-frequency part of the signal. When all the wavelet coefficients are used as features, the decomposition has many dimensions and little flexibility. Figure 1(b) shows that wavelet packet decomposition also continues to decompose the high-frequency wavelet subspaces of the signal, and thus obtains abundant information in the time-frequency domain. First, to overcome the shortcomings of the wavelet transform, this paper uses wavelet packet decomposition, which has better time-frequency characteristics, for feature extraction. Then the optimal wavelet packet basis is found during the decomposition in order to reduce the dimension of the feature vectors. Finally, the coefficients of the optimal basis are further reduced in dimension by the K-L transform, which yields the features.

Principle of Wavelet Packet Transform
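From the perspective of multi-resolution analysis, the wavelet transform decomposes the space $L^2(R)$ into the direct sum of all the subspaces $W_j$ ($j \in Z$) according to the scale factor $j$, that is, $L^2(R) = \bigoplus_{j \in Z} W_j$. Wavelet packet decomposition further subdivides the subspaces $W_j$. With $\psi_0(t) = \varphi(t)$, the following recurrence relations hold:
$$\psi_{2n}(t) = \sqrt{2} \sum_{k} h_k \, \psi_n(2t - k), \qquad \psi_{2n+1}(t) = \sqrt{2} \sum_{k} g_k \, \psi_n(2t - k) \tag{1}$$
where $h_k$ and $g_k$ are the filter coefficients, and $\{\psi_n\}$ is the wavelet packet defined on the basis of the orthogonal scaling function $\varphi(t)$. Equation (1) shows that wavelet decomposition is a special case of wavelet packet decomposition.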
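The difference between the two decompositions can be illustrated with a minimal one-dimensional sketch. It assumes the Haar filters ($h = [1/\sqrt{2}, 1/\sqrt{2}]$, $g = [1/\sqrt{2}, -1/\sqrt{2}]$), which the paper does not specify, and works on a 1-D signal rather than on a character image; the function names are illustrative only.

```python
import math

def haar_split(signal):
    """Split an even-length signal into low- and high-frequency halves (Haar filters, assumed)."""
    s = 1 / math.sqrt(2)
    low = [(signal[i] + signal[i + 1]) * s for i in range(0, len(signal), 2)]
    high = [(signal[i] - signal[i + 1]) * s for i in range(0, len(signal), 2)]
    return low, high

def wavelet_leaves(signal, levels):
    """Conventional wavelet decomposition: only the low-frequency part is split again."""
    leaves = []
    current = list(signal)
    for _ in range(levels):
        current, high = haar_split(current)
        leaves.append(high)
    leaves.append(current)
    return leaves

def wavelet_packet(signal, levels):
    """Wavelet packet decomposition: every node, low or high, is split at each level."""
    nodes = [list(signal)]
    for _ in range(levels):
        next_nodes = []
        for node in nodes:
            low, high = haar_split(node)
            next_nodes.extend([low, high])
        nodes = next_nodes
    return nodes

signal = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
print(len(wavelet_leaves(signal, 3)), "subbands from the wavelet decomposition")
print(len(wavelet_packet(signal, 3)), "subbands from the wavelet packet decomposition")
```

After three levels the wavelet decomposition yields 4 subbands, while the wavelet packet decomposition yields 8 subbands of equal width, which is the finer high-frequency resolution the text refers to.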

Find the Optimal Wavelet Packet Basis
According to a given criterion, the optimal wavelet packet basis is selected from the collection of wavelet packet bases. The wavelet packet decomposition coefficients of the optimal basis are used as the identification features. The essential question is how to find the optimal wavelet packet basis.
According to [5], the basic criterion is as follows. The mean vectors of two categories $\omega_i$ and $\omega_j$ are $m_i$ and $m_j$, with the within-class scatter $S_\omega$.
Here $J_F$ is the criterion value, and $\alpha$ is defined as a ratio of criterion values: $\alpha > 1$ indicates that the child nodes are more useful for classification; otherwise the parent node is more useful.
A node meets the conditions of decomposition when $\alpha > 0$ and at least 2 of its child nodes have $\alpha > 1$. Combining these parameters and conditions, the feature extraction algorithm is as follows [6].
1) The license plate character image is normalized to a size of 32 × 16.
2) The character image is decomposed by wavelet packet decomposition into three levels, and $J_F$ is computed for each node.
3) Find the optimal wavelet packet basis Z. If the node $U_l^k$ does not meet the conditions of decomposition, its child node $U(n)$ is checked against the conditions. When node $U(n)$ does not meet the conditions, $U_l^k$ joins the optimal wavelet packet basis Z. If $U(n)$ meets the conditions, the nodes of $U(n)$ join Z and the node they replace is deleted.
4) Repeat step 3 until every node of Z has been checked. The optimal wavelet packet basis is then obtained, and its coefficients serve as the features of the character to be recognized.
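The pruning idea behind steps 3 and 4 can be sketched as follows. This is a simplified reading, not the authors' exact procedure: it assumes that $\alpha$ is the ratio of a child's criterion value to its parent's (both positive), and that a node is split when at least two children have $\alpha > 1$. The names `select_basis`, `criterion` and `children` are hypothetical.

```python
def select_basis(criterion, children, root, min_useful=2):
    """Return the nodes of a pruned wavelet packet tree.

    criterion: dict mapping node -> criterion value J_F (assumed positive)
    children:  dict mapping node -> list of child nodes (absent for leaves)
    A node is replaced by its children when at least `min_useful` children
    have alpha = J_F(child) / J_F(parent) > 1 (assumed reading of the rule).
    """
    basis = []
    pending = [root]
    while pending:
        node = pending.pop()
        kids = children.get(node, [])
        alphas = [criterion[kid] / criterion[node] for kid in kids]
        useful = sum(1 for alpha in alphas if alpha > 1)
        if kids and useful >= min_useful:
            pending.extend(kids)   # children are more useful: keep decomposing
        else:
            basis.append(node)     # parent is more useful: it joins the basis
    return sorted(basis)

# Toy tree: nodes are (level, index) pairs with made-up criterion values.
criterion = {
    (0, 0): 1.0,
    (1, 0): 2.0, (1, 1): 1.5, (1, 2): 0.5, (1, 3): 0.2,
    (2, 0): 1.0, (2, 1): 1.0, (2, 2): 3.0, (2, 3): 0.5,
}
children = {
    (0, 0): [(1, 0), (1, 1), (1, 2), (1, 3)],
    (1, 0): [(2, 0), (2, 1), (2, 2), (2, 3)],
}
print(select_basis(criterion, children, (0, 0)))
```

In the toy tree the root is split because two of its children score higher, while node (1, 0) is kept whole because only one of its children does.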

Use K-L Transform to Get Eigenvectors
The dimension of the coefficients of the optimal wavelet packet basis is large, which hinders improvements in recognition rate and speed. It is therefore necessary to reduce the dimension of the feature vectors and to optimize recognition by removing redundant information. The K-L transform is a method of principal component analysis. Its purpose is to remove the correlation between the data and to project the data from the original R-dimensional space to an M-dimensional space (R >> M) with minimum distortion under the mean square error criterion [7].
Assume that $X = \{x_1, x_2, \dots, x_M\}$ is a sample set (where $x_i \in R^N$), $M$ is the total number of training samples, and $N$ is the dimension of each training sample. A matrix $A$ is generated from the total scatter matrix of the training sample set $X$, as in Equation (3):
$$A = [x_1 - \bar{x}, \; x_2 - \bar{x}, \; \dots, \; x_M - \bar{x}], \qquad \bar{x} = \frac{1}{M} \sum_{i=1}^{M} x_i \tag{3}$$
The correlation matrix $R = AA^T$ is formed from the generated matrix $A$. The eigenvalues $\lambda_i$ ($1 \le i \le N$) of $R$ are computed and sorted, that is, $\lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_N$, with corresponding eigenvectors $u_i$ ($i = 1, 2, \dots, N$). Taking the $m$ largest eigenvalues, the K-L transform matrix $U$ consists of the corresponding $m$ eigenvectors, that is, $U = [u_1, u_2, \dots, u_m]$. The value of $m$ can be determined through the ratio of the sum of the $m$ largest eigenvalues to the sum of all eigenvalues, that is, $\theta = \sum_{i=1}^{m} \lambda_i / \sum_{i=1}^{N} \lambda_i$. $\theta$ should be as large as possible while $m$ is as small as possible, which preserves as much image information as possible with the smallest feature dimension. The variation of $\theta$ with $m$ obtained from simulation and experiment is shown in Figure 2. When $m$ is about 30, the feature vectors in the low-dimensional space reflect the characteristics of the original high-dimensional space, so this paper sets $m$ to 30 [8].
In summary, feature extraction consists of extracting the decomposition coefficients of the optimal wavelet packet basis through a three-level wavelet packet decomposition of the character to be identified. In this process, the added high-frequency information improves the recognition rate, and the K-L transform reduces the dimension to improve the recognition rate further, so that the overall recognition is optimized.
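The trade-off between $\theta$ and $m$ can be made concrete with a small sketch. It assumes the eigenvalues are already available and picks the smallest $m$ whose ratio $\theta$ reaches a chosen threshold; the threshold value and the function names are illustrative assumptions, since the paper selects $m = 30$ from the curve in Figure 2 rather than from a stated threshold.

```python
def smallest_m(eigenvalues, threshold):
    """Return (m, theta): the smallest m whose top-m eigenvalues reach the threshold ratio."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    running = 0.0
    for m, value in enumerate(ordered, start=1):
        running += value
        if running / total >= threshold:
            return m, running / total
    return len(ordered), 1.0

def kl_project(sample, mean, eigenvectors):
    """Project one sample onto the retained eigenvectors after removing the mean."""
    centered = [s - mu for s, mu in zip(sample, mean)]
    return [sum(u_i * c for u_i, c in zip(u, centered)) for u in eigenvectors]

# Made-up eigenvalues; 0.85 is a hypothetical threshold for theta.
m, theta = smallest_m([5.0, 3.0, 1.0, 0.5, 0.5], 0.85)
print(m, theta)
print(kl_project([3.0, 4.0], [1.0, 1.0], [[1.0, 0.0]]))
```

With these toy values three eigenvalues are enough, since the first two reach only 80% of the total.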

Recognition Algorithm Based on Hierarchical Classification
Figure 3 shows the recognition algorithm based on hierarchical classification, which mainly includes feature extraction and character recognition. During feature extraction, the preprocessed characters are decomposed by wavelet packet decomposition and the optimal wavelet packet basis is found. The wavelet coefficients of the optimal wavelet packet basis are then reduced in dimension by the K-L transform, which yields the eigenvectors to be recognized.

Initial Recognition Using SVM
As a recent branch of statistical learning theory, SVM is built on the principle of structural risk minimization, is particularly effective on problems with small samples, nonlinearity and high-dimensional spaces, and shows very promising prospects [9]. For linear problems, the kernel function is the dot product of two vectors. For nonlinear problems, SVM defines a nonlinear mapping that projects input vectors from a lower-dimensional space to a higher-dimensional one, and an optimal hyperplane is then constructed in this high-dimensional space. The optimal hyperplane not only separates the two classes correctly but also maximizes the margin between them. Maximizing the margin, a key concept of SVM, controls the generalization capability [10].
Constructing the optimal hyperplane amounts to solving a quadratic optimization problem under specific constraints. The optimal decision function is given in Equation (4) [11].
$$f(x) = \mathrm{sgn}\left( \sum_{i=1}^{L} a_i y_i K(x_i, x) + b \right) \tag{4}$$
where sgn(·) is the sign function, $L$ is the number of training samples, $a_i \ge 0$ is the Lagrange multiplier, $y_i$ is the class label of training sample $x_i$, $b$ is the bias, and $K(\cdot, \cdot)$ is a kernel function.

Choose a Kernel Function
In general, a kernel function is defined as $K(x, x') = \varphi(x) \cdot \varphi(x')$, where $x$ and $x'$ are vectors in the low-dimensional space, and $\varphi(x)$ and $\varphi(x')$ are the transformed vectors. The radial basis kernel function has strong locality, which is similar to the characteristics of human vision [12]. For these reasons, the radial basis function (RBF, also called the Gaussian kernel function) is chosen for training. The RBF can be written as:
$$K(x, x') = \exp\left(-\gamma \lVert x - x' \rVert^2\right), \quad \gamma > 0 \tag{5}$$
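How the decision function and the RBF kernel work together can be shown with a minimal two-class sketch. It assumes the common parameterization $K(x, x') = \exp(-\gamma \lVert x - x' \rVert^2)$ with an assumed width parameter `gamma`, and uses hand-picked support vectors, multipliers and bias instead of values obtained by training; all names are illustrative.

```python
import math

def rbf_kernel(x, z, gamma):
    """Gaussian RBF kernel exp(-gamma * ||x - z||^2); gamma is an assumed width parameter."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * squared_distance)

def svm_decide(x, support_vectors, labels, multipliers, bias, gamma):
    """Evaluate sgn(sum_i a_i * y_i * K(x_i, x) + b) for a two-class SVM."""
    total = sum(
        a * y * rbf_kernel(sv, x, gamma)
        for sv, y, a in zip(support_vectors, labels, multipliers)
    )
    return 1 if total + bias >= 0 else -1

# Hand-picked toy model (not trained): one support vector per class.
support_vectors = [(0.0, 0.0), (2.0, 2.0)]
labels = [1, -1]
multipliers = [1.0, 1.0]
print(svm_decide((0.1, 0.1), support_vectors, labels, multipliers, 0.0, 0.5))
print(svm_decide((1.9, 2.0), support_vectors, labels, multipliers, 0.0, 0.5))
```

Because the kernel decays quickly with distance, each test point is dominated by the nearest support vector, which is the locality property mentioned above.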

Train and Recognize Vectors Using LIBSVM
LIBSVM is an integrated software package developed by Professor Chih-Jen Lin from Taiwan. It is used in various fields such as pattern recognition, regression analysis and probability distribution estimation [13]. The package mainly consists of tools such as svm-scale, svm-train and svm-predict. Training and recognition based on LIBSVM involve the following steps [14].
1) Training samples of the characters are extracted and converted to the required format. To simplify calculation and to prevent any single feature from being too large or too small, svm-scale scales the data to a suitable range, generally [0, 1] or [-1, 1].
2) The radial basis function is used as the kernel function.
3) Cross-validation is used to train and test on the samples repeatedly and to find the optimal parameters. In k-fold cross-validation the data are divided into k sets; each set in turn is used for testing while the other k-1 sets are used for training, which gives a cross-validation result for each set of parameters (C and ξ). The set of parameters with the best result is selected as the optimal parameters.
4) The training set is trained with the optimal parameters and the RBF kernel to obtain the SVM model.
5) The license plate characters are recognized with this model.
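Steps 1 and 3 can be illustrated without LIBSVM itself. The sketch below is plain Python, not LIBSVM code: `scale_features` mimics the per-feature linear scaling that svm-scale performs, and `k_fold_indices` builds the train/test index splits of k-fold cross-validation. Both function names, and the choice to map a constant feature to the lower bound, are assumptions made for illustration.

```python
def scale_features(samples, lower=-1.0, upper=1.0):
    """Linearly scale every feature column to [lower, upper]."""
    columns = list(zip(*samples))
    minima = [min(column) for column in columns]
    maxima = [max(column) for column in columns]
    scaled = []
    for row in samples:
        new_row = []
        for value, lo, hi in zip(row, minima, maxima):
            if hi == lo:
                new_row.append(lower)  # constant feature: assumed to map to the lower bound
            else:
                new_row.append(lower + (upper - lower) * (value - lo) / (hi - lo))
        scaled.append(new_row)
    return scaled

def k_fold_indices(n_samples, k):
    """Return k (train_indices, test_indices) pairs; each fold is the test set once."""
    folds = [list(range(i, n_samples, k)) for i in range(k)]
    splits = []
    for i in range(k):
        train = sorted(idx for j, fold in enumerate(folds) if j != i for idx in fold)
        splits.append((train, folds[i]))
    return splits

print(scale_features([[0.0, 10.0], [5.0, 20.0], [10.0, 30.0]]))
for train, test in k_fold_indices(10, 5):
    print("train:", train, "test:", test)
```

In a parameter search, each candidate parameter set would be scored by averaging the test accuracy over the k splits, and the best-scoring set kept for the final training in step 4.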

Recognize Confusing Characters Using Harris corner Detection
Owing to similar outlines and interference from physical factors, confusing characters such as {2, Z}, {0, D, Q, C, U}, {8, B, R}, {5, 8, S}, {4, A} and {C, G} affect the accuracy of the final results. Further identification of similar characters is therefore a pivotal step for the recognition results. Because it is invariant to rotation and robust to illumination changes and noise, the Harris corner detection algorithm is a good choice for this further recognition of similar characters.
Harris corner detection was presented by C. G. Harris and M. J. Stephens in 1988 on the basis of the Moravec operator. Using first-order partial derivatives to describe the gray-level change, Harris corner detection defines a matrix M associated with the autocorrelation function:
$$M = \sum_{(x, y) \in W} \begin{bmatrix} \left(\frac{\partial I}{\partial X}\right)^2 & \frac{\partial I}{\partial X} \frac{\partial I}{\partial Y} \\ \frac{\partial I}{\partial X} \frac{\partial I}{\partial Y} & \left(\frac{\partial I}{\partial Y}\right)^2 \end{bmatrix}$$
where $W$ is a local window around the point under test, $I(x, y)$ is the brightness value expressed in gray scale, $\frac{\partial I}{\partial X}$ is the gradient of the image $I(x, y)$ in the X direction, and $\frac{\partial I}{\partial Y}$ is the gradient in the Y direction. The two eigenvalues of the matrix M, which describe the curvatures of the autocorrelation function, can be used to distinguish the different areas of the image. If both are large, the point is regarded as a corner; if one eigenvalue is large and the other is small, the point lies on an edge; if both are small, the point lies in a flat region [15]. On the basis of the matrix M, the corner response function (CRF) is defined as:
$$\mathrm{CRF} = \det(M) - k \cdot \mathrm{trace}^2(M)$$
where $k$ is set to 0.04, the optimal parameter [16], $\det(M)$ is the determinant of M, which is small at edges and large at corner points, and $\mathrm{trace}(M)$ is the trace of M, which is large at corner points. Therefore, a local maximum of the CRF is a corner point.
Simulation results show that Harris corner detection is very effective for recognizing similar characters. For instance, D, U and 0, which have approximately the same outline, are shown in Figure 4. The numbers and positions of the Harris corners clearly distinguish the characters from each other. Experiments on plates containing confusing characters show that Harris corner detection resolves similar characters well and effectively improves the recognition rate.
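The behaviour of the corner response function on corners, edges and flat regions can be checked with a small pure-Python sketch. It assumes central-difference gradients and an unweighted 3 × 3 summation window, which are simplifications chosen for illustration (a Gaussian window is common in practice); `harris_response` is a hypothetical helper name.

```python
def harris_response(image, k=0.04):
    """Compute det(M) - k * trace(M)^2 at every interior pixel of a grayscale image."""
    height, width = len(image), len(image[0])
    ix = [[0.0] * width for _ in range(height)]
    iy = [[0.0] * width for _ in range(height)]
    # Central-difference gradients (border pixels are left at zero).
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            ix[y][x] = (image[y][x + 1] - image[y][x - 1]) / 2
            iy[y][x] = (image[y + 1][x] - image[y - 1][x]) / 2
    response = [[0.0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            sxx = syy = sxy = 0.0
            # Sum the gradient products over an unweighted 3 x 3 window (assumed).
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    gx, gy = ix[y + dy][x + dx], iy[y + dy][x + dx]
                    sxx += gx * gx
                    syy += gy * gy
                    sxy += gx * gy
            det = sxx * syy - sxy * sxy
            trace = sxx + syy
            response[y][x] = det - k * trace * trace
    return response

# 9 x 9 test image: a bright block in the lower-right quadrant, corner at (4, 4).
image = [[1.0 if y >= 4 and x >= 4 else 0.0 for x in range(9)] for y in range(9)]
crf = harris_response(image)
print("corner:", crf[4][4], "edge:", crf[6][4], "flat:", crf[2][2])
```

The response is positive at the corner of the block, negative on its straight edge and zero in the flat background, matching the three cases described above.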

Digital Classification Algorithm
In China, a license plate contains Chinese characters, digits and letters. Character samples can accordingly be assigned to a Chinese network, a digital network, an alphabet network and an alphanumeric network. According to the "Vehicle License of the People's Republic of China" revised in 2007, the last five of the seven characters are called the serial number, which follows one of three encoding rules: a) every position of the serial number may use an Arabic numeral; b) any one position of the serial number may use a letter, but O and I cannot be used from the 26 letters; c) two letters are allowed in the serial number, and again O and I cannot be used.
Since the serial number contains no more than two letters, the network can be selected by counting the letters detected so far. The digital classification algorithm is presented in Figure 5.
When recognizing the last five characters, once two letters have been detected in the serial number, the remaining characters can be fed to the digital network instead of the alphanumeric network. For instance, the license plate shown in Figure 6 can be recognized with the digital character classification algorithm. A large number of experiments indicate that the digital character classification algorithm improves recognition speed; because the candidate set is restricted to some degree, recognition accuracy improves as well.

Experimental Results and Achievement of the System
In this experiment, the vehicle images are 24 true-color photographs of real scenes. The license plate characters comprise 10 digit samples, 24 letter samples and 55 Chinese character samples. The training set contains 120 letter samples, 32 digit samples and 166 Chinese character samples. Recognition uses the LIBSVM library with C-SVC as the model and RBF as the kernel function.
The number of cross-validation folds k is set to 5 and the penalty parameter C is set to 8. The recognition results and comparisons are given in Tables 1 and 2. The proposed algorithm achieves a higher recognition rate and speed than other identification methods such as artificial neural networks and the traditional support vector machine. License plate recognition based on hierarchical classification is simulated in Matlab 7.12 (R2011a), implemented in C with Visual C++ 6.0, built into a software platform with MFC, and ported to a license plate recognition system with the ARM9-S3C3440A as the master chip. The implementation of the system is shown in Figure 6.

Conclusion
This paper proposes a license plate recognition system based on an improved hierarchical classification algorithm that builds on wavelet packet decomposition and support vector machines. For the character recognition step, the improved algorithm provides an efficient and simple approach. Unlike other recognition algorithms, it uses a multi-level classification model. In feature extraction, the optimal wavelet packet basis is found during wavelet packet decomposition to obtain the eigenvectors, and the dimension of the eigenvectors is then reduced by the K-L transform to optimize performance. In recognition, the characters are first identified by SVM, and Harris corner detection is used to correct confusing characters. Finally, recognition speed is improved by the digital classification decision algorithm. Experimental results indicate that, compared with systems based on other recognition algorithms, the presented LPR system performs better and offers improved stability, robustness and efficiency.

Figure 1. Wavelet decomposition and wavelet packet decomposition. (a) Structure diagram of wavelet decomposition; (b) Structure diagram of wavelet packet decomposition in 3 levels.
decomposed by the wavelet packet basis into N levels. $U_l^k$ ($l = 1, \dots, N$; $k = 1, \dots, 2^N$) denotes the k-th node in layer $l$. The child node of


A correlation matrix $R$ can be expressed as $R = AA^T$ on the basis of the generated matrix $A$. The eigenvalues of $R$ can be expressed as $\lambda_i$ ($1 \le i \le N$).

Figure 3. License plate recognition algorithm based on hierarchical classification.
The eigenvectors are then recognized in stages. In the design of the character recognition classifier, support vector machines are used for the first identification. Next, confusing characters of similar shape are classified a second time with the Harris corner detection algorithm. Finally, all the digits and letters are classified according to the arrangement rules of the license plate characters. The improved LPR algorithm based on hierarchical classification effectively improves recognition accuracy and speed, and is particularly effective on confusing characters.

Figure 4. Harris corner diagram of D, U and 0.

Figure 6. Implementation of the license plate recognition system based on hierarchical classification. (a) Designed with MFC; (b) LPR system on ARM9.