Genetic Algorithm Works for Vectoring Image Outlines of Generic Shapes

This work proposes a scheme that helps digitize hand-printed and electronic planar objects, that is, vectorize generic shapes. An evolutionary optimization technique, the Genetic Algorithm (GA), is used to solve the problem of curve fitting with a cubic spline function. The GA works well for finding the optimal values of the shape parameters in the description of the proposed cubic spline. The scheme comprises several phases: extracting data from the image outlines, detecting corner points, using the GA to find optimal values of the shape parameters, and fitting a cubic spline curve to the detected corner points.


Introduction
Fitting curves to data extracted from generic planar shapes is a problem that has been worked on extensively during the last two decades. It still attracts researchers because of its applications in diverse fields and its demand in industry. Vectorizing the outline of an image consists of several mathematical and computational phases, and aims to fit an optimal curve to the data extracted from the boundary of the image. Although many contributions in this area can be found in the literature, there is still room for further advances and for interactive approaches.
Least squares fitting is common in optimization problems in which splines and higher-order polynomials are used to approximate data. A cubic spline technique with least squares fitting is given in [1]. Squared distance minimization has been applied to B-spline curves in [2], using an iterative process to reach an optimal curve. Instead of the parametric form, the implicit form of the polynomial is also used for this purpose. Implicit B-spline curves [3] are used to solve the curve reconstruction problem by approximating point clouds, using the heuristic of a trust region algorithm. In [4,5], schemes were proposed for fitting implicitly defined algebraic spline curves and surfaces to scattered data by simultaneously approximating points and associated normal vectors.
In this paper, a soft computing technique, the Genetic Algorithm (GA) [6], is proposed to find optimal spline curves for the data extracted from the boundaries of generic images. This evolutionary technique incorporates the corner points from the outline of the input image. The detection of corner points is significant because it reduces the time needed to fit the desired curve to the outline of the image. Curve fitting in this scheme is done with a cubic spline that contains two shape parameters in its description. The basic target is to find those values of the parameters which assure minimum error between the detected boundary of the image and the fitted spline curve.
The paper is organized as follows. The first and second steps of the proposed scheme (outline estimation and corner detection) are described in Section 2. A generalized cubic spline curve scheme is given in Section 3. The Genetic Algorithm is explained in Section 4. Section 5 discusses the proposed scheme, which is demonstrated with examples in Section 6. Finally, the paper is concluded in Section 7.

Contour Extraction and Segmentation
The first step in the proposed scheme for vectorizing planar objects is to extract data from the boundary of the bitmap image or generic shape. A bitmap image of the generic shape is used as input. The image can be produced with software such as Paint or Adobe Photoshop, or obtained in some other appropriate way. Once the bitmap image is saved, the chain code method [7,8] is used to extract the boundary of the image. Chain codes encode the direction of each step along the boundary and so yield the geometric data of the outline.
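To make the role of chain codes concrete, the following minimal sketch decodes an 8-direction Freeman chain code into boundary coordinates. The direction numbering (0 = east, increasing counter-clockwise) and the names `FREEMAN_STEPS` and `decode_chain_code` are assumptions chosen for illustration; they are not taken from [7,8].

```python
# 8-connected Freeman chain code: direction k -> (dx, dy) step.
# Assumed convention: 0 = east, numbering increases counter-clockwise.
FREEMAN_STEPS = [(1, 0), (1, 1), (0, 1), (-1, 1),
                 (-1, 0), (-1, -1), (0, -1), (1, -1)]


def decode_chain_code(start, codes):
    """Turn a start pixel and a list of direction codes into boundary points."""
    x, y = start
    points = [(x, y)]
    for c in codes:
        dx, dy = FREEMAN_STEPS[c]
        x, y = x + dx, y + dy
        points.append((x, y))
    return points


# A unit square traced east, north, west, south returns to its start pixel.
square = decode_chain_code((0, 0), [0, 2, 4, 6])
```

A closed outline is recognized by the decoded path returning to its start pixel, as the square does here.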
In the next step, the data extracted from the outline is subdivided into smaller segments for curve fitting. For this purpose, corner points (significant points) are detected. Detecting these points is not an easy task: the exactness of detected corners can only be judged by the human eye, and no other standard criterion exists. The accuracy of a corner detection scheme can therefore only be examined if the original corner positions are known. In general, corner detection can be defined as an approach which extracts the dominant features of an image and consequently helps in deducing the contents of the image. Many corner detection schemes can be found in the literature [9-12]. In this paper, the scheme presented in [10] is used to divide the boundary into smaller segments. Each segment of the boundary consists of two consecutive corner points and the data points between them. These corner points are used for curve fitting.

Generalized Cubic Spline
Finding the corner points subdivides the data obtained from the boundary of the bitmap image into pieces. Each piece consists of two successive corner points and the data points between them. Thus, if there are $m$ corner points $F_1, \ldots, F_m$, there are $m$ pieces $P_1, \ldots, P_m$. Each piece is treated separately and a spline is fitted to it.
The first piece consists of all contour points between $F_1$ and $F_2$ inclusive. The second piece contains all contour points between $F_2$ and $F_3$ inclusive. In general, the $i$th piece contains all the data points between $F_i$ and $F_{i+1}$ inclusive, and the $m$th piece, which closes the outline, includes all contour points between $F_m$ and $F_1$ inclusive.
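The subdivision into pieces can be sketched as follows, assuming the contour is stored as a list of points in traversal order and the corners are given by their indices in that list. The function name `split_at_corners` and this data layout are illustrative assumptions.

```python
def split_at_corners(contour, corner_idx):
    """Split a closed contour into pieces between consecutive corners.

    contour    : list of boundary points in traversal order
    corner_idx : sorted indices of the corner points within contour
    Piece i runs from corner i to corner i+1 inclusive; the last piece
    wraps around from the final corner back to the first one.
    """
    pieces = []
    for k, start in enumerate(corner_idx):
        end = corner_idx[(k + 1) % len(corner_idx)]
        if end > start:
            piece = contour[start:end + 1]
        else:  # wrap past the end of the list to close the outline
            piece = contour[start:] + contour[:end + 1]
        pieces.append(piece)
    return pieces


# Eight labelled contour points with corners at A, D and G.
contour = list("ABCDEFGH")
pieces = split_at_corners(contour, [0, 3, 6])
```

Each corner appears twice, once as the end of one piece and once as the start of the next, which matches the "inclusive" wording above.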

Cubic Spline Interpolant
As a curve fitting technique, the algorithm proposed in Section 5 makes use of a generalized cubic spline method. This spline embodies a number of desirable features needed for an optimum solution. The curve-fitting method employed here determines good values of the shape parameters in the description of the cubic spline.
A cubic spline function [13] is used for fitting curves at corner points. Let $F_i$, $F_{i+1}$, $i \in \mathbb{Z}$, be the two corner points of the $i$th piece, and let $D_i$ and $D_{i+1}$ be the corresponding tangents at the corner points. The cubic function (1) is defined on the $i$th piece from these corner points and tangents, with $v_i$ and $w_i$ as shape parameters.
Equation (1) can be rewritten in terms of functions $R_{j,i}$, $j = 0, 1, 2, 3$, which are Bernstein-Bézier-like basis functions. Figure 1 shows curves fitted to given data by the cubic function (1) for different values of the parameters $v_i$ and $w_i$, and the effect of these values on the shape of the curve. Figure 1(a) shows the cubic curve (1) fitted to the data for one such choice of parameter values.

Parameterization
A number of parameterization techniques can be found in the literature, for instance uniform parameterization, linear or chord length parameterization, parabolic parameterization and cubic parameterization. In this paper, chord length parameterization is used to estimate the parametric value $t$ associated with each point. The resulting $t_i$ is in normalized form and varies from 0 to 1. Consequently, in our case, $h_i$ is always equal to 1.
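As an illustration, the standard normalized chord length parameterization sets $t_0 = 0$ and increases $t$ at each point by the length of the preceding chord divided by the total polyline length, so that the last value is 1. The sketch below implements this textbook rule; the function name `chord_length_params` is an assumption, and the indexing may differ from the formula used in the paper.

```python
import math


def chord_length_params(points):
    """Normalized chord-length parameters t_0 = 0 <= ... <= t_n = 1."""
    # Cumulative polyline length up to each point.
    cum = [0.0]
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        cum.append(cum[-1] + math.hypot(x1 - x0, y1 - y0))
    total = cum[-1]
    if total == 0.0:
        raise ValueError("all points coincide; parameterization is undefined")
    return [c / total for c in cum]


# Chords of length 5 and 6 give parameters 0, 5/11 and 1.
t = chord_length_params([(0, 0), (3, 4), (3, 10)])
```

Because the parameters of each piece run from 0 to 1, the interval length $h_i$ is 1, in agreement with the remark above.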

Estimation of Tangent Vectors
A distance-based choice of the tangent vectors $D_i$ at the points $F_i$ is used, with open curves treated as a separate case.
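One common distance-based rule, given here only as an assumed illustration and not necessarily the formula used in this paper, blends the two chords meeting at a corner so that the shorter chord receives the larger weight. The sketch treats the corners as a closed outline; the name `distance_based_tangents` and the weighting rule are assumptions.

```python
import math


def distance_based_tangents(corners):
    """Distance-weighted tangent at each corner of a closed outline.

    Assumed rule: D_i = a * (F_i - F_{i-1}) + (1 - a) * (F_{i+1} - F_i),
    with a = |F_{i+1} - F_i| / (|F_{i+1} - F_i| + |F_i - F_{i-1}|).
    """
    m = len(corners)
    tangents = []
    for i in range(m):
        px, py = corners[i - 1]          # previous corner (wraps around)
        cx, cy = corners[i]
        nx, ny = corners[(i + 1) % m]    # next corner (wraps around)
        back = (cx - px, cy - py)
        fwd = (nx - cx, ny - cy)
        lb, lf = math.hypot(*back), math.hypot(*fwd)
        if lb + lf == 0.0:               # degenerate: coincident corners
            tangents.append((0.0, 0.0))
            continue
        a = lf / (lb + lf)               # the shorter chord gets the larger weight
        tangents.append((a * back[0] + (1 - a) * fwd[0],
                         a * back[1] + (1 - a) * fwd[1]))
    return tangents


# For a square, each tangent is the average of the two adjacent edges.
tangents = distance_based_tangents([(0, 0), (2, 0), (2, 2), (0, 2)])
```

For an open curve the two end tangents would need a separate one-sided rule, which this sketch does not cover.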

Genetic Algorithm
Genetic Algorithms (GAs) are evolution-based search techniques. In a GA, every solution in a given well-defined search space is represented by a bit string, called a chromosome. Selection, crossover and mutation are the three operators used in a genetic algorithm. A GA iteratively creates a population of chromosomes and attempts to improve their quality.
A GA allows a population composed of many individuals to evolve under specified selection rules to a state that maximizes the "fitness" (i.e., minimizes the cost function). A set of input variables, in the form of a chromosome, represents a solution in a well-defined search space. A cost function, which may be a game, an experiment or a mathematical function, is used to generate an output from the chromosome; the GA searches for the chromosome whose output is optimal.
The GA begins by defining a chromosome, that is, an array of variable values to be optimized. The variable values are represented in binary form, so the binary GA works with bits. However, the cost function normally needs continuous variables in its description. Therefore, the chromosome is decoded whenever the cost function is evaluated.
How a chromosome is encoded in binary form is shown in Figure 2.
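The decoding step can be sketched as follows, assuming each variable occupies a fixed number of bits and is mapped linearly onto a real interval. The gene width, the bounds $[0, 3]$ and the function names are illustrative assumptions rather than the settings used in this paper.

```python
def decode_gene(bits, lo, hi):
    """Map a bit string such as '1010' linearly to a real value in [lo, hi]."""
    level = int(bits, 2)
    return lo + (hi - lo) * level / (2 ** len(bits) - 1)


def decode_chromosome(chromosome, n_bits, bounds):
    """Split a chromosome into equal-width genes and decode each one."""
    genes = [chromosome[k:k + n_bits]
             for k in range(0, len(chromosome), n_bits)]
    return [decode_gene(g, lo, hi) for g, (lo, hi) in zip(genes, bounds)]


# Two 4-bit genes standing for the shape parameters v and w.
v, w = decode_chromosome("0000" + "1111", 4, [(0.0, 3.0), (0.0, 3.0)])
```

With 4 bits per gene each parameter can take only 16 distinct values, so the gene width controls the resolution of the search.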
The GA starts with a group of chromosomes known as the population. Next, the variables are passed to the cost function. Thresholding is another approach to the process of natural selection. In this approach, all chromosomes whose cost is lower than some threshold survive. The threshold must allow some chromosomes to continue so that there are parents to produce offspring; otherwise, a whole new population must be generated to find chromosomes that pass the test. At first, only a few chromosomes may survive. In later generations, however, most of the chromosomes will survive unless the threshold is changed.
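Thresholding as described here amounts to a simple filter on the cost values. The following sketch assumes the costs have already been evaluated; the function name `threshold_select` and the sample values are hypothetical.

```python
def threshold_select(population, costs, threshold):
    """Keep the chromosomes whose cost is below the threshold."""
    return [c for c, cost in zip(population, costs) if cost < threshold]


# Only the second and third chromosomes have a cost below 2.0.
survivors = threshold_select(["0011", "1100", "1010"], [4.0, 0.5, 1.5], 2.0)
```

If the returned list is empty, no parents are available, which is the case in which a new population has to be generated.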
In the process of matchmaking, two chromosomes are selected from the mating pool of surviving chromosomes to produce two new offspring. There are several schemes for parent selection, such as roulette wheel selection, tournament selection and random pairing. The next step after selecting parents is mating to create one or more offspring.
The crossover operator is a commonly used form of mating. It takes two parents and produces two offspring. A crossover point is randomly selected between the first and the last bits of the parents' chromosomes. The first parent passes its binary code to the left of the crossover point to the first offspring; in the same way, the second parent passes its binary code to the left of the crossover point to the second offspring. The binary code to the right of the crossover point of the first parent then goes to the second offspring, and the second parent passes its right-side code to the first offspring. As a result, each offspring contains parts of both parents. The crossover operator is demonstrated in Figure 3.
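The single-point crossover just described can be sketched on bit strings as follows. The function name `single_point_crossover` and the optional `rng` argument are assumptions added for illustration and reproducibility.

```python
import random


def single_point_crossover(parent1, parent2, rng=random):
    """Swap the tails of two equal-length bit strings at a random point."""
    # The point lies strictly inside the string, so both sides are non-empty.
    point = rng.randint(1, len(parent1) - 1)
    child1 = parent1[:point] + parent2[point:]
    child2 = parent2[:point] + parent1[point:]
    return child1, child2, point


child1, child2, point = single_point_crossover(
    "11111111", "00000000", random.Random(1))
```

Using all-ones and all-zeros parents makes the crossover point visible directly in the offspring.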
Another way of creating new chromosomes is mutation, which can introduce traits that are not present in the original population. A single-point mutation, which changes a 1 to a 0 or vice versa, is shown in Figure 4.
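A single-point mutation can be sketched as flipping one randomly chosen bit. The function name `single_point_mutation` and the `rng` argument are illustrative assumptions.

```python
import random


def single_point_mutation(chromosome, rng=random):
    """Flip one randomly chosen bit of a bit string."""
    pos = rng.randrange(len(chromosome))
    flipped = "1" if chromosome[pos] == "0" else "0"
    return chromosome[:pos] + flipped + chromosome[pos + 1:], pos


mutated, pos = single_point_mutation("10100110", random.Random(3))
```

The mutated chromosome differs from the original in exactly one position, so the operator makes a small local change in the search space.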
The GA process described above is repeated until a satisfactory solution to the problem is reached. A flowchart of the GA is shown in Figure 5.
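Putting the operators together, a minimal binary GA loop might look like the sketch below. It is an assumed illustration only: it keeps the better half of the population, uses random pairing, and stops after a fixed number of generations, and none of these settings or names are taken from this paper.

```python
import random


def run_binary_ga(cost, n_bits, pop_size=20, generations=30,
                  mutation_rate=0.05, seed=0):
    """Minimize cost(bit_string) with selection, crossover and mutation."""
    rng = random.Random(seed)
    pop = ["".join(rng.choice("01") for _ in range(n_bits))
           for _ in range(pop_size)]
    history = []                               # best cost per generation
    for _ in range(generations):
        pop.sort(key=cost)                     # lowest cost first
        history.append(cost(pop[0]))
        survivors = pop[:pop_size // 2]        # natural selection: keep best half
        children = []
        while len(survivors) + len(children) < pop_size:
            p1, p2 = rng.sample(survivors, 2)  # random pairing
            point = rng.randint(1, n_bits - 1)
            child = p1[:point] + p2[point:]    # single-point crossover
            child = "".join(                   # bitwise mutation
                ("1" if b == "0" else "0") if rng.random() < mutation_rate else b
                for b in child)
            children.append(child)
        pop = survivors + children
    return min(pop, key=cost), history


# Toy cost: the number of 1 bits, minimized by the all-zeros string.
count_ones = lambda c: c.count("1")
best, history = run_binary_ga(count_ones, 12)
```

Because the surviving parents are carried over unchanged, the best cost recorded in `history` never increases from one generation to the next.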

Proposed Approach
In this section, the proposed scheme for the curve fitting problem is described. It includes the phases of mapping the problem to the Genetic Algorithm using the cubic spline function, the description of the parameters used for the GA, and curve fitting.

Problem Mapping
In this section, the Genetic Algorithm formulation of the problem discussed in this paper is described in detail.
Suppose that, for $i = 0, 1, \ldots, n-1$, the data segments $P_{i,j} = (X_{i,j}, Y_{i,j})$, $j = 1, 2, \ldots, m$, are given as ordered sets of the universal set of data points. Then the squared sums $S_i$ of the distances between the $P_{i,j}$ and their corresponding parametric points $P_i(u_{i,j})$ on the curve are determined as

$$S_i = \sum_{j=1}^{m} \left\| P_i(u_{i,j}) - P_{i,j} \right\|^2, \qquad i = 0, 1, \ldots, n-1,$$

where the $u_{i,j}$ are obtained by chord length parameterization. For the best fit of the curve to the given data, values of the parameters $v_i$ and $w_i$ are required such that the sums $S_i$ are minimal. The Genetic Algorithm is used to optimize these values for the fitted curve. We start with an initial population of values of $v_i$ and $w_i$ chosen randomly. Successive application of search operations to this population leads to optimal values of $v_i$ and $w_i$. Table 1 gives the number of contour points and initial corner points of the images.
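The cost that the GA minimizes for one segment can be sketched as a sum of squared distances between the data points and the curve evaluated at their parameter values. In the sketch below the curve is passed in as a callable, and a straight line stands in for the fitted cubic; the names `squared_distance_sum` and `line` are assumptions.

```python
def squared_distance_sum(curve, params, data):
    """S = sum over j of |curve(u_j) - P_j|^2 for one segment."""
    total = 0.0
    for u, (x, y) in zip(params, data):
        cx, cy = curve(u)
        total += (cx - x) ** 2 + (cy - y) ** 2
    return total


# Stand-in curve: the straight line from (0, 0) to (2, 0).
line = lambda u: (2.0 * u, 0.0)
S = squared_distance_sum(line, [0.0, 0.5, 1.0], [(0, 0), (1, 1), (2, 0)])
```

Only the middle data point lies off the line, at distance 1, so the sum here is 1. In the GA, a function of this kind would be evaluated once per decoded chromosome.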

Curve Fitting
Detection of the corner points leads to the subdivision of the boundary of the image into segments. The interpolating spline of Section 3.1 is then used to approximate each segment of the boundary. This spline has the parameters v and w in its description. The initial values of v and w are selected randomly. After an initial approximation for the segment is obtained, the GA is run to obtain the optimal values of v and w; successive generations give progressively better approximations. The tangent vectors at the knots are estimated by the method described in Section 3.3.

Breaking Segment
For some segments, the best fit obtained through iterative improvement may not be satisfactory. In that case, we subdivide the segment into smaller segments at points where the distance between the boundary and the parametric curve exceeds some predefined threshold; such points are termed intermediate points. A new parametric curve is fitted to each new segment, as shown, for example, in Figure 6(e). Table 2 presents the number of intermediate points obtained while fitting the optimized cubic spline for different iterations of the GA.
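The search for an intermediate point can be sketched as follows. Choosing the single farthest point above the threshold is an assumption made for this illustration, as are the names `find_break_index` and `line`; the paper states only that segments are broken where the distance exceeds the threshold.

```python
import math


def find_break_index(curve, params, data, threshold):
    """Index of the data point farthest from the curve, or None.

    A point qualifies only if its distance exceeds the threshold.
    """
    worst_j, worst_d = None, threshold
    for j, (u, (x, y)) in enumerate(zip(params, data)):
        cx, cy = curve(u)
        d = math.hypot(cx - x, cy - y)
        if d > worst_d:
            worst_j, worst_d = j, d
    return worst_j


# Stand-in curve: the straight line from (0, 0) to (2, 0).
line = lambda u: (2.0 * u, 0.0)
params = [0.0, 0.5, 1.0]
data = [(0, 0), (1, 4), (2, 0)]
break_at = find_break_index(line, params, data, threshold=3.0)
```

The segment would then be split at the returned index and each half refitted, repeating until no point exceeds the threshold.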

Demonstration
The curve fitting scheme proposed in Section 5 has been implemented on different images. In Figure 6, (c) demonstrates the corner points, and (d), (e) and (f) give the fitted outline for the 1st, 2nd and final iterations, respectively, for threshold 3 using Genetic Algorithms, together with corner points and intermediate points. Figures 7 and 8 can be described in a similar fashion. The time elapsed in applying the proposed scheme to different images is given in Table 2.

Conclusion
In this paper, a scheme is presented which vectorizes generic shapes. A cubic function is used for curve fitting, and a soft computing technique, the genetic algorithm, is used to find optimal values of the parameters in the description of the cubic function. The proposed method starts with an initial random population of parameters and finds those values of the parameters which assure an optimal curve fit to the data extracted from bitmap images. The scheme is automatic and requires no human intervention. It also ensures computational efficiency as far as curve fitting is concerned.
Copyright © 2013 SciRes. JSEA

Figure 1. Demonstration of cubic function (1) for different values of parameters.
Once we have the bitmap images shown in Figures 6(a), 7(a) and 8(a), the method of Section 2 is used to extract the boundary of each image. The boundary is then used to detect the corner points in the next phase, using the corner detection method pointed out in Section 2. Figures 6(b), 7(b) and 8(b) show the boundaries of the bitmap images, and Figures 6(c), 7(c) and 8(c) show the detected corner points.

Figure 6. Image of plane with detected corner points and fitted cubic curve. (a) Bitmap image; (b) Boundary extracted; (c) Corners detected; (d) Cubic curve interpolated to corner points for 1st iteration of GA; (e) Cubic curve interpolated to corner points for 2nd iteration of GA with breakpoints; (f) Cubic curve interpolated to corner points for final (5th) iteration of GA with breakpoints.

Figure 7. Image of fork with detected corner points and fitted cubic curve. (a) Bitmap image; (b) Boundary extracted; (c) Corners detected; (d) Cubic curve interpolated to corner points for 1st iteration of GA; (e) Cubic curve interpolated to corner points for 2nd iteration of GA with breakpoints; (f) Cubic curve interpolated to corner points for final (4th) iteration of GA with breakpoints.

Figure 8. Image of fish with detected corner points and fitted cubic curve. (a) Bitmap image; (b) Boundary extracted; (c) Corners detected; (d) Cubic curve interpolated to corner points for 1st iteration of GA; (e) Cubic curve interpolated to corner points for 2nd iteration of GA with breakpoints; (f) Cubic curve interpolated to corner points for final (4th) iteration of GA with breakpoints.
Figures 9-13 show the behavior of the fitness function for the image of the fish over repeated runs of the GA. Figure 9 shows that the minimum value of the cost function is achieved after iteration 20, whereas Figures 10 and 11 indicate that the minimum of the fitness function is obtained at iteration 10 and iteration 5, respectively. Figures 12 and 13 depict somewhat different behavior: in these cases the fitness function initially increases and then starts decreasing. Figure 14 gives the stopping criteria followed in running the GA, and Figure 15 shows the best (^), worst (o) and mean (*) values of the objective function in each iteration for the image of the fish. A flowchart of the proposed algorithm is given in Figure 16.

Figure 12. Mixed behavior of fitness function.

Figure 13. Increasing and decreasing fitness function.

Figure 14. Stopping criteria met by GA in %.