The Research of Examination Paper Generation Based on Index System Metrics and Multi-Objective Strategy

Examination papers generated by computer with random or backtracking algorithms tend to be of poor quality or are produced inefficiently, and the index system metrics make computer-based paper generation a multi-objective problem. A genetic algorithm with multi-objective strategy optimization is therefore proposed to solve this problem. The algorithm maps the index system to multi-objective functions and uses a multi-objective strategy to optimize the computation. Experiments with the genetic algorithm based on multi-objective strategy optimization show that it achieves a tradeoff between performance and quality, and that performance and quality can be tuned to meet the user's requirements.


Introduction
The technology of generating examination papers by computer is an important part of modern education. It underpins other applications such as intelligent tutoring systems [1], distance learning systems [2], computerized adaptive testing systems [3], and e-learning systems [4]. In recent years, many researchers have focused on algorithms for searching for items and composing a paper [5][6][7][8][9].
Generating an examination paper by computer is defined as searching for items in an item bank and composing them into a paper that follows the paper index system. The process is characterized by a huge search space, complex computation, and multiple objectives [5]. Many algorithms have been proposed to solve this problem, such as the random algorithm, the backtracking algorithm, and intelligent algorithms [6,7].
The random algorithm searches the item bank and composes a paper through a random search process. It is an efficient way to obtain a paper, but it cannot guarantee quality. The backtracking algorithm searches all possible combinations of items in the item bank and composes a paper from the best result. It can find the optimal result, but it cannot produce a result in practice when the item bank holds a large number of items. Intelligent algorithms compose a paper by imitating processes such as biological evolution. Representative intelligent algorithms are the genetic algorithm, the artificial fish swarm algorithm, etc.
In this paper, a genetic algorithm with multi-objective optimization is proposed to solve this problem. Firstly, the index system is mapped to multi-objective functions. Secondly, a multi-objective strategy is used to optimize the computing process. Experiments show that the proposed algorithm can control the scale of computation and achieve a tradeoff between performance and quality.

Examination Paper Model
An examination paper is regarded as a collection of items, so paper P and item bank B are denoted as matrices of size m × n and s × n respectively. The parameter m is the number of items in paper P, and the parameter s is the number of items in item bank B. The parameter n is the number of objects in the index system, and each column of the matrices corresponds to one object of the index system. To generate the optimized result matrix P (i.e. paper P), the condition m ≤ s must be satisfied.
To generate paper P, a temporary paper S must be created from the item bank B. Paper S has the same structure as paper P, with m rows and n columns. The item bank B can generate a large number of candidate papers S from the huge solution space.
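As an illustration of this model, the following minimal sketch builds an item bank B and draws one candidate paper S from it. It assumes n = 4 with the column layout used in the index system section (rubric type, knowledge, difficulty, reference times). The function names and the attribute ranges are hypothetical and are not taken from the experiments.

```python
import random

# Assumed row layout: (rubric type, knowledge id, difficulty, reference times).
Item = tuple[int, int, int, int]

def make_item_bank(s: int, rng: random.Random) -> list[Item]:
    """Build an s x 4 item bank B with random integer attributes (assumed ranges)."""
    return [
        (rng.randint(1, 4),    # rubric type id
         rng.randint(1, 20),   # knowledge id
         rng.randint(1, 5),    # difficulty level
         rng.randint(0, 3))    # reference times
        for _ in range(s)
    ]

def sample_paper(bank: list[Item], m: int, rng: random.Random) -> list[Item]:
    """Draw a candidate paper S of m distinct bank rows; requires m <= s."""
    if m > len(bank):
        raise ValueError("a paper cannot hold more items than the bank (m <= s)")
    return rng.sample(bank, m)

rng = random.Random(0)
B = make_item_bank(100, rng)
S = sample_paper(B, 10, rng)
print(len(S), len(S[0]))  # rows and columns of S
```

Sampling without replacement keeps every row of S a distinct bank item, which is why the condition m ≤ s is checked before drawing.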

Index System Metrics and Multi-Objective Optimization
The index system outlines the objects (i.e. the constraints) of computer-based examination paper generation. Generally, the index system includes several objects, such as the rubric number, rubric type, knowledge, difficulty distribution, and rubric reference times of the examination paper. Together these objects make the problem multi-objective, and each object gives rise to an object function that measures the degree of approximation from S to P.
From the above, the concrete paper matrices P and S and the item bank B can be defined as follows. Matrices P and S have the same number of rows and columns, where the parameter m denotes m items. Matrix B contains s rows and 4 columns. All three matrices have 4 columns: the first column is the rubric type index, the second is the rubric knowledge index, the third is the difficulty index, and the fourth is the reference times. These indices, such as the rubric type index and the rubric knowledge index, are integers that identify the corresponding attribute of the item.
To characterize the rubric type of paper P, paper S and item bank B, the vectors $T_p$, $T_s$ and $T_b$ are defined. The index of an element of $T_p$, $T_s$ or $T_b$ denotes the identity of a rubric type; for example, $t_1$ denotes rubric type number 1. The elements of $T_p$, $T_s$ and $T_b$ are statistics computed from the first column of matrices P, S and B respectively.
To measure the rubric type approximation degree of papers P and S, the vector distance $F_t$ is defined by Formula (1). $F_t$ is a vector distance between $T_s$ and $T_p$, where $t_i^s$, $t_i^p$ and $t_i^b$ are the i-th elements of $T_s$, $T_p$ and $T_b$ respectively. $F_t$ can measure the rubric type approximation degree between paper P and paper S, because the distance between $T_p$ and $T_s$ becomes smaller as the rubric types of paper P and paper S become more similar. In particular, $F_t$ is 0 when the rubric types of paper P and paper S are the same.
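The rubric type distance can be sketched as follows. This is only an illustration: the Euclidean metric is an assumption, $T_b$ is left out because the sketch does not assume how the original formula uses it, and the function names and sample rows are hypothetical.

```python
from collections import Counter
from math import sqrt

def type_counts(paper, type_ids):
    """Count how many items of each rubric type appear in column 0 of a paper."""
    counts = Counter(row[0] for row in paper)
    return [counts.get(t, 0) for t in type_ids]

def type_distance(target_counts, paper, type_ids):
    """Distance between T_s and T_p; the Euclidean form is an assumption."""
    t_s = type_counts(paper, type_ids)
    return sqrt(sum((s - p) ** 2 for s, p in zip(t_s, target_counts)))

# Rows are (type, knowledge, difficulty, reference times); values are made up.
S = [(1, 7, 2, 0), (1, 3, 3, 1), (2, 5, 1, 0), (3, 9, 4, 0)]
T_p = [2, 1, 1]  # target: two items of type 1, one each of types 2 and 3
print(type_distance(T_p, S, type_ids=[1, 2, 3]))  # 0.0, since S matches the target
```

Any metric that vanishes only when the two count vectors coincide would preserve the property stated in the text, namely that the distance is 0 exactly when the rubric types agree.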
To characterize the knowledge distribution of papers P and S, the collections $K_p$ and $K_s$ are defined. $K_p$ contains p elements taken from the statistics of the second column of matrix P; correspondingly, $K_s$ contains s elements taken from the statistics of the second column of matrix S.
To measure the knowledge distribution approximation degree of papers P and S, $F_k$ is defined through a collection operation in Formula (2). Note that the operator $|x|$ returns the number of elements in collection x, so $F_k$ is the number of knowledge elements that differ between papers P and S. The smaller $F_k$ is, the more similar the knowledge distributions of papers P and S are, and the best value of $F_k$ is 0.
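A minimal sketch of such a set-based measure is given below. It assumes that the knowledge measure counts the elements found in exactly one of the two collections (the symmetric difference); the exact collection operation of the original formula may differ, and the names and sample rows are hypothetical.

```python
def knowledge_set(paper):
    """Collect the distinct knowledge ids found in column 1 of a paper."""
    return {row[1] for row in paper}

def knowledge_distance(k_p, paper):
    """Count knowledge ids that appear in exactly one of K_p and K_s."""
    k_s = knowledge_set(paper)
    return len(k_p ^ k_s)  # symmetric difference: an assumed form of the measure

# Rows are (type, knowledge, difficulty, reference times); values are made up.
S = [(1, 7, 2, 0), (1, 3, 3, 1), (2, 5, 1, 0)]
K_p = {3, 5, 8}
print(knowledge_distance(K_p, S))  # 2: ids 7 and 8 each appear on one side only
```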
To characterize the rubric difficulty distribution of papers P and S, the vectors $D_p$ and $D_s$ are defined. They have the same semantics as $T_p$, $T_s$ and $T_b$: the index of an element denotes the identity of a rubric difficulty level. The elements of $D_p$ and $D_s$ are statistics of the third column of matrix P and matrix S respectively.
To measure the rubric difficulty distribution approximation degree between papers P and S, the vector distance $F_d$ is defined by Formula (3). $F_d$ is the vector distance between $D_s$ and $D_p$, where $d_i^s$ and $d_i^p$ are the i-th elements of $D_s$ and $D_p$ respectively.
$F_d$ also measures the rubric difficulty approximation degree between papers P and S: it becomes smaller as the rubric difficulty distribution of paper S becomes more similar to that of paper P, and it is 0 when the two distributions are the same.
To characterize the rubric reference times of papers P and S, $R_p$ is defined as the sum of the fourth column of matrix P, and $R_s$ as the sum of the fourth column of matrix S.
To measure the reference approximation degree between papers P and S, the distance $F_r$ is defined by Formula (4). Generally, $R_p$ is very small, such as 0.
Formulae (1)-(4) define the object functions according to the index system metrics, and the following conclusions can be drawn from them: (i) the values of $F_t$, $F_k$, $F_d$ and $F_r$ are non-negative; (ii) the importance of $F_t$, $F_k$, $F_d$ and $F_r$ decreases in that order; (iii) the optimum solution is reached when every object function equals 0. According to these conclusions, this is a multi-objective optimization problem, and it can be solved by a multi-objective optimization method.
To solve this problem, the function vector F and the weight vector W are defined as $F = (F_t, F_k, F_d, F_r)$ and $W = (w_t, w_k, w_d, w_r)$. The vector F is composed of $F_t$, $F_k$, $F_d$ and $F_r$, and the weight vector W holds the corresponding weights of the elements of F. In practice, the values of $w_t$, $w_k$, $w_d$ and $w_r$ decrease in the same order as the importance of $F_t$, $F_k$, $F_d$ and $F_r$.
To optimize the multi-objective problem, the multi-objective function $F_0$ is defined by Formula (5). The approximation degree between paper P and paper S can be estimated by the value of $F_0$, because $F_0$ becomes smaller as papers P and S become more alike. From Formula (5), the best value of $F_0$ is 0.
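One common way to combine a function vector with a weight vector is a weighted sum, sketched below. Treating $F_0$ as the weighted sum of the four object functions is an assumption of this sketch, and the weights shown are hypothetical values chosen only so that they decrease in the stated order.

```python
def weighted_objective(f, w):
    """Combine the objective values F = (F_t, F_k, F_d, F_r) with the weights W."""
    if len(f) != len(w):
        raise ValueError("F and W must have the same length")
    return sum(wi * fi for wi, fi in zip(w, f))

# Hypothetical weights, decreasing in the order t, k, d, r.
W = (0.4, 0.3, 0.2, 0.1)

print(weighted_objective((0, 0, 0, 0), W))      # 0.0 for a perfect match
print(weighted_objective((2.0, 1, 0.0, 3), W))  # larger value: a worse candidate
```

With strictly positive weights and non-negative object functions, the combined value is 0 only when every individual object function is 0, which matches the stated best value of 0.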

Genetic Algorithm for Multi-Objective Optimization
To optimize this multi-objective problem with a genetic algorithm [5,9], the coding scheme, selection operation, crossover operation and mutation operation must first be established. In our experiment a binary coding scheme is used, with a binary vector $X_s$ whose elements indicate whether or not the corresponding item is contained in paper S; the parameter m is the number of items in paper S.
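The binary coding can be illustrated with the sketch below. It assumes that the chromosome has one gene per item of the bank and that exactly m genes are set to 1; the helper names `encode` and `decode` are hypothetical.

```python
import random

def encode(selected, bank_size):
    """Binary chromosome: gene i is 1 when bank item i is on the paper."""
    chosen = set(selected)
    return [1 if i in chosen else 0 for i in range(bank_size)]

def decode(chromosome):
    """Recover the bank indices of the items on the paper."""
    return [i for i, gene in enumerate(chromosome) if gene == 1]

rng = random.Random(1)
picked = rng.sample(range(12), 4)  # a paper of m = 4 items from a bank of 12
x_s = encode(picked, 12)
print(x_s, sum(x_s))               # the chromosome and its number of selected items
```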
To establish the selection operation, the multi-objective function $F_0$ is employed as the fitness function, and proportional selection is adopted to form the individual selection probability. The selection probability $P_{is}$ of the i-th individual is formed from the proportion between $F_0^i$, the fitness of the i-th individual in the population of one generation, and the total fitness in the population of that generation.
The crossover operation lets half of the genes of an individual participate in crossover while the remainder stay unchanged. This rule is used to generate two children from two parents. The mutation operator changes gene values across all individuals according to the defined mutation probability $P_m$.
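A roulette-wheel style selection can be sketched as follows. Because a smaller $F_0$ means a better paper, this sketch assumes that each individual is scored as 1 / (1 + $F_0$) before normalising, so that better papers are more likely to be selected. That transformation, like the function names, is an assumption of the sketch and not necessarily the form used in the experiments.

```python
import random

def selection_probabilities(fitness):
    """Turn F_0 values (lower is better) into selection probabilities.

    Assumption: each individual is scored 1 / (1 + F_0) before normalising,
    so a paper closer to the target gets a larger share.
    """
    scores = [1.0 / (1.0 + f) for f in fitness]
    total = sum(scores)
    return [s / total for s in scores]

def roulette_select(population, fitness, k, rng):
    """Draw k parents with replacement, weighted by the selection probabilities."""
    weights = selection_probabilities(fitness)
    return rng.choices(population, weights=weights, k=k)

probs = selection_probabilities([0.0, 1.0, 3.0])
print(probs)  # the individual with F_0 = 0 receives the largest probability
print(roulette_select(["a", "b", "c"], [0.0, 1.0, 3.0], 5, random.Random(0)))
```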
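The two operators can be sketched on binary chromosomes as follows. Which half of the genes is exchanged is an assumption (here the second half), mutation is modelled as an independent bit flip with probability p_m, and no repair step is included to restore the number of selected items afterwards.

```python
import random

def half_crossover(a, b):
    """Swap the second half of two equal-length parents, keeping the first half."""
    if len(a) != len(b):
        raise ValueError("parents must have the same length")
    cut = len(a) // 2
    return a[:cut] + b[cut:], b[:cut] + a[cut:]

def mutate(chromosome, p_m, rng):
    """Flip each binary gene independently with probability p_m."""
    return [1 - g if rng.random() < p_m else g for g in chromosome]

p1, p2 = [1, 1, 0, 0, 1, 0], [0, 0, 1, 1, 0, 1]
c1, c2 = half_crossover(p1, p2)
print(c1, c2)  # [1, 1, 0, 1, 0, 1] [0, 0, 1, 0, 1, 0]
print(mutate(c1, 0.1, random.Random(0)))
```

Since both operators can change how many genes are set, a full implementation would need either a repair step or an object function that penalises papers with the wrong number of items.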

Experiments and Evaluation
To evaluate the performance of the proposed algorithm, two experiments were conducted to compare the performance and quality of the random algorithm and the genetic algorithm based on multi-objective strategy optimization. The first experiment uses the random algorithm to generate several papers, and the second uses the genetic algorithm to generate a number of papers. Both are evaluated with Formula (5).
For the experiments, an item bank was set up by randomly creating 10,000 items. The same item bank is used in both experiments. The random algorithm's parameters are listed in Table 1, the multi-objective parameters in Table 2, and the genetic algorithm's parameters in Table 3.
Figure 1 shows the papers generated by the random algorithm, and Figure 2 shows those generated by the genetic algorithm. The two figures show that the random algorithm generates a paper quickly but its fitness hardly approaches 0. By contrast, the genetic algorithm with multi-objective optimization reaches the best fitness of 0 within a number of generations.
In Figure 2, the dotted curve of best fitness trends downward from generation 0 to generation 43 and reaches the best fitness of 0 at generation 43. The user can choose a generated paper from any generation between 15 and 43, and thereby trade off performance against quality.
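The tradeoff can be illustrated by stopping at the first generation whose best fitness falls within a user-chosen tolerance. The sketch below is hypothetical: the function name, the tolerance values and the fitness curve are made up and are not the data of Figure 2.

```python
def first_acceptable_generation(best_fitness, tolerance):
    """Return the first generation whose best F_0 is within the tolerance, or None."""
    for generation, fitness in enumerate(best_fitness):
        if fitness <= tolerance:
            return generation
    return None

history = [5.0, 3.2, 1.1, 0.4, 0.4, 0.0]  # made-up best-fitness curve
print(first_acceptable_generation(history, 0.5))  # 3
print(first_acceptable_generation(history, 0.0))  # 5
```

A looser tolerance stops earlier and saves computation, while a tolerance of 0 waits for an exact match with the index system.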

Conclusion and Future Work
In this paper, a multi-objective strategy with a genetic algorithm is proposed to solve the problem of generating examination papers by computer. The object functions are defined according to the index system metrics, and the multi-objective function is defined from the object functions. To generate a paper that meets the index system metrics with the genetic algorithm, the multi-objective function is used as the fitness function. The experiments show that the multi-objective strategy with a genetic algorithm achieves a tradeoff between performance and quality and allows performance and quality to be tuned.
In future work, more attention should be paid to the convergence gradient of the multi-objective function.