A Weight-coded Evolutionary Algorithm for the Multidimensional Knapsack Problem

A revised weight-coded evolutionary algorithm (RWCEA) is proposed for solving multidimensional knapsack problems. This RWCEA uses a new decoding method and incorporates a heuristic method in initialization. Computational results show that the RWCEA performs better than the weight-coded evolutionary algorithm proposed by Raidl (1999) and that, on some existing benchmarks, it yields better results than those reported in the OR-Library.

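The multidimensional knapsack problem (MKP) can be stated as:

$$\text{maximize} \quad \sum_{j=1}^{n} p_j x_j \tag{1a}$$

$$\text{subject to} \quad \sum_{j=1}^{n} r_{ij} x_j \le b_i, \quad i = 1, \ldots, m, \tag{1b}$$

$$x_j \in \{0, 1\}, \quad j = 1, \ldots, n. \tag{1c}$$

Each of the $m$ constraints described in (1b) is called a knapsack constraint.
A set of $n$ items with profits $p_j > 0$ and $m$ resources with capacities $b_i > 0$ are given. Each item $j$ consumes an amount $r_{ij} \ge 0$ from each resource $i$. The 0-1 decision variables $x_j$ indicate which items are selected. A well-stated MKP also assumes that $r_{ij} \le b_i < \sum_{j=1}^{n} r_{ij}$ and $p_j > 0$ for all $i \in I = \{1, \ldots, m\}$, $j \in J = \{1, \ldots, n\}$, since any violation of these conditions would allow some constraints to be eliminated or some $x_j$ to be fixed.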
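As a minimal sketch of the problem definition above, the following Python functions evaluate the objective and check the knapsack constraints for a given 0-1 vector. The function names and the small instance are hypothetical and chosen only for illustration.

```python
def mkp_profit(x, p):
    """Objective value: total profit of the selected items."""
    return sum(p_j * x_j for p_j, x_j in zip(p, x))


def mkp_is_feasible(x, r, b):
    """True if x respects every resource limit b[i] given consumptions r[i][j]."""
    return all(
        sum(r_ij * x_j for r_ij, x_j in zip(r_i, x)) <= b_i
        for r_i, b_i in zip(r, b)
    )


# Hypothetical instance with n = 4 items and m = 2 resources.
p = [10, 7, 5, 3]
r = [[4, 3, 2, 1],
     [2, 4, 1, 3]]
b = [6, 6]

x = [1, 0, 1, 0]
print(mkp_is_feasible(x, r, b), mkp_profit(x, p))
```

The instance satisfies the well-stated conditions: every single consumption fits within its capacity, and each capacity is smaller than the total consumption of its resource.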
The MKP degenerates to the knapsack problem when $m = 1$ in (1b). It is well known that the knapsack problem is not strongly NP-hard and is solvable in pseudo-polynomial time. However, the situation is different in the general case $m > 1$: Garey and Johnson (1979) [1] proved that the MKP is strongly NP-hard, and exact techniques are in practice only applicable to instances of small to moderate size.
Many practical problems, such as the capital budgeting problem [2], allocating processors and databases in a distributed computer system [3], project selection and cargo loading [4], and cutting stock problems [5], can be formulated as an MKP. The MKP is also a subproblem of many general integer programs.
Given the practical and theoretical importance of the MKP, a large number of papers have been devoted to the problem, and this is not the place to recall all of them. We refer to the papers of Chu and Beasley (1998) [6] and Fréville (2004) [7] and to the monograph of Kellerer (2004) [8] for excellent overviews of theoretical analysis, exact methods, and heuristics for the MKP. Recently, some new algorithms for the MKP have been proposed, such as variants of the genetic algorithm [9], the ant colony algorithm [10], the scatter search method [11], and some new heuristics [12-15]. Some studies on the analysis of the MKP [16,17] and on generalizations of the MKP [18-20] have also been put forward.
An evolutionary algorithm (EA) is a generic population-based metaheuristic optimization algorithm. Candidate solutions to the optimization problem play the role of individuals (parents) in a population. Mechanisms inspired by biological evolution, namely selection, crossover, and mutation, are applied to them. The fitness function determines the environment within which the solutions "survive". New members of the population (children) are then generated by repeated application of these operators.
In the last two decades, EAs have been studied for solving the MKP. Although the early works did not succeed in showing that genetic algorithms (GAs) are an effective tool for the MKP, the first successful GA implementation was proposed by Chu and Beasley (1998) [6]. Extensive numerical comparisons with CPLEX (version 4.0) and other heuristic methods showed that Chu and Beasley's GA behaves robustly and can obtain high-quality solutions within a reasonable amount of computational time. Raidl and Gottlieb (2005) [17] introduced and compared six different EAs for the MKP, and performed static and dynamic analyses explaining the success or failure of these algorithms. From their empirical analysis, they concluded that an EA based on direct representation combined with local heuristic improvement (referred to as DIH in [17], i.e., the GA of Chu and Beasley (1998) [6] with slight revision) can achieve better performance than the other EAs considered in [17].
The greatest success in solving the MKP, as far as we know, has been obtained with tabu-search algorithms embedding effective preprocessing [21,22]. Recently, impressive results have also been obtained by an implicit enumeration [23], a convergent algorithm [24], and an exact method based on a multilevel search strategy [25]. Compared with EAs, the methods mentioned above can yield better results when excellent solutions are required. However, they are more complicated to implement, or their computation takes an extremely long time. Since EAs are simple to implement and their computation time is easy to control, they are good alternatives if the requirement on the solution quality of the MKP is not very strict.
In this paper, we consider a variant of the EA for solving the MKP. This EA uses a special encoding technique called weight-coding (or weight-biasing). We improve a weight-coded EA (WCEA) proposed by Raidl (1999) [26] and propose an improved weight-coded EA (IWCEA). Numerical experiments on some benchmarks show that the IWCEA performs better than the WCEA and can compete with DIH on some benchmarks. Moreover, on the same platform, the IWCEA's iteration time is shorter than that of the other EAs listed in [17].
2 An Introduction to the weight-coding and its application to the MKP

When combinatorial optimization problems are solved by an EA, the coding of candidate solutions is a preliminary step. Direct coding, such as binary coding, is an intuitive method. The main drawback of this coding is that many infeasible solutions may be generated by the EA's operators. To avoid that, the basic idea of the weight-coding is to represent a candidate solution by a vector of real-valued weights $w_j$ ($j = 1, \ldots, n$). The phenotype that a weight vector represents is obtained by a two-step process.
Step (a): (biasing) The original problem $P$ is temporarily modified to $P'$ by biasing the problem parameters of $P$ according to the weights $w_j$;
Step (b): (decoding heuristic) A problem-specific decoding heuristic is used to generate a solution to $P'$. This solution is interpreted and evaluated for the original (unbiased) problem $P$.
The weight-coding is an interesting approach because it eliminates the need for an explicit repair algorithm, a penalization of infeasible solutions, or special crossover and mutation operators. It has already been used successfully for a variety of problems, such as the optimum communications spanning tree problem [27], the problem studied in [28], the traveling salesman problem [29], and the multiple container packing problem [30].
To the best of the authors' knowledge, the work of Raidl (1999) [26] is the first to use a weight-coded EA (WCEA) to deal with the MKP. In that paper, several variants of WCEAs were proposed and compared. Raidl finally recommended one of them, and this WCEA was compared with other EAs in [17]. In this WCEA, the weight vector $(w_1, \ldots, w_n)$ represents a candidate solution, and weight $w_j$ is associated with item $j$ of the MKP. Corresponding to Step (a), the original MKP is biased by multiplying the profits in (1a) by log-normally distributed weights:

$$p'_j = p_j w_j, \qquad w_j = (1 + \gamma)^{N(0,1)}, \qquad j = 1, \ldots, n, \tag{2}$$

where $N(0, 1)$ denotes a normally distributed random number with mean 0 and standard deviation 1, and $\gamma > 0$ is a strategy parameter that controls the average intensity of biasing. Raidl (1999) [26] suggested $\gamma = 0.05$. Since the resource consumption values $r_{ij}$ and the resource limits $b_i$ are not modified, all feasible solutions of the biased MKP are feasible for (1).
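The biasing step can be sketched in a few lines of Python. This is an illustration only: it assumes the log-normal weights take the form $(1 + \gamma)^{N(0,1)}$, and the function names and the profit vector are hypothetical.

```python
import random


def lognormal_weight(gamma, rng):
    """Draw one weight (1 + gamma) ** N(0, 1); the exact form is an assumption."""
    return (1.0 + gamma) ** rng.gauss(0.0, 1.0)


def bias_profits(p, w):
    """Step (a): multiply each profit by its weight; r and b stay untouched."""
    return [p_j * w_j for p_j, w_j in zip(p, w)]


rng = random.Random(0)
p = [12, 12, 9, 8, 8]                       # hypothetical profits
w = [lognormal_weight(0.05, rng) for _ in p]
biased = bias_profits(p, w)
print(biased)
```

Because only the profits change, any solution that is feasible for the biased problem is feasible for the original one, which is why no repair step is needed.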

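Corresponding to Step (b), the decoding heuristic suggested by Raidl (1999) [26] makes use of the surrogate relaxation (see [31,32]). The $m$ resource constraints (1b) are aggregated into a single constraint using surrogate multipliers $a_i$, $i = 1, \ldots, m$:

$$\sum_{j=1}^{n} \Bigl( \sum_{i=1}^{m} a_i r_{ij} \Bigr) x_j \le \sum_{i=1}^{m} a_i b_i, \tag{3}$$

where the $a_i$ are obtained by solving the linear programming (LP) relaxation of the MKP, in which the variables $x_j$ may take real values in $[0, 1]$. The values of the dual variables are then used as surrogate multipliers, i.e., $a_i$ is set to the shadow price of the $i$-th constraint in the LP-relaxed MKP. Pseudo-utility ratios are defined as:

$$u_j = \frac{p'_j}{\sum_{i=1}^{m} a_i r_{ij}}. \tag{4}$$

A higher pseudo-utility ratio heuristically indicates that an item is more efficient. After the items are sorted in decreasing order of $u_j$, the first-fit strategy, which is also used as the decoder in the permutation representation, is applied. All items are checked one by one, and each item's variable $x_j$ is set to 1 if no resource constraint is violated; otherwise, $x_j$ is set to 0.

Raidl's WCEA can be described as follows (we will explain the details of Steps 6, 7, and 8 afterward):

Algorithm of Raidl's WCEA
Step 1: set $t := 0$;
Step 2: initialize $pop(t) = \{S_1, \ldots, S_N\}$, $S_i = (w_1, \ldots, w_n)$, where each $w_j$ is a random value following the log-normal distribution as in (2);
Step 3: evaluate $pop(t)$: $\{f(S_1), \ldots, f(S_N)\}$;
Step 4: find $S^* \in pop(t)$ s.t. $f(S^*) \ge f(S)$, $\forall S \in pop(t)$;
Step 5: while $t < t_{\max}$ do
Step 6:   select $\{p_1, p_2\}$ from $pop(t)$;
Step 7:   crossover $p_1$ and $p_2$ to generate a child $C$;
Step 8:   mutate $C$;
Step 9:   evaluate $C$ as in Step 3, get $P(C)$ and $f(C)$;
Step 10:  if $P(C) \equiv$ any $P(S_i)$ then (that means $C$ is a duplicate of a member of the population)
Step 11:    discard $C$ and goto Step 6;
          end if
Step 12:  find $S' \in pop(t)$ s.t. $f(S') \le f(S)$ $\forall S \in pop(t)$ and replace $S' \leftarrow C$; (steady-state replacement, i.e., the worst individual of the population is replaced)
Step 13:  if $f(C) > f(S^*)$ then
Step 14:    $S^* \leftarrow C$; (update best solution $S^*$ found)
          end if
Step 15:  $t := t + 1$;
        end while
Step 16: return $S^*$, $f(S^*)$.

In Step 6, binary tournament selection is used. That is, two tournament pools, each consisting of 2 individuals drawn randomly from the population, are formed first. Then the individual with the best fitness in each pool is chosen, and these two individuals become the parents.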
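The surrogate-based decoding described above can be sketched as follows. This is a hedged illustration, not Raidl's implementation: the pseudo-utility ratio is assumed to be the biased profit divided by the surrogate-weighted consumption, the multipliers `a` are passed in as given numbers instead of being computed from the LP duals, and the instance is hypothetical.

```python
def decode_by_pseudo_utility(biased_p, r, b, a):
    """First-fit decoding over items sorted by decreasing pseudo-utility."""
    n, m = len(biased_p), len(b)
    # Aggregated consumption of item j under the surrogate multipliers a.
    denom = [sum(a[i] * r[i][j] for i in range(m)) for j in range(n)]

    def utility(j):
        # Items with zero aggregated consumption are ranked first (assumption).
        return biased_p[j] / denom[j] if denom[j] > 0 else float("inf")

    order = sorted(range(n), key=utility, reverse=True)
    used = [0] * m
    x = [0] * n
    for j in order:
        # Accept item j only if it fits into every resource.
        if all(used[i] + r[i][j] <= b[i] for i in range(m)):
            x[j] = 1
            for i in range(m):
                used[i] += r[i][j]
    return x


# Hypothetical instance; a holds placeholder multipliers, not real LP duals.
biased_p = [10, 7, 5, 3]
r = [[4, 3, 2, 1],
     [2, 4, 1, 3]]
b = [6, 6]
a = [1.0, 0.5]
x = decode_by_pseudo_utility(biased_p, r, b, a)
print(x)
```

Every vector produced this way is feasible by construction, since an item is accepted only when all running resource totals stay within their limits.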
In Step 7, Raidl (1999) [26] suggested uniform crossover instead of one- or two-point crossover. In uniform crossover, two parents produce one child. Each $w_j$ ($j = 1, \ldots, n$) in the child is copied at random from the corresponding weight of one or the other parent.
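Selection and crossover as described above can be sketched as follows. The function names and fitness values are hypothetical, and drawing the two tournament members with replacement is an assumption of this sketch.

```python
import random


def binary_tournament(population, fitness, rng):
    """Draw two individuals at random and return the fitter one."""
    i = rng.randrange(len(population))
    j = rng.randrange(len(population))
    return population[i] if fitness[i] >= fitness[j] else population[j]


def uniform_crossover(parent1, parent2, rng):
    """Copy each weight from one parent or the other with equal probability."""
    return [w1 if rng.random() < 0.5 else w2
            for w1, w2 in zip(parent1, parent2)]


rng = random.Random(0)
population = [[rng.random() for _ in range(5)] for _ in range(4)]
fitness = [24, 25, 21, 20]          # hypothetical fitness values
parent1 = binary_tournament(population, fitness, rng)
parent2 = binary_tournament(population, fitness, rng)
child = uniform_crossover(parent1, parent2, rng)
print(child)
```

Since the child only recombines existing weights, crossover never leaves the range of values already present at each position.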
Once a child has been generated through crossover, the mutation in Step 8 is performed. With a small probability, each $w_j$ of the child is reset to a new random value following the log-normal distribution (3/n per weight as in [26], or one random position as in [17]).
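Both mutation variants mentioned above can be sketched as follows. The log-normal draw is assumed to have the form $(1 + \gamma)^{N(0,1)}$, and the function names are hypothetical.

```python
import random


def new_weight(gamma, rng):
    """Fresh log-normal weight; the form (1 + gamma) ** N(0, 1) is assumed."""
    return (1.0 + gamma) ** rng.gauss(0.0, 1.0)


def mutate_per_weight(w, gamma, rng):
    """Reset each weight independently with probability 3/n."""
    rate = 3.0 / len(w)
    return [new_weight(gamma, rng) if rng.random() < rate else w_j
            for w_j in w]


def mutate_one_position(w, gamma, rng):
    """Reset exactly one randomly chosen weight."""
    child = list(w)                 # copy, so the parent stays unchanged
    child[rng.randrange(len(child))] = new_weight(gamma, rng)
    return child


rng = random.Random(0)
w = [1.0] * 10
child_a = mutate_per_weight(w, 0.05, rng)
child_b = mutate_one_position(w, 0.05, rng)
print(child_a)
print(child_b)
```

With the 3/n rate, three weights are reset on average per child regardless of the instance size.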
In the numerical experiments, $N$ in Step 2 is set to 100 and $t_{\max}$ in Step 5 is set to $10^6$. Raidl and Gottlieb (2005) [17] compared this WCEA with five other EAs for the MKP. In their empirical analysis, this WCEA outperformed all of them on average except DIH (defined in Section 1).
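The steady-state scheme (duplicate rejection, replacement of the worst individual, and counting only accepted children) can be sketched generically as below. This is a sketch under stated assumptions, not the authors' code: the operator functions are passed in as parameters, the `max_attempts` cap is an addition to guarantee termination on tiny instances, and the toy instance, operators, and parameter values are chosen only for illustration.

```python
import random


def steady_state_ea(population, decode, profit, crossover, mutate,
                    t_max, rng, max_attempts=None):
    """Steady-state EA: reject duplicate phenotypes, replace the worst member."""
    population = [list(s) for s in population]
    phenotypes = [tuple(decode(s)) for s in population]
    fitness = [profit(x) for x in phenotypes]
    best = max(range(len(population)), key=lambda i: fitness[i])
    best_x, best_f = phenotypes[best], fitness[best]
    # Safety cap on rejected duplicates; an addition of this sketch.
    if max_attempts is None:
        max_attempts = 100 * t_max

    def tournament():
        i = rng.randrange(len(population))
        j = rng.randrange(len(population))
        return population[i] if fitness[i] >= fitness[j] else population[j]

    t = attempts = 0
    while t < t_max and attempts < max_attempts:
        attempts += 1
        child = mutate(crossover(tournament(), tournament(), rng), rng)
        x = tuple(decode(child))
        if x in phenotypes:          # duplicate phenotype: discard, reselect
            continue
        f = profit(x)
        worst = min(range(len(population)), key=lambda i: fitness[i])
        population[worst], phenotypes[worst], fitness[worst] = child, x, f
        if f > best_f:
            best_x, best_f = x, f
        t += 1                       # only accepted children are counted
    return list(best_x), best_f


# Toy usage on a small single-constraint instance (illustration only).
P = [12, 12, 9, 8, 8]
R = [11, 12, 10, 10, 10]
B = 30


def decode(w):
    """First-fit over items sorted by decreasing biased profit P[j] * w[j]."""
    x, used = [0] * len(P), 0
    for j in sorted(range(len(P)), key=lambda j: P[j] * w[j], reverse=True):
        if used + R[j] <= B:
            x[j], used = 1, used + R[j]
    return x


def profit(x):
    return sum(p_j * x_j for p_j, x_j in zip(P, x))


def crossover(u, v, rng):
    return [a if rng.random() < 0.5 else b for a, b in zip(u, v)]


def mutate(w, rng):
    w = list(w)
    w[rng.randrange(len(w))] = rng.random()
    return w


rng = random.Random(1)
start = [[rng.random() for _ in P] for _ in range(6)]
best_x, best_f = steady_state_ea(start, decode, profit, crossover, mutate, 50, rng)
print(best_x, best_f)
```

Passing the operators as parameters keeps the loop independent of the encoding, so the same skeleton serves both the log-normal and the uniform weight variants.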
3 An Improved WCEA for the MKP

Motivation
The core of Raidl's WCEA is the surrogate-relaxation-based heuristic used in decoding. In our view, this heuristic has two drawbacks. First, the dual variables of the LP-relaxed MKP used in the heuristic decoding step are merely approximations of the optimal surrogate multipliers, which may mislead the search [21], and deriving optimal surrogate multipliers is a difficult task in practice [33]. Second, the heuristic decoding might mislead the search if the optimal solution is not very similar to the solution generated by applying the greedy heuristic [34].
In order to avoid using surrogate multipliers, we let every $w_j$ ($j = 1, \ldots, n$) follow a uniform distribution on $[0, p_{\max}/p_j]$, where $p_{\max} = \max\{p_j : j = 1, \ldots, n\}$. The profits of the original MKP are biased by multiplying them by the weights:

$$p'_j = p_j w_j, \quad j = 1, \ldots, n. \tag{5}$$

As mentioned in Section 2, all feasible solutions of this biased MKP are feasible for (1). In the decoding heuristic, we also use the first-fit strategy, i.e., the items are sorted in decreasing order of $p'_j$ (not by the pseudo-utility ratio in (4)) and traversed. Each item's variable $x_j$ is set to 1 if no resource constraint is violated. The computational effort of the decoder is also O(n

This form of $w_j$ is similar to the idea of the Random-key Representation [35]. Surrogate multipliers are thereby avoided, but the efficiency of the EA is reduced [17]. To overcome this disadvantage, our idea is to obtain a "good" initial population. In the following, we first introduce an idea proposed by Vasquez and Hao [21] and then propose our method.

It is well known that relaxing only the integrality constraints in an MKP may not be sufficient, because the optimal solution of the relaxation may be far away from the optimal binary solution. However, Vasquez and Hao [21] observed that when the integrality constraints are replaced by a hyperplane constraint $\sum_{j=1}^{n} x_j = k$, $k \in \mathbb{N}$, the corresponding linear programming solution is often close to the optimal binary solution. For example, following [21], let $n = 5$, $m = 1$, $p = \{12, 12, 9, 8, 8\}$, $r = \{11, 12, 10, 10, 10\}$, and $b = 30$ in (1). The relaxed linear programming problem leads to the fractional optimal solution $x^{LP} = \{1, 1, 0.7, 0, 0\}$, while the optimal binary solution is $x = \{0, 0, 1, 1, 1\}$. If we replace the integrality constraints by $\sum_{j=1}^{n} x_j = 3$, the linear programming problem leads to the optimal binary solution.
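To see why the hyperplane constraint yields the binary optimum in this example, the resource constraint can be rewritten using $\sum_{j} x_j = 3$. The short derivation below is a worked check of the example's numbers and introduces no additional assumptions.

```latex
\begin{align*}
\sum_{j=1}^{5} r_j x_j
  &= 11 x_1 + 12 x_2 + 10 (x_3 + x_4 + x_5) \\
  &= 10 \sum_{j=1}^{5} x_j + x_1 + 2 x_2
   = 30 + x_1 + 2 x_2 \;\le\; 30
  \;\Longrightarrow\; x_1 = x_2 = 0 .
\end{align*}
% With x_1 = x_2 = 0 and 0 <= x_j <= 1, the hyperplane forces
% x_3 = x_4 = x_5 = 1, with profit 9 + 8 + 8 = 25.
% Without the hyperplane, the LP optimum (1, 1, 0.7, 0, 0) has value
% 12 + 12 + 0.7 * 9 = 30.3.
```

The hyperplane therefore leaves a single feasible point, which is the optimal binary solution.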
In the above example, if we take $w = \{0, 0, 1, 1, 1\}$ and substitute it into (5), the optimal binary solution is obtained by the first-fit heuristic mentioned above. Moreover, if we do not restrict $k$ to be an integer, we may also obtain linear programming solutions from which good binary solutions can be derived by the first-fit heuristic. We use these linear programming solutions as a "good" initial population, so that the disadvantage of the Random-key Representation may be overcome. The experimental results presented later confirm this hypothesis. Naturally, the hypothesis does not exclude the possibility that there exists an MKP whose optimal binary solution cannot be obtained from linear programming solutions.
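The claim for this example can be checked with a short Python sketch of first-fit decoding by biased profits. The function name is hypothetical, and the brute-force search is included only to confirm the optimum of this five-item instance.

```python
from itertools import product


def decode_first_fit(p, w, r, b):
    """Sort items by decreasing biased profit p[j] * w[j], then apply first-fit."""
    n, m = len(p), len(b)
    biased = [p[j] * w[j] for j in range(n)]
    order = sorted(range(n), key=lambda j: biased[j], reverse=True)
    used = [0] * m
    x = [0] * n
    for j in order:
        if all(used[i] + r[i][j] <= b[i] for i in range(m)):
            x[j] = 1
            for i in range(m):
                used[i] += r[i][j]
    return x


# Instance from the example in the text.
p = [12, 12, 9, 8, 8]
r = [[11, 12, 10, 10, 10]]
b = [30]

x_weighted = decode_first_fit(p, [0, 0, 1, 1, 1], r, b)
x_unbiased = decode_first_fit(p, [1, 1, 1, 1, 1], r, b)

# Brute-force optimum over all 2**5 binary vectors.
feasible = (c for c in product((0, 1), repeat=5)
            if sum(r[0][j] * c[j] for j in range(5)) <= b[0])
optimum = max(feasible, key=lambda c: sum(p[j] * c[j] for j in range(5)))

print(x_weighted, x_unbiased, list(optimum))
```

With all weights equal to 1, the decoder behaves as a plain greedy by profit and selects the first two items (profit 24), whereas the weight vector taken from the hyperplane LP solution leads to the optimum (profit 25).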
Inspired by this idea, initialization is guided by the LP relaxation with a hyperplane constraint. To begin with, we use a simple heuristic (such as a greedy algorithm) to obtain a 0-1 lower bound $z$. Next, two problems are solved to obtain $k_{\max}$ and $k_{\min}$, the upper and lower limits for the right-hand side $k$ of the hyperplane constraint.
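Then, $N$ linear programming problems of the form

$$\text{maximize} \ \sum_{j=1}^{n} p_j x_j \quad \text{subject to} \ \sum_{j=1}^{n} r_{ij} x_j \le b_i \ (i = 1, \ldots, m), \quad \sum_{j=1}^{n} x_j = k', \quad 0 \le x_j \le 1 \ (j = 1, \ldots, n)$$

are solved, where $k'$ is a real number generated randomly from $[k_{\min}, k_{\max}]$ in each computation. The $N$ linear programming solutions obtained in this way form the initial population.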

Implementation
The scheme of the IWCEA is similar to that of Raidl's WCEA, and we take the same values of $N$ and $t_{\max}$ as in the WCEA. The differences between the two algorithms lie in the following aspects:
(1) Each $w_j$ in Raidl's WCEA follows a log-normal distribution, while in the IWCEA it follows a uniform distribution on $[0, p_{\max}/p_j]$, where $p_{\max} = \max\{p_j : j = 1, \ldots, n\}$;
(2) Raidl's WCEA sorts items by pseudo-utility ratios in the heuristic decoding step, while the IWCEA sorts items by biased profits directly;
(3) The initial population in Raidl's WCEA is generated randomly, while in the IWCEA, $N$ linear programming problems are solved to generate it;
(4) In the mutation step of the IWCEA, one random $w_j$ of the child is reset to a new random value following the uniform distribution on $[0, p_{\max}/p_j]$ instead of the log-normal distribution.
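The uniform weight range and the one-position mutation of the IWCEA can be sketched as follows. The function names are hypothetical; note that in the IWCEA the initial population comes from linear programming solutions, so the random vector below only serves to exercise the mutation.

```python
import random


def uniform_weight(p, j, rng):
    """Draw w_j uniformly from [0, p_max / p_j]."""
    return rng.uniform(0.0, max(p) / p[j])


def mutate_uniform(w, p, rng):
    """Reset one randomly chosen weight to a fresh uniform value."""
    child = list(w)
    j = rng.randrange(len(child))
    child[j] = uniform_weight(p, j, rng)
    return child


rng = random.Random(0)
p = [12, 12, 9, 8, 8]
w = [uniform_weight(p, j, rng) for j in range(len(p))]
child = mutate_uniform(w, p, rng)
print(child)
```

A consequence of this range is that every biased profit $p_j w_j$ lies in $[0, p_{\max}]$, so each item can in principle be ranked first or last by the decoder regardless of its original profit.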

Experimental comparison
We use two test suites of MKP benchmark instances for experimental comparison. The first one, referred to as the CB-suite in this paper, was introduced by Chu and Beasley (1998) [6] and is available in the OR-Library. This test suite contains 270 instances, 10 for each combination of $m \in \{5, 10, 30\}$ constraints, $n \in \{100, 250, 500\}$ items, and tightness ratio $\alpha \in \{0.25, 0.5, 0.75\}$. Each problem has been generated randomly such that $b_i = \alpha \sum_{j=1}^{n} r_{ij}$ for all $i = 1, \ldots, m$. Chu and Beasley used their GA (i.e., DIH) to solve these instances and reported their results in the OR-Library. The second MKP benchmark suite, used in [17], was first referenced by [21] and originally provided by Glover and Kochenberger. These instances, called GK01 to GK11, range from 100 to 2500 items and from 15 to 100 constraints. We call this suite the GK-suite in this paper.
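The role of the tightness ratio can be illustrated with a small sketch that derives the capacities from the consumption matrix. The function name and the consumption values are hypothetical; the actual benchmark instances are generated randomly as described in [6].

```python
def capacities(r, alpha):
    """Set each capacity to alpha times the total consumption of its resource."""
    return [alpha * sum(r_i) for r_i in r]


# Hypothetical consumption matrix with m = 2 resources and n = 5 items.
r = [[11, 12, 10, 10, 10],
     [4, 6, 8, 2, 10]]
b = capacities(r, 0.5)
print(b)

# Number of instances: 10 per combination of m, n, and alpha.
n_instances = 10 * len([5, 10, 30]) * len([100, 250, 500]) * len([0.25, 0.5, 0.75])
print(n_instances)
```

A smaller $\alpha$ gives tighter constraints, so fewer items fit into a feasible solution.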
Although some commercial integer linear programming (ILP) solvers, such as CPLEX, can solve ILP problems with thousands of integer variables or even more, the MKP remains rather difficult to handle when an optimal solution is wanted. For the CB-suite, the results in [6] showed that the majority of the instances cannot be solved by CPLEX in a reasonable amount of CPU time and memory. For the GK-suite, which includes still more difficult instances with $n$ up to 2500, Fréville (2004) [7] mentioned that CPLEX cannot tackle these instances. Therefore, the MKP continues to be a challenging problem for commercial ILP solvers.
The best known solutions to these benchmarks, as far as we know, were obtained by Vasquez and Hao (2001) [21] and were improved by Vasquez and Vimont (2005) [22]. Their method is based on tabu search and is time-consuming compared with EAs. Raidl and Gottlieb (2005) [17] tested six different EA variants: Permutation Representation (PE), Ordinal Representation (OR), Random-Key Representation (RK), Weight-Biased Representation (WB, i.e., Raidl's WCEA), and Direct Representation (DI and DIH). We first compare the IWCEA with these EAs except DIH. We use the whole GK-suite and select nine instances (called CB1 to CB9) from the CB-suite, namely the first instance with $\alpha = 0.5$ for each combination of $m$ and $n$.
For a solution $x$, the gap is defined as:

$$\text{gap} = \frac{f(x^{LP}) - f(x)}{f(x^{LP})},$$

where $x^{LP}$ is the optimum of the LP-relaxed problem. The gap is used to measure the quality of $x$.

Table 1: Average gaps of best solutions and their standard deviations of the IWCEA and other EAs
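As a small illustration, the gap can be computed for the five-item example from Section 3. The sketch assumes that the gap is the relative shortfall of the solution value from the LP optimum; the function name is hypothetical.

```python
def gap(f_lp, f_x):
    """Relative shortfall of a solution value f_x from the LP bound f_lp."""
    return (f_lp - f_x) / f_lp


f_lp = 12 + 12 + 0.7 * 9    # value of the LP optimum (1, 1, 0.7, 0, 0)
f_x = 9 + 8 + 8             # value of the binary optimum (0, 0, 1, 1, 1)
print(gap(f_lp, f_x))
```

On this tiny instance even the optimal binary solution has a gap of about 17%, since the LP bound is loose; on the large benchmark instances the LP bound is much tighter relative to the objective value.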
We implemented the IWCEA on a personal computer (Intel Core 2 Duo T5800, 2 GHz, 1.99 GB main memory, Windows XP) using Dev-C++. The initial population was generated with MATLAB. The population size was 100, and each run was terminated after $10^6$ created solution candidates; rejected duplicates were not counted.
Table 1 shows the average gaps of the final solutions and their standard deviations, obtained from 30 independent runs per problem instance by the IWCEA and the other six variants. The results of the other six variants are taken from [17]. The results in Table 1 show that the IWCEA outperformed PE, OR, RK, and DI. On all instances except CB2, CB4, CB5, and GK01, the IWCEA performed as well as or better than Raidl's WCEA. In particular, on GK02 to GK11, the IWCEA performed much better than Raidl's method.
Table 1 also shows that, on average, the IWCEA performed slightly worse than DIH. However, we point out that the IWCEA can yield better results than DIH on some instances. Since the best results can be obtained by CPLEX in the CB-suite when $\{m, n\} = \{5, 100\}$, $\{10, 100\}$, and $\{5, 250\}$, we tested the other 180 instances in the CB-suite. Each instance was computed 30 times, and the best results were compared with the results reported in the OR-Library. The numbers of instances on which the IWCEA yielded better, equal, or worse results than those reported in the OR-Library are given in Table 2. Tables 3 to 8 show the comparison for each instance. These tables show that the results for more than 50% of the instances can be improved by the IWCEA.

Table 2: Numbers of instances on which the IWCEA yielded better, equal, and worse results than those reported in the OR-Library

Conclusion
We have proposed an IWCEA for solving multidimensional knapsack problems. The IWCEA differs from Raidl's WCEA in that surrogate multipliers are not used and a heuristic method is incorporated in initialization. Experimental comparison has shown that the IWCEA can yield better results than Raidl's WCEA in [26] and, on some existing benchmarks, better results than those reported in the OR-Library.

Table 3: The results of the CB-suite reported in the OR-Library (OR$_{CB}$) and the ones obtained by the IWCEA ($m = 30$, $n = 100$)