An Integer Programming Model for the KenKen Problem

In this paper we consider modeling techniques for the mathematical puzzle KenKen. It is an interesting puzzle from a modeling point of view since it has different kinds of mathematical restrictions that are not trivial to express as linear constraints. We give an integer program for solving KenKen and its implementation in the modeling language AMPL. Our integer program uses prime number factorizations to convert product restrictions into linear constraints. It can also be used for teaching various integer programming techniques in an Operations Research course.


Introduction
A KenKen puzzle is a grid of n by n cells (see Figure 1). The goal is to fill the whole grid with numbers 1 to n, making sure no number is repeated in any row or column. An additional feature of KenKen (compared to the similar puzzle Sudoku) is that the grid is partitioned into "cages". Each cage consists of several adjacent cells. The top left corner of each cage has a target number and an arithmetic operation (sum, difference, product, ratio). The numbers entered into a cage must combine (in any order) to produce the target number using the arithmetic operation. An 8 by 8 example of KenKen is given in Figure 1. This is a real example from the KenKen website [1].
Among similar puzzles, KenKen is particularly interesting for mathematicians. Thanks to its mathematical constraints, it creates a different level of interest and challenge for the solver. It is also a more challenging task to create a mathematical model that can solve the puzzle.
While Sudoku has been studied extensively, not much research has been done on KenKen. [2] shows how ideas from number theory can be used to solve KenKen. [3] discusses how KenKen can be used to develop reasoning skills for students at different levels. In this paper we give an integer programming model for solving KenKen. The Latin square constraints were given before for Sudoku [4] [5]. Our contribution is giving linear constraints for sum, difference, product, and ratio restrictions. The product restrictions are the hardest ones to express by linear constraints. We give a non-standard way of using prime number factorizations for product constraints. The other parts of the model use different integer programming techniques, such as Either-Or constraints, converting absolute values to linear constraints, and using auxiliary binary variables. Thus, the model could serve as an example for teaching different integer programming techniques. We implemented the model in the optimization modeling language AMPL [6] and tested it on examples. The implementation requires several data structures, such as the prime number factorization of a number. The AMPL techniques can also be a good teaching tool on how to implement an integer programming model with optimization software.
The paper is organized as follows. Section 2 gives the set of constraints that make the solution a Latin square. Sections 3, 4, 5, 6 cover sum, difference, ratio, and product constraints, respectively. Section 7 gives the AMPL model and an analysis of the solution. An Appendix section gives more details on a secondary method for converting product restrictions to linear constraints.

Latin Square Constraints
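First we need constraints ensuring that the numbers filling the grid form a Latin square, that is, each number occurs exactly once in each row and exactly once in each column. Constraints of this type were given before for Sudoku puzzles [4] [5]. The following binary variables are essential for giving those constraints: $x_{ijk} = 1$ if cell $(i, j)$ gets the value $k$, and $x_{ijk} = 0$ otherwise, for $i, j, k = 1, \ldots, n$.
The following set of constraints ensures that each cell $(i, j)$ gets exactly one value $k$ from $1, \ldots, n$:
$$\sum_{k=1}^{n} x_{ijk} = 1, \quad i, j = 1, \ldots, n. \qquad (2.1)$$
The following set of constraints ensures that each number $k$ occurs exactly once in each row $i$:
$$\sum_{j=1}^{n} x_{ijk} = 1, \quad i, k = 1, \ldots, n. \qquad (2.2)$$
The following set of constraints ensures that each number $k$ occurs exactly once in each column $j$:
$$\sum_{i=1}^{n} x_{ijk} = 1, \quad j, k = 1, \ldots, n. \qquad (2.3)$$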
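The three families of Latin square constraints can be checked on a small grid. The following is a minimal Python sketch, not part of the AMPL model; the helper names `to_binary` and `is_latin` and the 3 by 3 grids are assumptions made for illustration, and indices are 0-based.

```python
def to_binary(grid):
    """Return x[i][j][k] = 1 if grid[i][j] == k + 1 (0-based indices), else 0."""
    n = len(grid)
    return [[[1 if grid[i][j] == k + 1 else 0 for k in range(n)]
             for j in range(n)] for i in range(n)]


def is_latin(x):
    """Check the cell, row, and column assignment constraints on a 0/1 array x."""
    n = len(x)
    r = range(n)
    cells = all(sum(x[i][j][k] for k in r) == 1 for i in r for j in r)
    rows = all(sum(x[i][j][k] for j in r) == 1 for i in r for k in r)
    cols = all(sum(x[i][j][k] for i in r) == 1 for j in r for k in r)
    return cells and rows and cols


good = [[1, 2, 3], [2, 3, 1], [3, 1, 2]]
bad = [[1, 2, 3], [2, 3, 1], [3, 2, 1]]  # second and third columns repeat a value

print(is_latin(to_binary(good)), is_latin(to_binary(bad)))
```

In the second grid every row is still a permutation of 1 to 3, so only the column constraints fail.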

Addition Constraints
While the binary variables defined above are useful for giving the Latin square constraints, there is no good way of using them to express the arithmetic restrictions by linear constraints. As we will see below, it is more helpful to define the following general integer variables for those constraints. Let $y_{ij}$ be the numerical value assigned to cell $(i, j)$. The possible values of $y_{ij}$ are the numbers 1 to $n$. This new set of variables can easily be used to express the summation restrictions by linear constraints. For a cage $C$ having a target number $t$ and arithmetic operation "+", the constraint is
$$\sum_{(i,j) \in C} y_{ij} = t.$$
For example, the cage consisting of cells (3,1), (3,2) and (3,3) in Figure 1 needs the constraint $y_{31} + y_{32} + y_{33} = t$, with $t$ equal to the target number of that cage.
The y-variables are tied to the x-variables: constraints (3.1) and (3.2) together ensure that $y_{ij}$ takes the value $k$ for which the corresponding variable $x_{ijk}$ is equal to 1. In particular, (3.2) writes $y_{ij}$ as $\sum_{k=1}^{n} k \, x_{ijk}$.
Note that constraint (3.2) is an example of an integer programming technique to provide that a variable takes one of the given values, 1 to n in this case.
Constraint (3.2) provides that y ij is a sum of products of integers, and thus it can take only integer values. Therefore, there is no need to require variables y ij to be integers since (3.2) will guarantee it. This observation reduces the number of integer variables in the model, thus making the solution process more efficient.
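As an illustration of how the y-variables follow from the x-variables, the Python sketch below computes each cell value as a weighted sum of the binary variables of that cell and then evaluates a sum cage. The grid, the cage, and the function name `cell_values` are hypothetical; indices are 0-based.

```python
def cell_values(x):
    """y[i][j] = sum over k of k * x[i][j][k], with values k running from 1 to n."""
    n = len(x)
    return [[sum((k + 1) * x[i][j][k] for k in range(n)) for j in range(n)]
            for i in range(n)]


# Hypothetical 3 by 3 solution and its binary encoding.
grid = [[1, 2, 3], [2, 3, 1], [3, 1, 2]]
x = [[[int(grid[i][j] == k + 1) for k in range(3)] for j in range(3)]
     for i in range(3)]

y = cell_values(x)

# Hypothetical sum cage made of the cells (0,0), (0,1), (1,0).
cage = [(0, 0), (0, 1), (1, 0)]
cage_sum = sum(y[i][j] for i, j in cage)  # 1 + 2 + 2
print(y, cage_sum)
```

Because exactly one binary variable per cell equals 1, the weighted sum is always an integer between 1 and n, which is why the y-variables need not be declared integer.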

Difference Constraints
A difference restriction is given for two adjacent cells with a target number $d$. For example, if the cells are $(i, j)$ and $(i, j + 1)$ then the restriction is that either $y_{ij} - y_{i,j+1} = d$ or $y_{i,j+1} - y_{ij} = d$. This is an example of Either-Or constraints (reference). While each of the two equalities is a linear constraint, their Either-Or combination is not. It should be replaced by an equivalent set of constraints all of which must hold. This is normally done by introducing a new binary variable and using the big M method [7]. However, that method works when each of the Either-Or constraints is an inequality. In the case of equalities, each of them should first be replaced by a pair of equivalent inequalities, and only then can the standard technique be applied. But in this case there is a simpler way of switching to linear constraints, as explained below.
Note that requiring one of the two equalities to hold is the same as requiring $|y_{ij} - y_{i,j+1}| = d$. The absolute value makes the constraint nonlinear. But it has the following equivalent linear constraint
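As an illustration only, one way to write $|y_{ij} - y_{i,j+1}| = d$ linearly is $y_{ij} - y_{i,j+1} = d(2u - 1)$ with an auxiliary binary variable $u$. Both the variable $u$ and this particular form are assumptions of the sketch and not necessarily the form intended above. Since $d$ is a constant, the right-hand side is linear in $u$. The Python check below confirms by enumeration that this form accepts exactly the pairs with absolute difference $d$, for $d \geq 1$.

```python
def diff_feasible(a, b, d):
    """True if a - b = d * (2u - 1) holds for some binary u (assumed linear form)."""
    return any(a - b == d * (2 * u - 1) for u in (0, 1))


n = 9
mismatches = [
    (a, b, d)
    for a in range(1, n + 1)
    for b in range(1, n + 1)
    for d in range(1, n)
    if (abs(a - b) == d) != diff_feasible(a, b, d)
]
print(len(mismatches))
```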

Division Constraints
A division restriction is given for two adjacent cells with a target number $r$. For example, if the cells are $(i, j)$ and $(i, j + 1)$ then the restriction is that either $y_{ij} / y_{i,j+1} = r$ or $y_{i,j+1} / y_{ij} = r$. This is another example of Either-Or constraints. But unlike the difference constraints, there is no simple way of switching it to equivalent linear constraints. Below is given the sequence of steps for achieving linearity.
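One possible sequence of steps, following the standard Either-Or recipe and not necessarily identical to the one intended here, is: (1) clear the denominators, so the restriction becomes $y_{ij} = r \, y_{i,j+1}$ or $y_{i,j+1} = r \, y_{ij}$; (2) replace each equality by a pair of inequalities; (3) relax one pair or the other with a binary variable $u$ and a big constant $M$. The Python sketch below checks this construction by enumeration. The names `u` and `M` and the choice $M = r n$ are assumptions of the sketch.

```python
def ratio_feasible(a, b, r, n):
    """True if some binary u satisfies the four big-M inequalities (assumed form)."""
    M = r * n  # large enough: |a - r*b| and |b - r*a| never exceed r*n - 1
    for u in (0, 1):
        first = -M * u <= a - r * b <= M * u                # enforced when u = 0
        second = -M * (1 - u) <= b - r * a <= M * (1 - u)   # enforced when u = 1
        if first and second:
            return True
    return False


n = 9
mismatches = [
    (a, b, r)
    for a in range(1, n + 1)
    for b in range(1, n + 1)
    for r in range(2, n + 1)
    if (a == r * b or b == r * a) != ratio_feasible(a, b, r, n)
]
print(len(mismatches))
```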

Product Constraints
Product restrictions are the hardest ones for converting to linear constraints. Simply requiring that the product of y-variables is equal to the target number makes it a nonlinear constraint. We suggest two different ways of giving linear constraints for product restrictions.
Our main method gives a non-standard way of covering all product restrictions by using prime number factorizations of target numbers. That method is covered in Subsection 6.1 and is the basis of the AMPL model given in Section 7.
The second method is more intuitive and can be used to practice standard integer programming techniques. The disadvantages of this method are: (i) auxiliary binary variables are needed to write the constraints, which makes the solution of the integer program less efficient; (ii) more importantly, while the method covers the most common situations, it is not clear how to extend it to the general case. But this method is even more intuitive and simple than Method 1 for the most common product restrictions, where a cage consists of two cells (3 out of 5 product cages in the example of Figure 1 consist of two cells). Thus, one can use a combination of Methods 1 and 2 when building the model. An idea of how Method 2 works is given in Subsection 6.2 for a special case; the rest of the discussion is in Appendix A.

Method 1 for Product Restrictions
The method first finds the prime number factorization of the target number. The puzzles in [1] are of size at most 9 × 9. For puzzles of that size, the prime numbers in the factorizations are 2, 3, 5, 7. In this section we will give constraints for each of those prime numbers for a puzzle of size 9 × 9. But the constraints can be easily generalized to any problem size and any prime number.
- Product constraint for 5: Suppose the power of 5 in the prime number factorization is $d$ (could also be 0). Then the following constraint should be added:
$$\sum_{(i,j) \in C} x_{ij5} = d.$$
The constraint ensures that the number of cells in cage $C$ getting 5 is $d$.
- Product constraint for 7: The constraint for 5 can be easily extended to 7. Suppose the power of 7 in the prime number factorization is $d$ (could also be 0). Then the following constraint should be added:
$$\sum_{(i,j) \in C} x_{ij7} = d.$$
The constraint ensures that the number of cells in cage $C$ getting 7 is $d$.
- Product constraint for 3: Suppose the power of 3 in the prime number factorization is $d$ (could also be 0). Then the following constraint should be added:
$$\sum_{(i,j) \in C} \left( x_{ij3} + x_{ij6} + 2 x_{ij9} \right) = d.$$
It is similar to the previous case except that each entry 9 in the cage contributes 2 to the total power of 3, thus the coefficient of $x_{ij9}$ is 2.
- Product constraint for 2: Suppose the power of 2 in the prime number factorization is $d$ (could also be 0). Then the following constraint should be added:
$$\sum_{(i,j) \in C} \left( x_{ij2} + 2 x_{ij4} + x_{ij6} + 3 x_{ij8} \right) = d.$$
Here each entry 4 contributes 2 and each entry 8 contributes 3 to the total power of 2. Thus the coefficient of $x_{ij4}$ is 2, and the coefficient of $x_{ij8}$ is 3.
- A complete example: Suppose the target number is 2520. Its prime number factorization is $2^3 \times 3^2 \times 5^1 \times 7^1$. Then the following set of constraints is needed:
$$\sum_{(i,j) \in C} \left( x_{ij2} + 2 x_{ij4} + x_{ij6} + 3 x_{ij8} \right) = 3,$$
$$\sum_{(i,j) \in C} \left( x_{ij3} + x_{ij6} + 2 x_{ij9} \right) = 2,$$
$$\sum_{(i,j) \in C} x_{ij5} = 1,$$
$$\sum_{(i,j) \in C} x_{ij7} = 1.$$
Note that there is no easy way to extend Method 2 (described in Subsection 6.2 and Appendix A) to this example.
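The equivalence between the product restriction and the per-prime counting constraints can be verified by enumeration for the target 2520. The Python sketch below assumes a hypothetical cage of four cells with values from 1 to 9 and ignores the Latin square constraints. The table `EXPONENT` lists, for each prime, the power it contributes for each cell value, which is the coefficient of the corresponding x-variable.

```python
from itertools import product
from math import prod

# EXPONENT[p][k] = power of the prime p in the value k, for k in 1..9.
EXPONENT = {
    2: {2: 1, 4: 2, 6: 1, 8: 3},
    3: {3: 1, 6: 1, 9: 2},
    5: {5: 1},
    7: {7: 1},
}
TARGET_POWERS = {2: 3, 3: 2, 5: 1, 7: 1}  # 2520 = 2^3 * 3^2 * 5 * 7


def satisfies_linear(values):
    """Check the four counting constraints for a tuple of cell values."""
    return all(
        sum(EXPONENT[p].get(v, 0) for v in values) == d
        for p, d in TARGET_POWERS.items()
    )


all_fills = list(product(range(1, 10), repeat=4))  # hypothetical 4-cell cage
mismatches = [v for v in all_fills if (prod(v) == 2520) != satisfies_linear(v)]
solutions = sorted({tuple(sorted(v)) for v in all_fills if satisfies_linear(v)})
print(len(mismatches), solutions)
```

Up to the order of the cells, the only fill is 5, 7, 8, 9: the values 5 and 7 are forced by their own constraints, and the remaining two cells must multiply to 72.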
General product constraint: Let $P$ be the set of prime numbers used in a puzzle of size $n \times n$. For a prime number $p \in P$ we define the following three sets, one for each power of $p$ (first, second, third) that can occur in the factorization of a cell value.
Note that there are no higher powers of $p$ in actual KenKen puzzles; but the technique can be easily generalized to higher powers too.
Let power[t, p] be the power of $p \in P$ in the prime number factorization of the target number $t$. The parameter power[t, p] is recursively computed in the parameters section of the AMPL code. Then we have the following general product constraint for $p$.
This general constraint represents the product constraints in our AMPL code.
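A recursive computation of power[t, p] can be sketched as follows. This is a Python stand-in written for illustration, not the AMPL parameter declaration; the function names `power` and `product_constraint` are assumptions. The same recursion gives the coefficient of each x-variable, since a cell value $k$ contributes the power of $p$ in $k$.

```python
def power(t, p):
    """Power of the prime p in the factorization of the positive integer t."""
    if t % p != 0:
        return 0
    return 1 + power(t // p, p)


def product_constraint(t, p, n):
    """Return ({value k: coefficient of x_ijk}, right-hand side) for prime p."""
    coeffs = {k: power(k, p) for k in range(1, n + 1) if k % p == 0}
    return coeffs, power(t, p)


for p in (2, 3, 5, 7):
    print(p, product_constraint(2520, p, 9))
```

For a 9 by 9 puzzle and target 2520 this reproduces the coefficients and right-hand sides of the four constraints in the complete example.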

Method 2 for the Case When the Cage Consists of Two Cells and There Is a Single Factorization for the Target Number
Most product restrictions in KenKen are given for two adjacent cells, and there is a single factorization for the target number. It happens when (i) the target number is a prime number, namely, 2, 3, 5, 7; (ii) the target number is composite but the size of the puzzle implies a single factorization; for example, when the target number is 4, 9, 10, 14, 15, 16, 20, 21 in puzzles of size at most 9 × 9.
Case (i). Suppose the cage consists of two adjacent cells $(i, j)$ and $(i, j + 1)$, and the target number is a prime number $p$. Then the restriction is that either cell $(i, j)$ gets value $p$ and cell $(i, j + 1)$ gets value 1, or the other way around.
Technique 1: The first way is more intuitive and relies on an auxiliary binary variable $u$. When $u = 1$, (6.2.2) and (6.2.3) together imply $x_{ijp} = 1$, $x_{i,j+1,1} = 1$, $x_{ij1} = 0$, $x_{i,j+1,p} = 0$; thus cell $(i, j)$ gets value $p$, and cell $(i, j + 1)$ gets value 1.
Technique 2: The second way is less intuitive but simpler, since it is given by just one constraint and needs no auxiliary binary variables. Constraint (6.2.4) does not work by itself but rather in combination with other constraints introduced before. Recall that constraint (2.1) ensures that each cell gets exactly one value $k$ from $1, \ldots, n$; for cells $(i, j)$ and $(i, j + 1)$ it reads $\sum_{k=1}^{n} x_{ijk} = 1$ and $\sum_{k=1}^{n} x_{i,j+1,k} = 1$. Also, constraint (2.2) ensures that each number $k$ is assigned exactly once in each row $i$; for row $i$ and numbers 1 and $p$ it reads $\sum_{j'=1}^{n} x_{ij'1} = 1$ and $\sum_{j'=1}^{n} x_{ij'p} = 1$. Together with (6.2.4), these constraints leave exactly two possibilities:
- $x_{ijp} = 1$, $x_{i,j+1,1} = 1$, $x_{ij1} = 0$, $x_{i,j+1,p} = 0$; thus cell $(i, j)$ gets value $p$, and cell $(i, j + 1)$ gets value 1.
- $x_{ijp} = 0$, $x_{i,j+1,1} = 0$, $x_{ij1} = 1$, $x_{i,j+1,p} = 1$; thus cell $(i, j)$ gets value 1, and cell $(i, j + 1)$ gets value $p$.
Case (ii). Suppose the target number is composite but the size of the puzzle implies a single factorization. For example, if the target number is 30 for a puzzle of size 9 × 9 then the only factorization is 30 = 5 × 6. The solution for this case is identical to case (i) by taking 5 and 6 instead of 1 and $p$.
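The interplay between the single constraint of Technique 2 and the Latin square constraints can be checked by enumeration. The sketch below assumes that the single constraint has the form $x_{ijf} + x_{ijg} + x_{i,j+1,f} + x_{i,j+1,g} = 2$ for the two factors $f$ and $g$. This form is an assumption made for illustration and may differ from (6.2.4). Cell values stand in for the binary variables, so each cell automatically has exactly one value, and the row constraint is modeled by requiring the two values to differ.

```python
def technique2_solutions(n, f, g):
    """Value pairs (a, b) for two cells of one row allowed by the assumed constraint."""
    solutions = []
    for a in range(1, n + 1):
        for b in range(1, n + 1):
            if a == b:
                continue  # the same number cannot appear twice in a row
            # Left-hand side: x_ijf + x_ijg + x_i,j+1,f + x_i,j+1,g
            lhs = (a == f) + (a == g) + (b == f) + (b == g)
            if lhs == 2:
                solutions.append((a, b))
    return solutions


print(technique2_solutions(9, 1, 7))  # case (i): prime target 7
print(technique2_solutions(9, 5, 6))  # case (ii): target 30 = 5 x 6
```

Each cell can contribute at most 1 to the left-hand side, so a total of 2 forces both cells to take one of the two factors, and the row constraint forces them to take different ones.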

The AMPL Model and Its Solution
In this section we give the full AMPL model for the integer program developed in previous sections. The model has comments for most of the parameters, sets, variables, constraints; more detailed explanations about how they work are given in Sections 2-6. We also give a data set for the example of Figure 1 and its solution.
The model is given in Section 7.1. The data set is in Section 7.2. A brief analysis of the solution process and efficiency follows in Section 7.3.
"product"
2 1 "difference"
3 2 "ratio"
4 13 "sum"
5 1 "difference"
6 2 "ratio"
7
It would be more efficient to solve the LP-relaxation of our integer program. But our computations show that the LP-relaxation does not always return an integer solution. It is an interesting open question if there are any techniques to achieve integrality. A possible technique could be the following.
The integer program does not need an objective function. Thus, one has the flexibility to add an appropriate objective function. It is an interesting open question whether there is an objective function that would make the optimal solution of the LP-relaxation integral.
The existence of appropriate cutting planes that would make the solution process more efficient and perhaps make the optimal solution of the LP-relaxation integral is another open question.

Appendix. Method 2 for Product Restrictions
In this appendix, we give a further discussion on Method 2 for product restrictions.

A1. The Subcase When the Cage Consists of Two Cells and There Is More Than One Factorization for the Target Number
In puzzles of size at most 9 × 9, if a product cage consists of two cells and the target number is composite then at most 2 different factorizations are possible. It is easy to verify this for all possible composite target numbers. Namely, numbers 4, 9, 10, 14, 15, 16, 20 […]
As in Subsection 6.2, there are two ways to convert the restriction to linear constraints. And again one of the methods is less intuitive but more efficient since it requires fewer constraints and auxiliary binary variables. But for the sake of comparison and completeness, we will give both methods, starting from the more intuitive one.
The "1-out-of-4 must hold" constraint is equivalent to the following set of linear constraints: where u 34 , u 43 , u 26 , u 62 are auxiliary binary variables. Constraint (A.6) implies that exactly one of the u pq variables takes value 1 while others are zero. When u pq = 1, the corresponding x ijp and x i,j+1,q variables also take value 1 while other x-variables in (A.2)-(A.5) are forced to be zero; thus cell (i, j) gets value p, and cell (i, j + 1) gets value q.
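The subscripts of the auxiliary variables suggest the target number 12, with the ordered factor pairs (3, 4), (4, 3), (2, 6), (6, 2). A set of constraints consistent with the description above is $x_{ijp} + x_{i,j+1,q} = 2 u_{pq}$ for each pair, together with $u_{34} + u_{43} + u_{26} + u_{62} = 1$. This exact form is an assumption of the sketch and may differ from (A.2)-(A.6). The Python check below enumerates all value pairs for the two cells of a 9 by 9 puzzle and keeps those for which some choice of the u-variables is feasible.

```python
PAIRS = [(3, 4), (4, 3), (2, 6), (6, 2)]  # ordered factorizations of 12


def feasible(a, b):
    """True if some one-hot choice of u satisfies x_ijp + x_i,j+1,q = 2*u_pq."""
    for chosen in PAIRS:  # exactly one u_pq equals 1
        if all((a == p) + (b == q) == 2 * ((p, q) == chosen) for p, q in PAIRS):
            return True
    return False


n = 9
solutions = [(a, b) for a in range(1, n + 1) for b in range(1, n + 1)
             if feasible(a, b)]
print(solutions)
```

A pair such as (3, 6) is rejected because the left-hand side for the pair (3, 4) equals 1, which is neither 0 nor 2.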
Method 2: The first step of this method is the same as in method 1.
where u is an auxiliary binary variable.

A2. The Subcase When the Cage Consists of K Cells in the Same Row (Column) and There Is a Single Factorization for the Target Number
The technique discussed in this section is the extension of the technique for 2-cell cages discussed in Subsection 6.2. The reason that this combination of constraints works is the same as for two-cell cages with a single factorization, as discussed in Subsection 6.2.
Here is a specific example to illustrate how constraint (A.9) works. Suppose the cells in the cage are (1, 1), (1, 2), (1, 3); the target number is 18, and thus the single factorization is 18 = 1 × 3 × 6. The constraint (A.9) in this case is
But we want constraint (A.10) to be satisfied by only one of the factorizations. Note that for the other factorizations the left-hand side of (A.10) is not necessarily 0 since the same factor $m$ could be in more than one factorization. The following set of constraints takes the above considerations into account.
Note that the left-hand side of (A.11) cannot be more than $k$ because of constraints (2.1) and (2.2). Thus, (A.12) will force the left-hand side of (A.11) to be equal to $k$ for exactly one factorization; for the other factorizations, the left-hand side of (A.11) is $\geq 0$, hence not forcing anything on its x-variables.
Here is a specific example to illustrate how constraints (A.11)-(A.12) work. Suppose the cells in the cage are