Sudoku

Un article de Le wiki de 2 noisettes - noisette.ch.

I recently implement a backtracking algorithm in order to solve sudoku. The results were quite interessting, but really inefficiency. I thus explore another approach which I would to present on this page.

An online demo of the sudoku solver using linear programmation is available.

Sommaire

Sudoku rules

  • Every cases must be filled with a number between 1 and 9.
  • Each cases of a line must have different numbers.
  • Each cases of a column must have different numbers.
  • Each of the 9 3x3 squares must only have each number once.

Linear translation

This section will describe the method used to translates the rules below into linears rules.

Don't use !=

A straight forward translations of the constrations will be to enumerate every variables of a line and to set them not equals, shown in pseudo-code :

for (x in 1..9) {
  for (y in 1..9) {
    for (t in (y+1)..9) {
      case(x,y) != case(x,t);
    }
  }
}

for (t in (y+1)..9) express the remaining cases not alreay compared with the case(x,y), to avoid duplicating the constraints.

This approach sounds really cool, but linear programmation don't permit to express inequalities, because it form a disjoinction of intervals. Another way of thinking should be adopted.

The variables

The way I choose to represent variables is one variable per cell per value, which I would name case(1,1,1) for the case at most top left, for value 1, case(1,1,2) for the same cell with value 2 and so on. It makes 81 cases times 9 values equals 729 variables.

The value of the variable will then be 0 or 1, 1 meaning that this case has this value.

As the variables shouldn't contain ( or ), an encoding should be defined. This is also usefull to parse the produced file. In my case, I choose to name the var v__n_, where the _ are replaced respectively with the line number, the column number and the value which this variable represents.

The constraints

Using the binary values, we can express the constraionts as a sum of the variables, the equality with 1 expressing the uniquness of the value. The following LP-formatted constraints express the uniquness of the number 1 in the first line.

case(1,1,1) + case(1,2,1) + case(1,3,1) + 
case(1,4,1) + case(1,5,1) + case(1,6,1) + 
case(1,7,1) + case(1,8,1) + case(1,9,1) = 1;

The same is perform for very number and every line, which makes 81 constraints for the lines. The method for the columns are exactly the same, as well for the constraints on the 3x3 cells :

case(4,1,8) + case(4,2,8) + case(4,3,8) + 
case(5,1,8) + case(5,2,8) + case(5,3,8) + 
case(6,1,8) + case(6,2,8) + case(6,3,8) = 1;

express the uniquness of the value 8 in the 4 3x3 cell (the 3 first are these of the top line).

What's left now ?

We must add more constraints in order to have full integrity of our sudoku.

The one we shouldn't forget is to express the fact that only one value per case can exists :

case(1,1,1) + case(1,1,2) + case(1,1,3) + 
case(1,1,4) + case(1,1,5) + case(1,1,6) + 
case(1,1,7) + case(1,1,8) + case(1,1,9) = 1;

laid out that only one of the 9 values is correct for the first case.

The two last points is to set every variables to integers, and to find a function which will be maximised.

Drawbacks

The main drawback of the approach is that we cannot display intermediate state of the human solving method, as it is the case with a backtracking method. I don't have any speed comparison, which also should be difficult to make impartially, because its depends t