                         SUDOKU MADE DIFFICULT

    Sudoku or 'Number Place' puzzles have made their appearence in many
British newspapers. Apparently the number place puzzle had become
popular in Japan, and the translation used two kanji: So for counting
and Doku for place, or possibly Germany.

                        WHAT IS SUDOKU
     The number place puzzle challanges the solver to fill in a nine by
nine grid with nine different symbols so that no row, column, or any of
nine 3x3 boxes contain more than one copy of the same symbol.

                        PROGRAMMING SUDOKU

    Rows columns and boxes.

    ._________________.     ._________________.     ._________________.
    |1 2 3 4 5 6 7 8 9|     |1 1 1 1 1 1 1 1 1|     |1 1 1 2 2 2 3 3 3|
    |1 2 3 4 5 6 7 8 9|     |2 2 2 2 2 2 2 2 2|     |1 1 1 2 2 2 3 3 3|
    |1 2 3 4 5 6 7 8 9|     |3 3 3 3 3 3 3 3 3|     |1 1 1 2 2 2 3 3 3|
    |1 2 3 4 5 6 7 8 9|     |4 4 4 4 4 4 4 4 4|     |4 4 4 5 5 5 6 6 6|
    |1 2 3 4 5 6 7 8 9|     |5 5 5 5 5 5 5 5 5|     |4 4 4 5 5 5 6 6 6|
    |1 2 3 4 5 6 7 8 9|     |6 6 6 6 6 6 6 6 6|     |4 4 4 5 5 5 6 6 6|
    |1 2 3 4 5 6 7 8 9|     |7 7 7 7 7 7 7 7 7|     |7 7 7 8 8 8 9 9 9|
    |1 2 3 4 5 6 7 8 9|     |8 8 8 8 8 8 8 8 8|     |7 7 7 8 8 8 9 9 9|
    |1 2 3 4 5 6 7 8 9|     |9 9 9 9 9 9 9 9 9|     |7 7 7 8 8 8 9 9 9|
    ._________________.     ._________________.     ._________________.

     The three boxes in the table show cells numbered by row, column and
box. Each of the rows, columns and boxes must contain nine different
symbols. A square which satisfies the first two conditions is known as
a "Latin Square". Some newspaper articles on Sudoku mention this fact.
A Latin rectangle is a like a Latin square but it misses a few rows, or
columns. There is a classical problem for the 2 by N Latin rectangle:
the enumeration of 2 by N latin rectangles. This is also called the hat
check problem. Guests arrive at a night club and give their hats to the
cloakroom attendant. The servant gets drunk, and when the guests come
out the servant hands back the hats at random. What is the probability
that no one gets their own hat ?  The answer is about 1/e where
e=2.71828 .. etc. The number e is sometimes known as the Euler number,
although its value was known over one hundred years before Euler. Some
newspaper articles about Sudoku have mentioned Euler. This is
remarkable new ground for the popular press and as usual the details
are scanty. Euler's best known connection with Latin squares comes from
the officers and regiments problem. An army has six ranks and six
regiments. The idea is to arrange 36 officers in a six by six square so
that no two ranks or regiments are in line. In fact this is a problem
of constructing two six by six latin squares so that the each
combination of two elements from corresponding cells fills in all of
the 36 possibilities (1,1),(1,2) to (6,6). There seems no real
connection with Sudoku.

    There are some simple observations about Sudoku.

    1. Each symbol is repeated  9 times.
    2. If each symbol has a numeric value, then the sums of elements
       in rows, columns and boxes is equal to a constant.

    3. The difficulty of a problem is related to the number of covered
       cells. If this number is less than 27, then it should be a
       simple problem to fill in the values using standard theories
       of linear equations.

    4. A problem may be solved either by logic or by brute force.
       Brute force calculation will work on most home computers.
       Some problems may take several hours.

    5. There is much published literature on linear equations.
       This literature does not get much in the popular press.

    6. The boxes may be built up from two orthogonal 3x3 latin
       squares.

    7. A sudoku position may be rotated and reflected and still
       remain valid. It is also possible to permute the symbols.

    8.  Rows and columns may be swapped as long as they remain
        in the same section. It is also possible to swap rows
        and columns in blocks of three.

    9.  2x2 and 3x3 Latin squares are easy to enumerate.
        In the 3x3 case there are only two real choices.

                    S     T    S+3*T
                  0 2 1| 0 1 2| 0 5 7
                  1 0 2| 1 2 0| 4 6 2
                  2 1 0| 2 0 1| 8 1 3

     Two orthogonal 3x3 latin squares can be used to make a sudoku
     block, by addition, multiplication.

    10. There are some very simple latin squares of order 9. These
    are just generated by writing a line consisting of each symbol, and
    then by following with lines where the next line is the previous
    line rotated one or more places. These squares are useless for
    sudoku.

                        GENERATING LATIN SQUARES

    When N is a prime, or prime power it is easy to generate a series of
N-1 latin squares by formulae. The symbols represent elements of
GF(p^n), or the Galois field of p^n elements. A field is a set with a
multiplication and addition law, and for each non zero element x of the
field there is an inverse element x' with x x' = 1. 1 is its own
inverse element. There is also a zero element.

    The addition and multiplication laws satisfy
    x+(y+z)=(x+y)+z, x(yz)=(xy)z        associative
    x+y = y+x, xy=yx                    commutative
    x(y+z) = xy+xz                      distributive

    The two elements 0 and 1 are charcterised by the rules:-
           0+x = x+0 = x   for all x
           1 x = x 1 = x   for all x.


    For each none zero element k of the field it is possible to generate
a latin square A[k] where element x,y of Ak is given by the formula
Ak[x,y] = x+ky   (or Ak[x,y] = y+kx).


                   THE FIELD OF ORDER NINE
    Since 9=3^2 is a prime power it is easy to generate a few simple
Latin squares of order 9.

     If Z3 is the set 0 1 2, it is possible to form the field of order 9
by adjoining a new element i with i^2+1=0, or the square root of minus
one. The elements of the field may be written down as numbers 0 to 8
using the following scheme.
     0     1     2     3     4     5     6     7     8
     0     1     2     i     1+i   2+i   2i    1+2i  2+2i

    The set K = GF(3^2)  defined here is a field. This field is can be
treated similar to ordinary numbers: It is possible to define lines,
circles and other algebraic curves by considering the zeros of
algebraic formulae. A latin square is a covering with nine disjoint
lines.  Addition and multiplication tables are given.

          Addition table            Multiplication
        .__________________.      .__________________.
        | 0 1 2 3 4 5 6 7 8|      | 0 0 0 0 0 0 0 0 0|
        | 1 2 0 4 5 3 7 8 6|      | 0 1 2 3 4 5 6 7 8|
        | 2 0 1 5 3 4 8 6 7|      | 0 2 1 6 8 7 3 5 4|
        | 3 4 5 6 7 8 0 1 2|      | 0 3 6 2 5 8 1 4 7|
        | 4 5 3 7 8 6 1 2 0|      | 0 4 8 5 6 1 7 2 3|
        | 5 3 4 8 6 7 2 0 1|      | 0 5 7 8 1 3 4 6 2|
        | 6 7 8 0 1 2 3 4 5|      | 0 6 3 1 7 4 2 8 5|
        | 7 8 6 1 2 0 4 5 3|      | 0 7 5 4 2 6 8 3 1|
        | 8 6 7 2 0 1 5 3 4|      | 0 8 4 7 3 2 5 1 6|
        .__________________.      .__________________.

                        GENERATING SUDOKU
    There are many algorithms for generating latin squares. Some latin
squares satisfy sudoku constraints. It is quite possible to generate
latin squares at random, and then filter these to select one which
satisfies the sudoku constraints. The problem is waiting time. It is
still possible to extend a latin square algorithm to generate sudoku.
Consider the 9x9 0 1 matrices made up of the positions of ones, twos,
threes, .. in the sudoku matrix. Each one of these matrices look like a
method of placing 9 kings a 9x9 Chinese chess board. In Chinese chess
each player has a king. Each king is confined to a 3 by 3 box, or
office. It is forbidden to have both kings to occupy the same file
unless there are intervening pieces. If there are nine kings then we
can also impose the constraint that two kings should not share the same
rank. Finding such configurations is very similar to the N-queens
problem where N non attacking queens are placed on an NxN chessboard.
Any such 0 1 matrix can be represented by a permutation. Just take the
position of the 1 in column 1, then the position of 1 in column two and
so on to get a mapping p:N9->N9 where N9 is any set of nine symbols. If
q is a second permutation for another set of Chinese kings then the
condition that no square is occupied twice is the same as the condition
the the rectangle with rows p and q is a latin rectangle. In practice
it is easy to select a configuration of rook generals and it is not too
tedious to generate a latin rectangle with six rows (out of nine).
Convert each line p[i] of this rectangle to a 0 1 matrix Xi by setting
Xi[j, p[j]]=i for j from 1 to 9. The superimposed matrices X1, X2, X3,
.. Xn then give an incomplete sudoku matrix with all the values 1 to n
filled in. If n is greater than six it may be impossible to add any
further rows at all so the algorithm goes on for brute force sudoku
completion with a cut off point where any position requiring more than
a few hundred steps is considered impossible, and a further latin
rectangle is constructed.

        Steps in construction of Sudoku matrix.

        ..1......|....2....|3........||371628954
        .......1.|........2|......3..||568749312
        ....1....|.2.......|.....3...||924513687
        ...1.....|.....2...|..3......||483162795
        ......1..|2........|.......3.||259874136
        .1.......|......2..|....3....||617935248
        1........|...2.....|........3||195287463
        .....1...|.......2.|...3.....||746351829
        ........1|..2......|.3.......||832496571

    Numbers 1 to 6 are filled in by constructing configurations of
chinese chess kings which have no cells in common. Numbers 7,8 and
9 are filled in by solving a sudoku problem.

    The chance of getting a permutation to satisfy sudoku constraints
is about 1 in 6.4. There are 9! (= 362880) different permutations
on 9 elements and of these 56656 satisfy sudoku constraints. That is
to say, there are 56656 ways of putting 9 chinese chess kings on a
9x9 chess board. The number was calculated by modifying the program
to count the ways of placing 9 queens on 9x9 board.

                           EXAMPLE
    This puzzle appeared on page 9 of  the Guardian G2 section on
13/05/05. It is the hardest puzzle that I tried, with the help of the
computer. The solution used a brute force method, and the program
examined nearly two million positions before completing the grid.

            .__.__.__.__.__.__.__.__.__.
            |  |  | 8|  |  | 2|  |  |  |
            .__.__.__.__.__.__.__.__.__.
            |  |  |  |  | 9|  |  | 4|  |
            .__.__.__.__.__.__.__.__.__.
            | 3|  |  | 1|  |  |  |  | 7|
            .__.__.__.__.__.__.__.__.__.
            |  | 6|  | 3|  |  |  |  |  |
            .__.__.__.__.__.__.__.__.__.
            |  |  | 9|  |  |  | 8|  |  |
            .__.__.__.__.__.__.__.__.__.
            |  |  |  |  |  | 7|  | 2|  |
            .__.__.__.__.__.__.__.__.__.
            | 4|  |  |  |  | 6|  |  | 5|
            .__.__.__.__.__.__.__.__.__.
            |  | 2|  |  | 4|  |  |  |  |
            .__.__.__.__.__.__.__.__.__.
            |  |  |  | 8|  |  | 1|  |  |
            .__.__.__.__.__.__.__.__.__.

             The complete solution.

            .__.__.__.__.__.__.__.__.__.
            | 7| 9| 8| 4| 3| 2| 5| 1| 6|
            .__.__.__.__.__.__.__.__.__.
            | 5| 1| 6| 7| 9| 8| 3| 4| 2|
            .__.__.__.__.__.__.__.__.__.
            | 3| 4| 2| 1| 6| 5| 9| 8| 7|
            .__.__.__.__.__.__.__.__.__.
            | 2| 6| 4| 3| 8| 9| 7| 5| 1|
            .__.__.__.__.__.__.__.__.__.
            | 1| 7| 9| 2| 5| 4| 8| 6| 3|
            .__.__.__.__.__.__.__.__.__.
            | 8| 3| 5| 6| 1| 7| 4| 2| 9|
            .__.__.__.__.__.__.__.__.__.
            | 4| 8| 1| 9| 7| 6| 2| 3| 5|
            .__.__.__.__.__.__.__.__.__.
            | 9| 2| 3| 5| 4| 1| 6| 7| 8|
            .__.__.__.__.__.__.__.__.__.
            | 6| 5| 7| 8| 2| 3| 1| 9| 4|
            .__.__.__.__.__.__.__.__.__.


      This problem was worked out while I was sleeping. I had drunk a
most of a bottle of cheap Spanish white wine, bought from Hallam Booze.
I had tweaked the SU_BRUTE function to shuffle branches of the move
tree. Otherwise SU_BRUTE would spend ages checking out useless
possibilities. A skilled human could easily outperform the program for
harder Sudoku puzzles.

                        THE PROGRAMS

     It is possible to use the programs to solve Sudoku on both Windows
and Linux systems. The programming language is based on APL, one of the
first ever object oriented languages, invented in the 1950s by Kenneth
Iverson. A Sudoku position is represented by a 9x9 character matrix
consisting of dots for hidden cells and numbers for open cells.

    These functions may be used in aliases so that the options are
passed as in unix/linux shell scripts. A sudoku position may appear on
the command line as a sequence of 81 dots, or digits 1-9.

    Z<-SUGO H           solve very easy problem.

    Z<-SU_TAHLIA H      tries all single guesses.

    Z<-SU_BRUTE "-depth n -shuffle -v ",,H
    Brute force guessing employing recursion.
    Options
    -cnt   number       Maximum number of positions to examine
    -depth number       Search depth.
    -v                  Save transcript
    -shuffle            Randomize the guessing

    Z<-SU_BUILD
    Make a 9x9 sudoku matrix.

    To run the programs you need sudoku data. A whole batch of sudoku
problems may be entered in an ascii text file. Each sudoku has a name.
This is entered in the file as a bunch of non blank characters starting
with a "#". The next nine lines comprise the soduku: nine lines
consisting of digits or dots (periods) for unknowns.

    The sudoku is stored as a 9x9 matrix in memory by the command
    H<-GET_SUDOKU "file#name"

    The filename string is parsed for the '#' character. Symbols before
this make up the file name, and symbols after this represent the
sudoku. The syntax is similar to that for writing a URL string
(Universal Resource Locator) as a file name and then a link within the
file.

    To make a problem use the function 
    H<- MK_PUZZLE"-open num -seed num"

    The options should be numbers indicating the number of open
or displayed cells, and a random seed. The puzzles are not beautiful,
but a later version will use elliptic curves on GF(3^2) to generate
the dots or blanks.

     MK_PUZZLE
 .__.__.__.__.__.__.__.__.__.
 | 7|  | 5|  | 8|  | 1|  | 6|
 .__.__.__.__.__.__.__.__.__.
 | 2| 8|  |  | 3|  | 4| 7|  |
 .__.__.__.__.__.__.__.__.__.
 | 1| 4| 6|  |  |  |  |  |  |
 .__.__.__.__.__.__.__.__.__.
 | 5|  | 8| 3|  |  |  |  |  |
 .__.__.__.__.__.__.__.__.__.
 | 6| 9|  | 5|  |  |  |  |  |
 .__.__.__.__.__.__.__.__.__.
 |  | 1|  |  |  |  | 2|  |  |
 .__.__.__.__.__.__.__.__.__.
 |  |  |  | 9|  |  | 5|  | 4|
 .__.__.__.__.__.__.__.__.__.
 | 9|  | 2|  |  |  |  |  |  |
 .__.__.__.__.__.__.__.__.__.
 | 4| 5|  |  | 1|  |  | 9|  |
 .__.__.__.__.__.__.__.__.__.

                MORE TOUGH PROBLEMS FOR MONEY

    RSA laboratories [1] are offering up to $625000 for solving some simply
stated numerical problems. You simply have to work out p and q
in exprssions such as pq=N where N is a number typically with several
hundred digits and p and q are factors of that number. When you download
the Sudoku solver package you can also try and complete the RSA chhallenge.
Source code is included for an elliptic curve factorisation method
which is supposed to be OK with a following wind and some good luck.

 [1] http://www.rsasecurity.com/rsalabs/challenges/
     factoring/challengenumbers.txt

    <-(C) Tony Goddard    http://d4maths.lowtech.org/sudoku.htm
    copyleft






