Sunday, July 23, 2023

Sudoku Triathlon

Way back in July 2015 I wrote a sudoku solver in Python on the Raspberry Pi. Nowadays I’m into Rust, so I decided to write a rusty code. There are a couple of sudoku solving techniques that you can use, but oddly the way humans solve sudoku puzzles tends to be counter-productive to implement in a computer program, as I found out.

As an additional challenge, let’s solve a sudoku triathlon, which is a puzzle where three sudoku’s have been merged, and certain blocks overlap. Just to be clear: it’s not too difficult to solve this puzzle by hand. The challenge is to solve it by code, and how to do it in Rust — that is the real puzzle.

Representation

The first problem to tackle is how to represent this thing in memory. A regular sudoku puzzle is easily coded as a 9×9 array, but here we have three of them. Moreover, certain blocks are overlapping. While you can code it as three separate sudoku’s, I really wanted it to automatically update the cells of the overlapping blocks. Possibly this could be implemented with some kind of notification system, but if I were to program this thing in C++ I would just use pointers to the start of the 9×9 sudoku puzzle in memory. When I tried doing that in Rust I ran into fights with the borrow checker because of multiple shares of mutable memory. In the end, the puzzle was implemented not as 9×9 arrays, but as a full 15×21 array. Empty cells are marked as zero, and invalid cells are marked as 255. I guess you could make it super Rust-y and use Option<SudokuCell>, but I went with u8 for simplicity.

#[derive(Clone, Debug, Default)]
struct Triathlon {
    dim: usize,
    block_dim: usize,
    width: usize,
    height: usize,
    cells: Vec<u8>,
}

In this structure the dim dimension is 9, the block dimension is 3, width is 15, and height is 21. There are some methods:

// Returns cell at (x,y)
// x,y are "Triathlon" coordinates
// that encompass the entire puzzle
fn get(&self, x: usize, y: usize) -> u8

// Write value to cell at (x,y)
// x,y are "Triathlon" coordinates
// that encompass the entire puzzle
fn put(&mut self, x: usize, y: usize, n: u8)

// Print puzzle to screen
fn show(&self)

// Returns true if puzzle is solved
// ie. when there are no empty cells left
fn is_solved(&self) -> bool

// Returns true if puzzle is valid
// meaning each row|column|block contains value 1..=dim
// exactly only once, or else an unknown value (zero)
fn is_valid(&self) -> bool

Even though we defined the puzzle as a 15×21 grid, many times it’s still convenient to approach it as being three 9×9 sudoku puzzles. For example, is_valid() checks whether the rows, columns, and blocks of each individual 9×9 sudoku are valid. In order to do that, we look at the puzzle through a SudokuView, which is basically a pointer in disguise:

// SudokuView are only offsets;
// to a 9x9 view into the Triathlon.cells
#[derive(Clone, Copy, Debug, Default)]
struct SudokuView {
    x_offset: usize,
    y_offset: usize,
}

There are no other members here; we can use dim and block_dim from the puzzle structure.

And for obtaining such a view we have a method:

// Returns SudokuView for given "Triathlon" coordinates
fn get_view_xy(&self, x: usize, y: usize) -> SudokuView

While it’s certainly possible (and seems logical) to add methods get(x,y) and put(x,y,n) to a view, my final implementation doesn’t need them.

Implementing the Solver

In the first code that I wrote, I just dumped everything into one big struct Sudoku and, well, your mileage may vary. Such a coding style leads to borrowck problems in Rust. It is better and cleaner code to break structs up into smaller separate structs. And therefore the Solver is its own thing, dealing with a Triathlon puzzle. But before showing what it looks like, let’s examine some strategies for solving sudoku puzzles.

Solving method #1 : masking

Masking means covering up, temporarily painting cells black so that you don’t see them. Suppose there is a number 4 in the first row, there is a number 4 in the second row, then by masking out all rows and columns (and blocks) where the number 4 occurs, maybe we can reveal a single spot where the number 4 should go. So, if masking out a number turns up only a single empty cell, then the number must be the solution to that cell.

Masking is a very ‘human’ way of looking at sudoku puzzles, and it only solves the easy ones. At harder levels masking alone is not sufficient to solve the puzzle.

Solving method #2 : matching possibility sets

Suppose you have row in which there are only two empty cells left. For both cells, you know that they must contain either 2 or 7. These are then a matching set of [2,7]. This also implies that there can be no other cells in that row containing 2 or 7. This rule also holds true for other dimensions; if there is an entirely empty row, but you know that there are three cells containing the same possibility set [1,3,8], then you know that the other six cells do not contain the numbers 1, 3, 8. You can use this knowledge to narrow down the possibility set for an empty cell to finding the solution for that cell.

Describing this in a more general way:

If a row, column, or block contains N identical possibility sets with length N, then the other cells (and their respective possibility sets) in that row, column, or block can not contain any of the values present in the given set.

Looking at matching sets is a very human thing to do, although it gets tough with larger sets. On top of that, this method does not solve the hardest sudoku puzzles.

Solving method #3 : semi-brute force with backtracking

A computerized way of solving sudoku is brute forcing it: simply trying all possible solutions. Rather than stupidly generating number 1 to 9 in each and every cell, look at what possibilities there are, and see if they work. When the puzzle becomes invalid (we hit a roadblock), go back to when it was valid, and try the next possible solution. The process of going back is called backtracking.

Backtracking works with a stack; you push the entire ‘state’ onto the stack, and pop from the stack when going back. You are forgiven for thinking “recursive function” at this point; we’ll just use a Vec as stack and implement a non-recursive flat loop.

This method always works and cracks even the most evil sudoku puzzles. The process can be speeded up by combining with method #2 (matching sets), but I didn’t bother.

Writing the Solver code

As said, the solver is a stack of ‘state’:

struct Solver {
    stack: Vec<SolveState>,
}

The big question is what this state looks like. The state holds all information about the puzzle and where the solver is at during the process of solving it. In other words, it holds a copy of the puzzle in its half-finished form, it holds possibility sets for each cell, and it holds some kind of iterator denoting where we left off. The SolveState is a frozen form of the memory in-use by the Solver.

#[derive(Clone, Debug, Default)]
struct SolveState {
    puzzle: Triathlon,            // clone of entire puzzle
    solutions: Vec<Vec<u8>>,      // possible solutions per cell
    index: (usize, usize, usize), // (x,y,i) of chosen possible solution
}

When we make a new SolveState from a Triathlon puzzle, it determines the possible solutions for each empty cell. The index are literally the loop counters of the inner solver loops:

// pseudo code

solver:
while y < height {
    while x < width {
        while i < solutions_at(x, y).len() {
            state.index = (x, y, i)
            push(state)

            try_solution(x, y, solutions[i])

            if solved() {
                return Ok()
            }

            if is_invalid() {
                state = pop()
                (x, y, i) = state.index
                continue solver
            }
            break
        }
        i = 0

        if get(x, y) == 0 {
            // invalid; unable to solve cell
            state = pop()
            (x, y, i) = state.index
            continue solver
        }
        x += 1
    }
    x = 0
    y += 1
}

This is a generic backtracking solver algorithm that will work for any kind of puzzle that involves finding a solution in a sea of possible combinations.

As proof of work, here is the solution to the sudoku triathlon pictured above:

5  2  1  |  9  7  4  |  3  8  6
3  9  8  |  1  5  6  |  7  2  4
6  7  4  |  2  8  3  |  1  5  9
---------------------------------
7  1  9  |  3  4  5  |  2  6  8
2  5  6  |  7  1  8  |  4  9  3
8  4  3  |  6  2  9  |  5  1  7
---------------------------------
1  8  2  |  4  6  7  |  9  3  5  |  8  1  2
9  6  7  |  5  3  1  |  8  4  2  |  7  6  9
4  3  5  |  8  9  2  |  6  7  1  |  5  3  4
---------------------------------------------
            7  5  3  |  2  1  6  |  9  4  8
            6  4  8  |  3  9  7  |  1  2  5
            1  2  9  |  5  8  4  |  3  7  6
            ---------------------------------
            9  7  4  |  1  6  8  |  2  5  3  |  4  9  7
            3  1  5  |  4  2  9  |  6  8  7  |  3  5  1
            2  8  6  |  7  5  3  |  4  9  1  |  6  8  2
            ---------------------------------------------
                        3  8  4  |  5  2  9  |  7  1  6
                        9  1  5  |  7  3  6  |  8  2  4
                        6  7  2  |  1  4  8  |  5  3  9
                        ---------------------------------
                        8  9  1  |  3  6  4  |  2  7  5
                        2  4  7  |  8  1  5  |  9  6  3
                        5  3  6  |  9  7  2  |  1  4  8

This thing would look a lot nicer in a GUI, but hey.

The code can be adapted to solve sudoku samurai, which is a larger puzzle consisting of five overlapping sudoku grids.

My sister was so kind to mail me a sudoku from a local newspaper, which was a combined jigsaw (chaos) sudoku and a partly overlapping diagonal sudoku. Okay … I need some more time to implement a solver for that one.