Dynamic programming is a method for solving multi-stage decision problems, applicable when subsets of a problem can be combined to contribute to the ultimate solution.

Such a structure arises most naturally when the stages are moments in time, and we are trying to solve a planning problem. That is, we want to know what actions to take to get from point A to point B.

Of course, nobody wants to waste time, so we want to know the fastest way to get there (in general, pick your favorite measure of optimality). If the fastest path from A to B passes through some point C, then that path is composed of the fastest path from A to C followed by the fastest path from C to B (the principle of optimality).

This notion is formalized in the Bellman equation, and forms the basis of dynamic programming, which can be considered a precursor to the modern study of reinforcement learning.
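For a shortest-path problem with unit cost per move – the setting we use below – the Bellman equation takes a particularly simple form, where \(V(s)\) is the length of the shortest path from state \(s\) to the goal:

```latex
V(\text{goal}) = 0, \qquad
V(s) = \min_{s' \in \text{moves}(s)} \bigl[ 1 + V(s') \bigr]
```

The general form replaces the constant 1 with a cost that may depend on the state and the action taken.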

Although developed in the field of control, dynamic programming is now better known to computer science students as an algorithmic technique. Typically it’s applied by solving subparts of a problem, storing those results in a cache, and consulting the cache when possible. We’ll apply this technique to find the longest shortest path a knight can take between two squares on a standard chess board. The longest shortest path is a challenge because it requires computing the shortest path between every pair of squares and then taking the longest of those. We will see how to do this the obvious (and slow) way, then observe that we can apply dynamic programming – with impressive effect.
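The technique in miniature – the classic Fibonacci example, where caching collapses an exponential recursion into a linear one (a standalone sketch, separate from the knight program below):

```python
from functools import lru_cache

# Naive recursion: re-solves the same subproblems exponentially often.
def fib(n):
    return n if n < 2 else fib(n - 1) + fib(n - 2)

# The same recursion with a cache: each subproblem is solved once.
@lru_cache(maxsize=None)
def fib_cached(n):
    return n if n < 2 else fib_cached(n - 1) + fib_cached(n - 2)

assert fib(20) == fib_cached(20) == 6765
```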

Solving the knight walk

We are interested in the longest shortest path between squares \(A\) and \(B\) – simply computing the longest path is ill-defined because the knight can always repeat moves along the way, making its length unbounded.

First we’ll need a way to represent the chess board and calculate the available moves from a particular square. For move generation we’ll adapt a standard chess-programming approach – representing the board as a flat array – because it is simple and fast.

Recall the knight moves in an L-shape – two squares in one cardinal direction, then one square in a perpendicular direction. Hence a knight near the middle of the board has eight available moves, which can be computed by simple arithmetic on a flat-array representation of the board. Assume the array is indexed from the lower-left corner, and – for reasons that will become clear shortly – that each row of the array is twelve squares wide rather than eight.

Thus each knight move can be calculated as a fixed offset: for instance, two squares up and one to the right is the current index plus twenty-four (two rows of twelve) plus one (one column). The complete list of offsets is

moves = [
    -24 - 1,
    -24 + 1,
    -12 - 2,
    -12 + 2,
    12 + 2,
    12 - 2,
    24 + 1,
    24 - 1,
]

But near the edge of the board the knight’s choices are restricted, and we have to avoid letting it jump off. If we used a plain 64-square array with no padding, a move from a square near the edge could wrap around to the other side of the board, or index outside the array entirely!

Thus we’ll extend the usual 64-square chess board (8x8) to have two layers of padding on each side, resulting in a 144-square board (12x12). If we try to make a move and the knight lands in the padding zone, we know it’s an illegal move.

def sq144to64(sq):
    """Convert a padded 12x12 board index to a 1-based 8x8 square number."""
    rank = (sq // 12) - 2
    file = (sq % 12) - 2

    return rank * 8 + file + 1


def sq64to144(sq):
    """Convert a 1-based 8x8 square number to a padded 12x12 board index."""
    rank = (sq - 1) // 8
    file = (sq - 1) % 8

    return 24 + 2 + (12 * rank) + file
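Off-by-one errors are easy to make here, so it’s worth checking that the two mappings are mutual inverses (the functions are repeated so the snippet stands alone):

```python
def sq64to144(sq):
    # 1-based 8x8 square number -> padded 12x12 index.
    rank, file = (sq - 1) // 8, (sq - 1) % 8
    return 24 + 2 + 12 * rank + file

def sq144to64(sq):
    # Padded 12x12 index -> 1-based 8x8 square number.
    rank, file = (sq // 12) - 2, (sq % 12) - 2
    return rank * 8 + file + 1

# Every real square should survive the round trip.
assert all(sq144to64(sq64to144(i)) == i for i in range(1, 65))
```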


class Move:
    def __init__(self, square) -> None:
        self.square = square
        self.parent = None  # The move that led to this one, if any.

class Board:
    def __init__(self) -> None:
        # Padding squares hold 0; playable squares hold their 1-based number.
        self.board = [0 for _ in range(144)]
        for i in range(1, 65):
            sq = sq64to144(i)
            self.board[sq] = i

    def moves(self, from_sq: Move):
        pseudo_legal = [from_sq.square + mv for mv in moves]
        # Keep only moves that land on a playable (non-padding) square.
        legal = [sq for sq in pseudo_legal if self.board[sq] != 0]
        return [Move(sq) for sq in legal]

Notice the Move class represents a node in a tree – the parent member refers to the preceding move. We’ll build a tree of moves, and once we have the move that lands on the desired square, we’ll use parent to walk back up that branch and recover the moves that got us there.

We’ll build the tree using breadth-first search, which grows the tree outward from the start position. First we add the squares reachable from the start to a queue. Then we take the first square in the queue and check where we can go from there; those squares go to the back of the queue, behind the ones we expanded from the start position. This process is repeated by taking a new node from the queue. In this way, we search all squares which take \(n\) moves to reach before searching those which take \(n+1\) moves to reach.[1]

from queue import SimpleQueue

def bfs(board: Board, from_sq: Move, to_sq: Move) -> Move:
    # Note: there is no visited set, so squares can be enqueued repeatedly.
    # Nodes are still popped in order of depth, so the first time we pop the
    # goal we have a shortest path - it's just wasteful.
    q = SimpleQueue()
    q.put(from_sq)

    while not q.empty():
        v = q.get()

        # Check if this is our goal.
        if v.square == to_sq.square:
            return v

        # Expand current node.
        for mv in board.moves(v):
            mv.parent = v
            q.put(mv)

    return None

Now we can calculate our plan for getting from \(A\) to \(B\) by carrying out this search and unwinding the resulting branch.

def knight_walk(start: int, end: int):
    board = Board()

    f = Move(sq64to144(start))
    t = Move(sq64to144(end))
    res = bfs(board, f, t)

    # Walk back up the branch from the goal to the start.
    moves = list()
    while res is not None:
        moves.append(res)
        res = res.parent
    moves.reverse()

    return [sq144to64(mv.square) for mv in moves]

This enables the computation of individual paths, which you can play around with here.

Now we can tackle computing the longest of these paths. One obvious way is to use the function above to compute the path between every pair of squares and keep the longest.

def longest_walk():
    longest = list()
    for from_sq in range(1, 65):
        for to_sq in range(1, 65):
            moves = knight_walk(from_sq, to_sq)
            if len(moves) > len(longest):
                longest = moves

    return longest

In the knights demo, this is what the “Calculate longest walk” button computes; it usually takes about six seconds (on my underpowered server). And this is only a 64-square board. We want to do better!

Introducing dynamic programming

As mentioned before, dynamic programming consists of using sub-solutions to solve a bigger problem, based on the principle of optimality. Notice that a given knight walk can be broken down into smaller knight walks. If the optimal walk between \(A\) and \(B\) passes through \(C\), then we can break the problem down into computing the optimal walks between \(A\) and \(C\) and between \(C\) and \(B\).
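In symbols, writing \(d(X, Y)\) for the length of a shortest walk from \(X\) to \(Y\), a square \(C\) lying on some shortest walk from \(A\) to \(B\) satisfies

```latex
d(A, B) = d(A, C) + d(C, B)
```

while for an arbitrary \(C\) the right-hand side is only an upper bound, by the triangle inequality.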

Of course, we don’t know \(C\) beforehand. But as the longest_walk function iterates through from_sq and to_sq, we’re calculating optimal walks, which could serve as optimal subwalks in later iterations. The problem is that we’re letting them go to waste!

Let’s use the standard dynamic programming technique of introducing a cache so we can re-use these optimal subwalks between iterations. The cache maps a pair of squares to a Move, if that subwalk has already been computed. Thus our most space-efficient option is a two-dimensional array initialized with Nones (since sq144to64 numbers squares from one, we subtract one when indexing).[2]

def longest_walk(use_cache=False):
    cache = [[None for _ in range(64)] for _ in range(64)] if use_cache else None
    longest = list()
    for from_sq in range(1, 65):
        for to_sq in range(1, 65):
            moves = knight_walk(from_sq, to_sq, cache=cache)
            if len(moves) > len(longest):
                longest = moves
    return longest

def knight_walk(start: int, end: int, cache: list = None):
    # ...
    res = bfs(board, f, t, cache)
    # ...

def bfs(board: Board, from_sq: Move, to_sq: Move, cache: list = None) -> Move:
    q = SimpleQueue()
    q.put(from_sq)
    while not q.empty():
        v = q.get()

        # Check if this is our goal.
        if v.square == to_sq.square:
            if cache is not None:
                cache[sq144to64(from_sq.square) - 1][sq144to64(to_sq.square) - 1] = v
            return v

        # Check whether we have already computed a walk from here.
        if cache is not None and (hit := cache[sq144to64(v.square) - 1][sq144to64(to_sq.square) - 1]):
            # Splice the cached subwalk onto the path that led us to v,
            # copying its squares so we never mutate the cached branch.
            squares = []
            while hit is not None:
                squares.append(hit.square)
                hit = hit.parent
            node = v
            for sq in reversed(squares[:-1]):  # drop the root; it is v.square itself
                child = Move(sq)
                child.parent = node
                node = child
            return node

        # Expand current node.
        for mv in board.moves(v):
            mv.parent = v
            q.put(mv)

    return None

Now that we’re using the cache, the calculation takes less than a second. That’s an impressive speedup for a few lines of code! The power of the cache is that we avoid re-computing results we’ve already found in previous searches.

Conclusion

While far from a comprehensive introduction, hopefully this example makes clear the essence of dynamic programming: breaking a problem down into subproblems which are easier to compute, or in this case, can be saved and re-used.

Thanks

Caden Kroonenberg for useful feedback.

Footnotes

  1. In contrast, a chess-playing program uses depth-first search, which performs the same process as breadth-first search but with a stack instead of a queue. Hence it explores the first branch of the tree to its maximum depth before even looking at the second square reachable from the start position. This works in chess because a reasonable stopping depth can be defined that still allows a useful evaluation of the branch; in certain cases that evaluation can be used to show that searching later branches is pointless, allowing them to be pruned. See alpha-beta search.

  2. I also tried a dictionary keyed by Pairs of squares, where Pair was a class with an appropriate hash function. Switching to the array-based cache gave around a ten percent speedup.
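     For the curious, a minimal sketch of what that dictionary-based cache might have looked like (the Pair class here is illustrative, not the original code):

```python
# A hashable key pairing two squares, so a dict can map
# (from_square, to_square) -> cached Move.
class Pair:
    def __init__(self, a: int, b: int) -> None:
        self.a, self.b = a, b

    def __eq__(self, other) -> bool:
        return (self.a, self.b) == (other.a, other.b)

    def __hash__(self) -> int:
        return hash((self.a, self.b))

cache = {}
cache[Pair(1, 64)] = "goal Move for a1 -> h8"
assert cache[Pair(1, 64)] == "goal Move for a1 -> h8"
```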