The tactical exercises in our new Tactics Frenzy app are collected from a variety of sources, both inside and outside our core team. Some of them are hand-picked by humans, others are generated by computer programs. This blog post is the first in a series of two posts describing how we use the Stockfish chess engine and the Chess.jl chess programming library to produce our own computer generated puzzles.
Part I (this part) explains how we find candidate puzzle positions. Part II will explain how we generate solution trees for the exercises.
Characteristics of Good Tactical Puzzles
In order for a position to be suitable as a tactical puzzle, there should be only a single solution move, at least in the initial position of the puzzle (we sometimes have to allow multiple correct moves deeper into the solution tree, or we would have to exclude too many otherwise good puzzles). Furthermore, the solution move shouldn't be too obvious.
We also want our puzzles to be practical and representative of the kind of mistakes and missed opportunities that occur in the games of chess players of all levels. Many exercises, both in books and in some other tactics training apps and websites, are biased towards flashy sacrificial combinations. Like all chess lovers, we love attractive combinations, but in practical games, those are rare. We therefore want to also include the more mundane tactics that decide most real world games.
We collect candidate puzzle positions by scanning through a database of human games, analysing every position in every game with a computer chess engine, and collecting the positions that satisfy the following requirements:
- There is exactly one winning move.
- The side to move is not in check. Positions where the side to move is in check tend to have too few legal moves to be interesting as puzzles.
- The winning move is not a mate in 1, not a promotion to queen, and not a simple capture of a hanging piece. This is an attempt to eliminate puzzles that would be too easy.
- The winning move is a sacrifice or was missed by at least one of the two players. This should usually ensure that the win is not entirely trivial to spot.
The "exactly one winning move" condition is a little problematic because it is often difficult to be 100% sure that a move is winning and that other moves are not. Our initial search may find only a single winning move, but there is a possibility that if we would let the chess engine think a little longer, it would discover that the move isn't winning after all, or that some other move is also winning. We therefore search the position several times at gradually longer thinking times, making sure the result is always the same.
Even positions that satisfy all the above requirements will not always get accepted as puzzles. There is a final step: Generating a solution tree, which will be discussed in Part II. Sometimes generating the solution tree will fail, or the tree will end up being too big or complex.
Analysing Chess Positions with Stockfish
We use the Stockfish chess engine for analysis. Stockfish is a chess engine using the UCI protocol. There are hundreds of other UCI compatible engines, and most of them could probably be used as a drop-in replacement for Stockfish in our tactics puzzle generator. We use Stockfish because it is strong, fast and free, and because one of our team members is a co-author.
If you want to understand the technical details of this blog post, it is a good idea to at least quickly skim through the UCI protocol specification. It's a simple protocol, and the specification is not long or difficult to read.
UCI chess engines are simple console programs that accept text input commands and produce text output. They are not by themselves chess playing programs, as they don't keep track of the state of a running game. Instead, they receive the game state (sometimes just a board position, sometimes a board position and a move sequence) from the host program (usually some chess playing GUI, but in our case, our puzzle generation program) before each search.
The interaction between the host program and the chess engine consists of the host program sending positions to the engine, followed by asking the engine to search the position. The search command (or go
as it is called in the UCI protocol) accepts a number of parameters for controlling when to stop the search. The host program can ask the engine to search the position for a specific amount of time, to a given depth, or for a given number of nodes (i.e. the maximum number of positions in the search tree). For our purposes when generating tactics, node limited searches are the most natural. We don't like time limited searches because we want our program to be deterministic and produce the same results on all sorts of hardware. Depth limited searches have the disadvantage that they depend a lot on the amount of material on the board: A depth 25 search could finish in an instant for a simple endgame, but take a long time for a complex middle game position.
There are two types of searches: A regular search produces a single variation representing the best play for both sides, and the associated score. A multi-PV search produces variations and scores for several (or even all) moves, not just the best one. Multi-PV searches take more time, but are more useful for our purposes, because we are looking for positions with a single winning move. A regular search could tell us if the best move is winning, but not whether some of the other moves could also be winning.
The way we use Stockfish is therefore to first do a quick node-limited multi-PV search of every position in a game. If the multi-PV search returns a single winning move, we do progressively more thorough searches (by gradually increasing the node limit), checking that the best move remains the only clearly winning move in each search.
Here's a typical example of commands sent to a UCI engine (you can try this out yourself by running Stockfish from the command line on your computer and typing the comments by hand). All lines beginning with #
are explanatory comments and not meant to be sent to the engine:
# The 'uci' command is used to initialize the chess engine, and
# is always the first command sent from the host program to the
# engine:
uci
# Set the engine to MultiPV mode with 4 lines:
setoption name MultiPV value 4
# Set up the position after the moves 1. e4 e5 2. Nf3 Nc6:
position startpos moves e2e4 e7e5 g1f3 b8c6
# Ask the engine to search this position to depth 10:
go depth 10
The engine output after the final go
command should look something like this:
info depth 1 seldepth 1 multipv 1 score cp 106 nodes 123 nps 123000 tbhits 0 time 1 pv d2d4 e5d4 f3d4
info depth 1 seldepth 1 multipv 2 score cp 61 nodes 123 nps 123000 tbhits 0 time 1 pv d2d3
info depth 1 seldepth 1 multipv 3 score cp 44 nodes 123 nps 123000 tbhits 0 time 1 pv b1c3
info depth 1 seldepth 1 multipv 4 score cp 18 nodes 123 nps 123000 tbhits 0 time 1 pv f1c4
...
info depth 10 seldepth 16 multipv 1 score cp 88 nodes 107660 nps 1922500 tbhits 0 time 56 pv f1b5 f8c5 d2d3 g8f6 b5c6 d7c6 e1g1 c8g4 b1d2 e8g8 d2b3 c5e7 c1g5 g4f3
info depth 10 seldepth 18 multipv 2 score cp 74 nodes 107660 nps 1922500 tbhits 0 time 56 pv b1c3 f8c5 d2d3 d7d6 f1e2 c5b6 e1g1 g8f6 c3a4 e8g8 a4b6 a7b6 c1g5
info depth 10 seldepth 16 multipv 3 score cp 73 nodes 107660 nps 1922500 tbhits 0 time 56 pv f1c4 g8f6 d2d4 e5d4 e1g1 f6e4 c4d5 d8e7 f3d4
info depth 10 seldepth 17 multipv 4 score cp 71 nodes 107660 nps 1922500 tbhits 0 time 56 pv d2d4 e5d4 f3d4 g8f6 b1c3 f8c5 c1e3 c5d4 e3d4 e8g8 d4f6 d8f6 d1d2
bestmove f1b5 ponder f8c5
The final result of this search is indicated by the four lines beginning with info depth 10
. The line with multipv 1
is the best move, the line with multipv 2
the second best move, and so on. There are four lines because we set the MultiPV
option to 4 when we sent the commands to the engine.
Zooming in on the best line, we see that it begins with the move f1b5
(i.e. Bb5, the Ruy Lopez), which the engine considers to be the strongest move. The cp 88
means that Stockfish (somewhat optimistically) thinks this move gives white an advantage of 88 centipawns.
The Chess.jl Library
Chess.jl is a chess programming library developed internally at Play Magnus, and available as free software. It is developed in Julia, a modern dialect of the Lisp family of programming languages. Because of its ease of use, interactivity, and high performance, Julia is ideal for experiments in computer chess.
Among other things, the Chess.jl library contains tools for manipulating chess games, positions, and moves, for importing games from PGN files, and for interacting with UCI chess engines like Stockfish.
Julia is fairly easy to read, and programmers with experience from main-stream programming languages should be able to follow the code snippets in the next section without major difficulties. In case you want to dive deeper into the language, the Julia web site has an extensive set of links for learning Julia.
In order to fully understand the code, it is useful to also have a look at the Chess.jl documentation. Read at least the brief user guide, and the section about UCI Chess Engines in the API reference.
The Puzzle Generator Program
A simplified version of our puzzle generator program can be found on GitHub. If you are impatient and don't want to read all the technical details below, you can just clone the repository, make sure you have the latest version of Julia installed, and run the program by following the instructions in the README.md
file.
This section explains how the program works. If you want, it should be possible to enter this code interactively on the REPL or in a Jupyter notebook and end up with a working program. Alternatively, clone the above GitHub repository and play around with it.
First, make sure you have the development version of the Chess.jl library installed by typing ]add Chess#master
from the Julia REPL.
We then globally import the Chess
, Chess.PGN
and Chess.UCI
modules:
using Chess, Chess.PGN, Chess.UCI
Interacting With the Chess Engine
Our first task is to write functions to start our chess engine, do multi-PV searches, and use the results of such searches to determine whether a position is won, whether there is exactly one winning move, etc.
The following function starts launches instance of Stockfish and returns the corresponding Chess.UCI.Engine value:
function startengine(enginepath="stockfish", ttsize=256,
threads=1)::Engine
e = runengine(enginepath)
setoption(e, "Hash", ttsize)
setoption(e, "Threads", threads)
e
end
This function assumes that an engine executable named stockfish
is installed and found somewhere in your $PATH
environment variable. If you want, it should be possible to replace this with the full path to Stockfish or some other UCI chess engine installed on your system.
It may seem strange that we use a default value of 1 for the threads
variable when all modern computers have multiple CPU cores and Stockfish runs significantly faster when using multiple threads. The reason is that in the case of generating puzzles, we can make even better use of parallelism by running several Stockfish instances in parallel instead, each of them using a single thread to look for puzzles.
Here is a function that asks the engine to perform a multi-PV search:
function mpvsearch(g::Union{Board, SimpleGame, Game}, e::Engine;
nodes::Int, pvs::Int)::Vector{SearchInfo}
result = SearchInfo[]
function infoaction(info::String)
info = parsesearchinfo(info)
if !isnothing(info.multipv)
info.multipv == 1 && empty!(result)
push!(result, info)
end
end
setoption(e, "MultiPV", pvs)
setboard(e, g)
search(e, "go nodes $nodes", infoaction = infoaction)
result
end
The internals of this function can be a little tricky to understand without carefully reading the multi-PV section of the UCI specification, but have little importance for the rest of this article. What is more important is to understand the input arguments and the return value.
The first parameter, g
, is either a Board
or one of the two Chess.jl game types, Game
or SimpleGame
. In the case a game is supplied, the engine will search the current position of the game. The parameter e
is a chess engine, usually one returned from a previous call to startengine
. The nodes
parameter is the numer of nodes we want the engine to search, and pvs
is the number of lines we want. A pvs
value of 4, for instance, will make the engine return lines and scores for the four best moves of the position.
The return value is a vector of SearchInfo
values. A SearchInfo
is a rather large struct, but the only slots that are relevant for our purposes are pv
(the actual main line, a vector of moves), score
(the score of this line), and multipv
(the multipv index, which is 1 if this is the best move, 2 if this is the second best move, etc.).
Given a SearchInfo
value, it is useful to be able to know whether the score it contains is a clear win, a clear loss, or close to equal. The following functions handle this:
function iswon(info::SearchInfo, threshold::Int = 280)::Bool
s = info.score
(s.ismate && s.value > 0) || (!s.ismate && s.value > threshold)
end
function islost(info::SearchInfo, threshold::Int = 280)::Bool
s = info.score
(s.ismate && s.value < 0) || (!s.ismate && s.value < -threshold)
end
function isnotwon(info::SearchInfo, threshold::Int = 100)::Bool
s = info.score
(s.ismate && s.value < 0) || (!s.ismate && s.value < threshold)
end
function isnotlost(info::SearchInfo, threshold::Int = 100)::Bool
s = info.score
(s.ismate && s.value > 0) || (!s.ismate && s.value > -threshold)
end
These functions are somewhat complicated by the fact that scores, as returned by UCI engines, are not plain numbers: They are of two different types, mate scores and centipawn scores. The iswon
function, for instance, returns true
if the score stored in the SearchInfo
is either a positive mate score or a centipawn score larger than 280 (i.e. 2.8 pawns).
In addition to iswon
and islost
, we have two similar functions isnotwon
and isnotlost
. These are used to verify the "only one winning move" condition. For a move to be considered the only winning move, we require that iswon
is true for this move, and that isnotwon
is true for all alternative moves.
Using the above code, we can write the following function, which tests if a game position allows exactly one winning move:
function onewinningmove(g::SimpleGame, e::Engine,
nodes::Int)::Bool
sr = mpvsearch(g, e, nodes = nodes, pvs = 2)
length(sr) >= 2 && iswon(sr[1]) && all(isnotwon, sr[2:end])
end
Finding Candidate Puzzle Positions
We are now ready to write a first, primitive version of our puzzle generation function. The puzzlesingame
function takes a SimpleGame
as input, and returns a vector of strings, with each string representing a chess position in Forsyth-Edwards Notation (more frequently known by the acronym FEN). Each position should have exactly one legal move.
function puzzlesingame(g::SimpleGame, e::Engine)::Vector{String}
result = String[]
# Go through the game, checking each position to see whether
# it could be suitable for a puzzle.
while !isatend(g)
# We require that a puzzle position has exactly one
# winning move. Test this by a shallow search with a tree
# size of 1 million nodes.
if onewinningmove(g, e, 1_000_000)
push!(result, fen(board(g)))
end
# Go forward to the next position in the game, and
# continue the loop.
forward!(g)
end
result
end
You can try this one out yourself, for instance by first reading a SimpleGame
from a PGN file as described in the "Working with PGN files" section of the PGN Import and Export section of the Chess.jl user guide.
The resulting positions can be inspected and analysed, for instance, by pasting the FEN strings into the FEN text field at the excellent lichess analysis board.
Filtering Out Bad Puzzles
You will quickly notice that most of the puzzles are not particularly interesting. Some are too easy, and some are probably incorrect. This is because we haven't yet implemented most of the criteria discussed in the "Characteristics of Good Tactical Puzzles" section above.
Let's try to write a function isgoodpuzzle
that takes a candidate puzzle (a position with exactly one winning move) and checks each criterion, returning true
if the position satisfies all of them. In addition to a game position and a chess engine, this function takes a nodes
parameter, because we want to call it again and again with a gradually increasing number of nodes, verifying that all criteria remain true for all node counts.
function isgoodpuzzle(g::SimpleGame, e::Engine, nodes::Int)::Bool
# If the side to move is in check, discard the puzzle.
if ischeck(board(g))
return false
end
# If the position is not won, or there is more than one
# winning move, discard the puzzle.
searchresult = mpvsearch(g, e, nodes = nodes, pvs = 2)
if !iswon(searchresult[1]) || any(iswon, searchresult[2:end])
return false
end
# If the winning move is a mate in 1, discard the puzzle.
if searchresult[1].score.ismate &&
searchresult[1].score.value == 1
return false
end
# Extract the winning move:
m = searchresult[1].pv[1]
# If the winning move is an en passant capture, discard the
# puzzle.
if ptype(pieceon(board(g), from(m))) == PAWN &&
to(m) == epsquare(board(g))
return false
end
# If the winning move is a queen promotion, discard the
# puzzle.
if ispromotion(m) && promotion(m) == QUEEN
return false
end
# Find the static exchange evaluation value of the move. This
# is a very simple estimate of the material loss or gain of
# the move, obtained by looking at all attackers and defenders
# of the destination square, without taking into account
# factors like pinned or overloaded pieces.
#
# The value will be positive for moves that look like simple
# material winning captures, negative for moves that appear to
# lose material, and zero for all other moves.
seeval = see(board(g), m)
# If the winning move is a simple material gaining capture,
# discard the puzzle.
if seeval > 0
return false
end
# If the best move looks like a sacrifice, accept the puzzle.
if seeval < 0
return true
end
# If the best move was not played in the game, accept the
# puzzle.
if nextmove(g) != m
return true
end
# If the previous position was not lost, accept the puzzle.
if !isatbeginning(g)
back!(g)
searchresult = mpvsearch(g, e, nodes = nodes, pvs = 1)
forward!(g)
return length(searchresult) > 0 &&
isnotlost(searchresult[1])
end
# If we get here, it means that the side to move found the
# winning move, the winning move was not a capture, and the
# opponent was already lost before making his last move.
# It's likely that the winning move is too easy to find and
# that this isn't an interesting puzzle, so we discard it.
false
end
The above code checks for exactly the criteria described at the beginning of this article. Let's now rewrite puzzlesingame
to make use of isgoodpuzzle
:
function puzzlesingame(g::SimpleGame, e::Engine)::Vector{String}
result = String[]
# Go through the game, checking each position to see whether
# it could be suitable for a puzzle.
while !isatend(g)
# We require that a puzzle position has exactly one
# winning move. Test this by a shallow search with a tree
# size of 1 million nodes.
if onewinningmove(g, e, 1_000_000)
ispuzzlecandidate = true
# Search gradually deeper, verifying that the position
# satisfies all criteria for a good puzzle at all
# depths.
nodes = 1_000_000
while nodes <= 40_000_000
if !isgoodpuzzle(g, e, nodes)
ispuzzlecandidate = false
break
end
nodes = round(Int, 1.4 * nodes)
end
# If the puzzle still looks OK, save it to the result
# vector.
if ispuzzlecandidate
push!(result, fen(board(g)))
end
end
# Go forward to the next position in the game, and
# continue the loop.
forward!(g)
end
result
end
The main change compared to the earlier version is the while
loop in the middle, where we call isgoodpuzzle
with gradually increasing node count, making sure it always returns true. Only if it returns true at all node counts will the position be accepted as a puzzle.
Finally, we can wrap this up in a function that reads games from a PGN file, scans them all for puzzle positions, and write them out as a text file with one FEN string per line:
function puzzlesfrompgn(pgnfilename::String, outputfilname::String)
e = startengine();
gamecount = 0
puzzlecount = 0
for g ∈ gamesinfile(pgnfilename)
gamecount += 1
for pz ∈ puzzlesingame(g, e)
puzzlecount += 1
open(outputfilname, "a") do outputfile
println(outputfile, pz)
end
end
println("$gamecount games, $puzzlecount puzzles")
end
end
Calling this function on a large PGN file of your choice should give you a text file with one FEN string per line, each of them representing a puzzle position.
The code above only generates puzzle positions, it does not build solution trees for the exercises. Part II of this blog series will explain how we generate the solutions, and how we use the solutions to eliminate the puzzles to the most interesting ones.