Bitboards Are Not Magical

by Szymon Lipiński
tags: c++ java

I started writing a chess engine. The longer I keep on writing it, the longer I see that it’s a never-ending story. There is always something, which should be improved. There is always something to make in a different way, but I’m not sure if it would be better or worse. And most of the good design decisions made at the beginning, don’t look so good later.

During the journey, I learned a lot and I hope to learn much more. This time I’m going to describe what I learned about bitboards.

I have also written a java implementation of the NQueens problem. I used bitboards there and they made the program quite fast.

Basics Of A Chess Engine

The engine needs to store the current state of the game. The most complicated thing is storing the board. There are some additional things like castling rights, current move side, en passant field, etc. All the design decisions influence the speed of the program.

The main goal of a chess engine is to find the best move knowing the current state of the game. It must be as fast as possible. Most of the algorithms make some kind of a tree of possible moves. It can be very huge. The more space we waste for the game state, the lower the number of the states will be, so the engine can be worse.

Storing Board Information

In chess we have two sides, one has white pieces, the other has black ones. Even if your set has other colors like blue/green, the names are still white and black. The obvious C++ types for the implementation were:

enum Color {
    WHITE = 0,
    BLACK = 1
};

enum PieceType {
    PAWN   = 0,
    ROOK   = 1,
    KNIGHT = 2,
    BISHOP = 3,
    QUEEN  = 4,
    KING   = 5
};

The above values can be combined into this:

enum Piece {
    WHITE_PAWN   =  0,
    WHITE_ROOK   =  1,
    WHITE_KNIGHT =  2,
    WHITE_BISHOP =  3,
    WHITE_QUEEN  =  4,
    WHITE_KING   =  5,

    BLACK_PAWN   =  8,
    BLACK_ROOK   =  9,
    BLACK_KNIGHT = 10,
    BLACK_BISHOP = 11,
    BLACK_QUEEN  = 12,
    BLACK_KING   = 13,

    PIECE_NONE   = 14,
    PIECE_SIZE   = 14,

    PIECE_BEGIN  = 0,
    PIECE_END    = 14
};

The PIECE_NONE is used to avoid returning any NULL. A function like get_piece_from_this_field can return PIECE_NONE if there is no piece. It looks like this enum can be stored in 16 bits (I think it will be aligned to the full bytes). So it’s 2 bytes.

One of the most obvious ways to store the board state is to have a list of 64 fields with the Piece values in each. This way the whole board will require 128B of memory. I thought it was pretty nice. But it wasn’t.

One of the main tasks of a chess engine is to generate all possible moves. One of the main questions was: “Is there a white or a black piece on this field?”.

Other Ideas

If I remember correctly, I managed to strip the required space for all the game state to about 30B. I thought it was pretty nice. What was the cost? It was densely packed into memory. It looked more like compressed information than useful information. So, to use it, every time the program had to unpack it. It was really time-consuming, so even the memory overhead was quite small, the program was slow.

It looks like having a faster program but using more memory was the way to go.

Bitboard

Then I read about the bitboards. The idea is quite simple. We have N fields on the board. This means I could use N bits to store simple boolean value true/false for each field. But what kind of values to store?

We have 8 white pawns and there cannot be more (let’s talk about “normal” chess for now). I could use 64 bits for the whole chessboard and store true if there is a pawn on the field, and false if there isn’t. The same for black pawns.

There are 6 types of pieces for each color. The number of queens can vary from 0 to 9 (if you promote all your pawns to queens). The number of rooks, knights, and bishops can vary from 0 to 10. However, it means I’d also need a full bitboard if there wasn’t any queen (whole filled with false values). I’d also need a full bitboard for each king, just to make the operations simpler. At that moment I thought it was a waste of memory.

For storing the whole board it would require: 12 bitboards x 64 bits = 768 bits = 96 bytes.

Bitboard Implementation

The C++ implementation I used was quite obvious:

#include <bitset>
typedef std::bitset<64> Bitboard;

The bitset class has nice overloaded operators for many things. What’s more, this way it will use only 64 bits.

For the Java implementation I used:

import java.util.*;
private final BitSet bitboard;
bitboard = new BitSet(size);

Unfortunately, there is no operator overloading in Java, but the BitSet class has functions giving exactly the same functionality.

Simple Bitboard Functionality

One of the most common things I do with a bitboard is to get information about a specific field. In chess rows are called ranks and columns are called files. In chess it’s simple, we have eight files from a to h, and eight ranks from 1 to 8.

For a general bitboard, like I used in the NQueens problem, I couldn’t use letters for the files, so I used numbers. For fields I have a class like this:

class Field {
private:
    int const file;
    int const rank;
    
public:
	Field(int file, int rank): rank(rank), file(file){}
};

This gives me a nice two-dimensional coordinate system. The only problem is that I need some translation between the coordinates and the bitboard index, as the bitboard is a one-dimensional list of values.

Check out this simple bitboard. Coordinates are on the left and bottom, the bitboard index is in the fields:

  4| 20 21 22 23 24
  3| 15 16 17 18 19
  2| 10 11 12 13 14
  1|  5  6  7  8  9 
  0|  0  1  2  3  4
  -+----------------
   |  0  1  2  3  4  <-- files
  ^
  |
ranks

For general bitboard I use the coordinates starting at 0.

For a square bitboard of size N, converting the coordinates to index can be done with this formula

index = rank * N + files

and converting from index to the coordinates with:

file = index % N
rank = (int) index / N

What Bitboards Simplify

The biggest surprise for me was how simple and powerful bitboards are. For chess we have 64 fields on the board. This way the 64 bitboard fills into a CPU register on a 64bit architecture. Combining bitboards is extremely fast.

This makes a chess bitboard storing information about the positions of all the white pieces.

auto white_pieces_board =
	  white_pawn_board
  | white_rook_board
  | white_knight_board
  | white_bishop_board
  | white_queen_board
  | white_king_board;

These simple binary or operations are made on 64bit integers. The variables nicely fit into CPU registers. The instructions can be translated into one CPU instruction each.

Of course in a more general case, when we have bigger bitboard than 64 fields (and it’s common to use in chess 10x12 board for move generation) it won’t be as efficient.

Final Remarks

I know, chess engine. Some people asked me “why do you do this while there Stockfish already?”. Well, it’s not the destination which is important. It’s the journey. Learning by doing can be fun.

Some useful links: