Simulating and Visualizing Blackjack

### Introduction

I am sure you are familiar with the basic strategy of Blackjack: a diagram with the dealer's face-up card on the x-axis and the player's cards on the y-axis. Each cell shows the optimal strategy, usually color-coded to make it easier to remember. This is an example of such a diagram from basicstrategycards.com:

How these are created is no secret: for each combination and strategy, a computer simulates thousands of hands. Once enough samples have been collected, we can pick the strategy that works best for a given combination.

I wanted to replicate the work necessary to create such a chart: first as a technical challenge, since it turns out not to be as simple as it may seem, and second to get a more detailed view of the odds. These readily available diagrams hide a lot of information. For example, the optimal strategy for a pair of Kings is to stand, but how much worse off are we should we choose to hit? They also make a big simplification by only looking at total scores, e.g. an 11 should double against a 2 regardless of how that 11 came to be: 2+9, 3+8, 4+7, etc.

### Code

I spent some time writing a blackjack odds simulator in Python and in C, the latter for efficiency reasons, and I published all the code on GitHub.

#### Python

Python is my go-to language, and this project is no exception. I wrote classes for cards, decks, and blackjack games using the very Pythonic `__getattr__`, `__le__`, `__unicode__`, etc. The result is quite readable, but amazingly slow to execute.

The code does a couple of things. First, it generates all valid card combinations for the player. Valid should be understood as any card combination with a score of 20 or less; 21 is still a valid game state, but you have no option but to stand when you reach it. Fun fact: there are more than 60,000 different valid card combinations in a single-deck blackjack game, from 2-2 all the way to 2-2-2-2-3-3-A-A-A-A!
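The enumeration step can be sketched roughly as follows. This is a minimal illustration, not the published code: the names `valid_hands` and `score` are hypothetical, and it assumes that suits and card order are ignored and that an ace counts as 11 whenever that does not bust the hand. Because of those assumptions, the number of hands it yields will not match the 60,000 figure above, which depends on how the original code distinguishes hands.

```python
# Hypothetical sketch: enumerate player hands of a single deck with a score <= 20.
RANKS = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
VALUES = {"A": 1, "2": 2, "3": 3, "4": 4, "5": 5, "6": 6, "7": 7,
          "8": 8, "9": 9, "10": 10, "J": 10, "Q": 10, "K": 10}
COPIES_PER_RANK = 4  # single deck: four cards of each rank


def score(hand):
    """Best blackjack total: one ace counts as 11 if that does not bust."""
    total = sum(VALUES[rank] for rank in hand)
    if "A" in hand and total + 10 <= 21:
        return total + 10
    return total


def valid_hands(max_score=20):
    """Yield every rank multiset (as a tuple) with >= 2 cards and score <= max_score."""
    def extend(hand, start, hard_total):
        if len(hand) >= 2 and score(hand) <= max_score:
            yield tuple(hand)
        for i in range(start, len(RANKS)):
            rank = RANKS[i]
            if hand.count(rank) == COPIES_PER_RANK:
                continue  # the deck has no more cards of this rank
            new_total = hard_total + VALUES[rank]
            if new_total > max_score:
                continue  # aces already count as 1 here, so this branch is dead
            hand.append(rank)
            yield from extend(hand, i, new_total)
            hand.pop()

    yield from extend([], 0, 0)


hands = set(valid_hands())
print(len(hands), "distinct rank combinations under these assumptions")
```

Generating ranks in a fixed order (the `start` index) avoids emitting the same combination twice, and pruning on the hard total keeps the recursion shallow.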

Second, for each of these valid game states, we simulate a few hundred random games and store the results in a dictionary.
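A simulation loop for one game state might look like the sketch below. It is an illustration under stated assumptions rather than the published code: `simulate`, `STAND`, and `HIT` are hypothetical names, strategy 1 is assumed to mean "hit once, then stand", the dealer is assumed to draw until reaching 17 or more, and dealer blackjack gets no special treatment.

```python
import random

VALUES = {"A": 1, "2": 2, "3": 3, "4": 4, "5": 5, "6": 6, "7": 7,
          "8": 8, "9": 9, "10": 10, "J": 10, "Q": 10, "K": 10}
STAND, HIT = 0, 1  # assumed encoding of the strategy


def score(hand):
    """Best blackjack total: one ace counts as 11 if that does not bust."""
    total = sum(VALUES[rank] for rank in hand)
    if "A" in hand and total + 10 <= 21:
        return total + 10
    return total


def simulate(player_hand, dealer_up, strategy, n_games=1000, rng=None):
    """Play n_games random games from a fixed state and tally the outcomes."""
    rng = rng or random.Random()
    results = {"wins": 0, "pushes": 0, "lost": 0}
    for _ in range(n_games):
        # Fresh single deck, minus the cards already on the table.
        deck = [rank for rank in VALUES for _copy in range(4)]
        for card in list(player_hand) + [dealer_up]:
            deck.remove(card)
        rng.shuffle(deck)

        player = list(player_hand)
        if strategy == HIT:
            player.append(deck.pop())  # assumption: a single hit, then stand

        dealer = [dealer_up]
        while score(dealer) < 17:  # assumption: dealer stands on any 17
            dealer.append(deck.pop())

        p, d = score(player), score(dealer)
        if p > 21:
            results["lost"] += 1
        elif d > 21 or p > d:
            results["wins"] += 1
        elif p == d:
            results["pushes"] += 1
        else:
            results["lost"] += 1
    return results


print(simulate(("K", "K"), "6", HIT, n_games=1000))
```

Rebuilding and shuffling a list for every game is exactly the kind of list-heavy work that makes a pure Python version slow.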

The main caveat is how slowly the code executes. I tried multithreading and multiprocessing techniques, to no avail. The bottleneck is the list operations, which often result in copying large objects from one memory region to another.

#### C to the rescue

If you need performance, bring C to the party. It compiles to native CPU instructions, and it forces you to think about the underlying memory structures.

I wrote C code similar to the Python code in the previous section. I didn’t write the part that generates all valid card hands in C, since that executes decently fast in Python; instead I wrote the part that runs a few hundred or a few thousand simulations for each valid hand. In C, the equivalent code runs about 3 orders of magnitude faster. This is a bit of an unfair comparison, since the C code makes an explicit effort to minimize memory footprint, but it is much faster nonetheless.

### Visualizing

The results, in a table format, look like this:

```
+-------------+----------------+----------+------------------+-------+--------+-------+
| player_hand | dealer_up_card | strategy | total_executions | wins  | pushes | lost  |
+-------------+----------------+----------+------------------+-------+--------+-------+
| 22          | 2              | 0        | 50000            | 17697 | 0      | 32303 |
| 22          | 3              | 0        | 50000            | 18899 | 0      | 31101 |
| 22          | 4              | 0        | 50000            | 20351 | 0      | 29649 |
| 22          | 5              | 0        | 50000            | 22328 | 0      | 27672 |
| 22          | 6              | 0        | 50000            | 21977 | 0      | 28023 |
| 22          | 7              | 0        | 50000            | 13293 | 0      | 36707 |
| 22          | 8              | 0        | 50000            | 12003 | 0      | 37997 |
| 22          | 9              | 0        | 50000            | 12101 | 0      | 37899 |
| 22          | K              | 0        | 50000            | 10519 | 0      | 39481 |
| ...         | ...            | ...      | ...              | ...   | ...    | ...   |
| KK          | 2              | 1        | 50000            | 3498  | 510    | 45992 |
| KK          | 3              | 1        | 50000            | 3503  | 494    | 46003 |
| KK          | 4              | 1        | 50000            | 3614  | 467    | 45919 |
| KK          | 5              | 1        | 50000            | 3642  | 445    | 45913 |
| KK          | 6              | 1        | 50000            | 3787  | 402    | 45811 |
+-------------+----------------+----------+------------------+-------+--------+-------+
```
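To turn rows like these into something comparable across strategies, one option is to compute a win rate and an expected value per unit bet. The sketch below uses two rows copied from the table; the function name `summarize` is hypothetical, and the expected value assumes an even-money payout with pushes returning the bet.

```python
# Two rows copied from the table above:
# (player_hand, dealer_up_card, strategy, total_executions, wins, pushes, lost)
rows = [
    ("22", "2", 0, 50000, 17697, 0, 32303),
    ("KK", "2", 1, 50000, 3498, 510, 45992),
]


def summarize(row):
    """Return the win rate and the expected value per unit bet for one row."""
    hand, dealer_up, strategy, total, wins, pushes, lost = row
    return {
        "hand": hand,
        "dealer_up": dealer_up,
        "strategy": strategy,
        "win_rate": wins / total,
        # Assumption: a win pays +1, a loss costs -1, a push pays 0.
        "expected_value": (wins - lost) / total,
    }


for row in rows:
    print(summarize(row))
```

Comparing the expected value of each strategy for the same hand and dealer card is what determines the entry in a basic strategy chart, and the size of the gap answers the "how much worse off" question.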

I can think of a few visualizations, but the one I am particularly curious about is seeing the odds of the different strategies for a given hand.

Using D3.js, I created a heatmap with the player's hand on the y-axis and the dealer's face-up card on the x-axis. For the color of each cell, I used the Lab color space, which has a lightness component, L, that I kept constant, and two independent color components, a and b. In my case, I mapped a to the probability of winning if standing, and b to the probability of winning if hitting. I chose this color space over classical RGB because of the nonlinearities of the human eye, i.e. two colors can be perceived as very similar yet be very far apart in the RGB color space. The Lab color space is designed to be approximately perceptually uniform, so this is much less of a problem there.
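The color mapping itself is simple to express. The actual heatmap was built with D3.js; the Python sketch below only illustrates the idea, and the lightness of 70, the chroma range of 80, and the names `cell_color` and `lab_to_srgb` are assumptions chosen for illustration. The conversion uses the standard CIELAB to sRGB formulas with a D65 white point, clipping colors that fall outside the sRGB gamut.

```python
def lab_to_srgb(L, a, b):
    """Convert CIELAB (D65 white point) to sRGB components in [0, 1]."""
    fy = (L + 16.0) / 116.0
    fx = fy + a / 500.0
    fz = fy - b / 200.0

    def f_inv(t):
        delta = 6.0 / 29.0
        return t ** 3 if t > delta else 3 * delta ** 2 * (t - 4.0 / 29.0)

    x = 0.95047 * f_inv(fx)
    y = 1.00000 * f_inv(fy)
    z = 1.08883 * f_inv(fz)

    # XYZ to linear sRGB
    linear = (
        3.2406 * x - 1.5372 * y - 0.4986 * z,
        -0.9689 * x + 1.8758 * y + 0.0415 * z,
        0.0557 * x - 0.2040 * y + 1.0570 * z,
    )

    def gamma(c):
        c = min(max(c, 0.0), 1.0)  # clip out-of-gamut values
        return 12.92 * c if c <= 0.0031308 else 1.055 * c ** (1 / 2.4) - 0.055

    return tuple(gamma(c) for c in linear)


def cell_color(p_stand, p_hit, lightness=70.0, chroma_range=80.0):
    """Map two win probabilities in [0, 1] to a hex color at constant lightness."""
    a = (2.0 * p_stand - 1.0) * chroma_range  # a axis: winning if standing
    b = (2.0 * p_hit - 1.0) * chroma_range    # b axis: winning if hitting
    r, g, bl = lab_to_srgb(lightness, a, b)
    return "#{:02x}{:02x}{:02x}".format(round(r * 255), round(g * 255), round(bl * 255))


print(cell_color(0.35, 0.07))
```

With both probabilities at 0.5 the cell comes out a neutral gray, and the hue shifts as one strategy pulls ahead of the other.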

Click on a box to drill down on the simulation levels:

To play with this in full screen, go to: https://d2c7tnfswcen7h.cloudfront.net