Recreating the HearthArena Card Ranking Algorithm with TensorFlow

Jan 21, 2021

In this article, I will be sharing my journey recreating the deckbuilding tool offered by HearthArena for the virtual card game Hearthstone.

What is Hearthstone and HearthArena?

For those who don’t know, Hearthstone is a strategy card game in which each player builds a deck of 30 cards and faces off against an opponent; the first player to bring their opponent down to 0 health wins. In the Arena game mode, players draft a deck of 30 cards one card at a time, choosing between 3 cards each time.

Because there are many ways to win a game of Hearthstone, players weigh several factors when deciding which card to pick, such as:

Mana curve: when cards can be played is restricted by how much mana you have, so it’s important to have a deck that curves out (allows you to play cards every turn)
Value: when you run out of cards it’s hard to win, so it’s important to have cards that draw/generate more cards or pack a lot of value themselves
Tempo: cards with high tempo, which impact the game faster when played, are generally better as they can turn a losing situation into a winning one
Synergies: some cards work better with other cards, and some cards are anti-synergistic!

HearthArena is a website that offers an Arena drafting tool, which helps players choose between cards by assigning a score to each one (a higher score indicates a better card).

How does the HearthArena Algorithm work?

The HearthArena algorithm has both human and machine computations built into it. First, people work to assign normalized scores by assessing how good a card is using some of the criteria stated above. Data of decks and winrates is also collected from players drafting decks every day. This data is then used to train a machine learning model, which is applied to make micro-adjustments to card scores.

The huge influx of data about decks and winrates allows the model to be trained to identify patterns without specifying what we are looking for. For example, if two cards synergize and increase the winrate of decks they are both in, the model is able to pick up this trend and make micro-adjustments to reflect this synergy.

My Journey Recreating the HearthArena Algorithm

I started out with the goal of mimicking what HearthArena does — given information about the cards picked so far and the current 3 choices, assign scores for each of the choices. However, in working towards that, I also ended up creating a model which assigns a score for any given deck of 30 cards. Thus, I’ll touch on that a little as well.

There were 2 main parts to the project — obtaining the data about decks and winrates and reformatting it, and building the model for predicting scores. First, obtaining the data.

With the aid of the Requests and BeautifulSoup libraries in Python, I was able to scrape HearthArena’s website for data about decks and winrates. However, the tricky part was deciding what data I would need for this supervised learning model.

As brief background, supervised learning is the training of a model based on labelled data. For example, if I wanted to predict a person’s weight (output) based on their height (input), I would need training data about both people’s heights and weights, so the data I am training on is labelled — people’s weights are known. Going back to data from HearthArena, I would be trying to predict a score (output) based on the current choices and information about the previous cards picked (inputs), so I would need all of these as training data.
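To make the height and weight example concrete, here is a minimal sketch of supervised learning in plain Python. The heights and weights are made-up numbers and `fit_line` is a helper I wrote for illustration; it fits a straight line by ordinary least squares and has nothing to do with the HearthArena data yet.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up labelled data: heights in cm (inputs) and weights in kg (labels).
heights = [150.0, 160.0, 170.0, 180.0, 190.0]
weights = [50.0, 58.0, 66.0, 74.0, 82.0]

# "Training" means finding the line that best maps inputs to labels.
slope, intercept = fit_line(heights, weights)

# "Prediction" means applying that line to a height the model has not seen.
predicted_weight = slope * 175.0 + intercept
print(f"Predicted weight for 175 cm: {predicted_weight:.1f} kg")
```

The deck model follows the same pattern, only with a vector of card counts as the input instead of a single height.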

However, I was only able to find information about cards and their winrates (e.g. Instructor Fireheart — 66% winrate), so I was lacking data about the previous cards picked and the corresponding scores. As such, I turned my focus towards creating a supervised learning model to predict an overall deck score (output) given a particular deck (input). For this model, I had all the data I needed — the decklists as well as the associated winrates, which could be normalized to obtain deck scores.

After extracting the data, I had to format it so that it could be fed into the model. Using the Hearthstone API to extract a list of all the playable cards, I converted the decklists to the following format:

+------------------+-----+-----+-----+-----+-----+
|    Card Name     |  A  |  B  |  C  | ... |  Z  |
+------------------+-----+-----+-----+-----+-----+
| Deck #1 Counts   |   1 |   2 |   0 | ... |   1 |
| Deck #2 Counts   |   0 |   1 |   0 | ... |   0 |
| ...              | ... | ... | ... | ... | ... |
| Deck #100 Counts |   1 |   0 |   3 | ... |   1 |
+------------------+-----+-----+-----+-----+-----+

#Counts refers to the number of that card in the deck
#Sum of counts across all cards for each deck = 30
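As a rough sketch of how one row of the table above can be built, the snippet below turns a decklist into a list of counts. The three-card vocabulary and the helper name `deck_to_counts` are made up for illustration; the real vocabulary would contain every card returned by the Hearthstone API.

```python
from collections import Counter


def deck_to_counts(deck, card_names):
    """Return one count per card in card_names, in the same order."""
    counts = Counter(deck)
    unknown = set(counts) - set(card_names)
    if unknown:
        raise ValueError(f"cards missing from vocabulary: {sorted(unknown)}")
    return [counts[name] for name in card_names]


# Toy vocabulary (an assumption); a real one would hold every playable card.
card_names = ["Air Raid", "Guardian Augmerchant", "Holy Light"]

# Toy 30-card deck made up of only those three cards.
deck = ["Air Raid"] * 12 + ["Guardian Augmerchant"] * 10 + ["Holy Light"] * 8

row = deck_to_counts(deck, card_names)
print(row, "sum =", sum(row))
```

Keeping the column order fixed across all decks matters, since the model only ever sees positions in the vector, not card names.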

Feeding these card counts and the corresponding deck scores into a Linear Regressor model, we see that there is a rather high loss (error) even when the model is trained for 5000 steps.

#Evaluation Metrics
{'average_loss': 8.609376,
 'label/mean': 75.03246,
 'loss': 86.09376,
 'prediction/mean': 74.83777,
 'global_step': 5000}

#Prediction by Linear Regressor
array([[71.079025]], dtype=float32)

#Actual score
array([73.3])

In contrast, the DNN (deep neural network) Regressor model outperformed the Linear Regressor model even though it was trained for fewer steps (1000).

#Evaluation Metrics
{'average_loss': 3.021388e-05,
 'label/mean': 75.03246,
 'loss': 0.0003021388,
 'prediction/mean': 75.031525,
 'global_step': 1000}

#Prediction by DNN Regressor (much better performance!)
[{'predictions': array([73.30318], dtype=float32)}]

#Actual score
array([73.3])
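To put the two deck-score evaluations on a more readable scale, here is a small sketch that takes the numbers printed above and converts them. It assumes that `average_loss` is a mean squared error per example, which is my understanding of the default for these regressors but is not stated anywhere in the outputs; under that assumption its square root is a typical error in score points. The ratio of `loss` to `average_loss` is only a hint at the evaluation batch size, not a confirmed setting.

```python
import math

# average_loss and loss copied from the two deck-score evaluation outputs above.
results = {
    "Linear Regressor": {"average_loss": 8.609376, "loss": 86.09376},
    "DNN Regressor": {"average_loss": 3.021388e-05, "loss": 0.0003021388},
}

for name, m in results.items():
    # Assumption: average_loss is a mean squared error, so its square root
    # is a typical error measured in deck-score points.
    m["rmse"] = math.sqrt(m["average_loss"])
    # loss / average_loss hints at the batch size (an inference, not stated).
    m["ratio"] = m["loss"] / m["average_loss"]
    print(f"{name}: RMSE ~ {m['rmse']:.4f}, loss/average_loss ~ {m['ratio']:.1f}")
```

Read this way, the Linear Regressor is off by roughly 3 points per deck while the DNN Regressor is off by well under 0.01, and both ratios come out at 10.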

Back to the Main Goal

But I wasn’t satisfied with just predicting a score for a given deck. So after poking around on HearthArena some more, I found, to my surprise, the following data for individual decks:

Card choices, previous picks, and individual scores!

Great! I now had data about scores of individual cards and which cards were picked previously. To be clear, rather than following HearthArena’s approach of having humans manually assign each card a score and then having an algorithm make micro-adjustments to it, the model I built tries to reproduce HearthArena’s scores directly, with no manual scoring step.

As a quick note, I retrieved data about 392 decks, each with 30 picks among 3 cards, which gave me 392 x 30 x 3 = 35280 data points. Reformatting this much data was computationally intensive and time-consuming, but the large sample size also helped in training a more accurate model.
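One plausible way to turn each of those data points into a model input is sketched below. This encoding is an assumption made for illustration, not necessarily the layout used in the project: counts of the cards picked so far, followed by a one-hot vector marking the candidate card. The helper name `encode_pick` and the three-card vocabulary are hypothetical.

```python
from collections import Counter


def encode_pick(previous_picks, candidate, card_names):
    """Concatenate counts of earlier picks with a one-hot vector for the candidate."""
    counts = Counter(previous_picks)
    picked_so_far = [counts[name] for name in card_names]
    one_hot = [1 if name == candidate else 0 for name in card_names]
    return picked_so_far + one_hot


# Toy vocabulary (an assumption); the real one would cover every draftable card.
card_names = ["Air Raid", "Guardian Augmerchant", "Holy Light"]
previous_picks = ["Air Raid", "Holy Light", "Air Raid"]
choices = ["Air Raid", "Guardian Augmerchant", "Holy Light"]

# One row per offered card, so every pick contributes three rows.
rows = [encode_pick(previous_picks, choice, card_names) for choice in choices]
for choice, row in zip(choices, rows):
    print(choice, row)

# 392 decks x 30 picks x 3 choices, as computed in the text.
total_rows = 392 * 30 * 3
print("total rows:", total_rows)
```

Each row would then be paired with the HearthArena score for that candidate card as its label.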

Once again, I reformatted the data before training both Linear Regressor and DNN Regressor models on this data. The Linear Regressor model performed quite poorly this time.

#Evaluation Metrics
{'average_loss': 1083.2366,
 'label/mean': 61.46461,
 'loss': 10832.366,
 'prediction/mean': 45.357845,
 'global_step': 2000}

#Prediction by Linear Regressor (rather poor performance)
[{'predictions': array([79.04803], dtype=float32)},
 {'predictions': array([82.29574], dtype=float32)},
 {'predictions': array([77.59356], dtype=float32)}]

#Actual scores
['Air Raid', 'Guardian Augmerchant', 'Holy Light']
[64.56, 64.69, 15.85]

Once again, the DNN Regressor model came out superior in predicting card scores.

#Evaluation Metrics
{'average_loss': 315.33557,
 'label/mean': 61.46461,
 'loss': 3153.3557,
 'prediction/mean': 57.02432,
 'global_step': 1000}

#Prediction by DNN Regressor (much better performance!)
[{'predictions': array([67.01109], dtype=float32)},
 {'predictions': array([82.476265], dtype=float32)},
 {'predictions': array([17.470245], dtype=float32)}]

#Actual scores
['Air Raid', 'Guardian Augmerchant', 'Holy Light']
[64.56, 64.69, 15.85]
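To compare the two models on this one sample pick, the sketch below computes the mean absolute error of each set of predictions against the actual scores, and checks which card each list would recommend. The numbers are copied from the outputs above; the helper names are mine, and a single pick is of course far too little to judge a model on.

```python
names = ["Air Raid", "Guardian Augmerchant", "Holy Light"]
actual = [64.56, 64.69, 15.85]
linear_pred = [79.04803, 82.29574, 77.59356]
dnn_pred = [67.01109, 82.476265, 17.470245]


def mean_abs_error(predicted, target):
    """Average absolute gap between predicted and actual scores."""
    return sum(abs(p - t) for p, t in zip(predicted, target)) / len(target)


def best_pick(scores):
    """Name of the card with the highest score."""
    best_index = max(range(len(scores)), key=scores.__getitem__)
    return names[best_index]


linear_mae = mean_abs_error(linear_pred, actual)
dnn_mae = mean_abs_error(dnn_pred, actual)
print(f"Linear Regressor MAE: {linear_mae:.2f}")
print(f"DNN Regressor MAE: {dnn_mae:.2f}")
print("Picks:", best_pick(actual), best_pick(linear_pred), best_pick(dnn_pred))
```

On this sample the DNN Regressor is much closer on average, mainly because the Linear Regressor badly overrates Holy Light. Both models overshoot Guardian Augmerchant by roughly 18 points, but all three lists rank it first.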

The result of all this: a model that I can apply when drafting decks in Arena. After a few runs with the model, I can say it is fairly accurate at assigning scores and I mostly agree with the picks made by the algorithm.

Reflecting on the Journey

Being an avid Arena player, I’ve always wanted to dig deeper into the inner workings of the HearthArena algorithm, and with my newly learnt skills in TensorFlow, this became a reality. It still surprises me that while a good Arena player would consider many factors, like tempo, value and synergies, when making choices, a deep learning model built with TensorFlow doesn’t rely on you specifying what trends it should look out for. It is the epitome of senseless, mass-example-based learning, which is ironic given that the neural network architecture has its origins in biology and the human brain.

For those looking to embark on such a project themselves, a few of my more significant takeaways from this experience:

· With the pre-made estimators provided by TensorFlow, model training, evaluation and prediction are relatively easy; the hard part is dealing with the data and its dimensions

· Deciding what data you need and can reasonably obtain is far and away the biggest challenge in any project like this. Be prepared to develop an understanding of how to use APIs if one is provided

· Most importantly, believe in yourself and be willing to try. When I first started this project, it was hard to imagine being able to reach my goal, but I took a leap of faith and was pleasantly surprised by the result!

Hope this has been an interesting read and do give a clap if you liked it!

GitHub repository containing project code: https://github.com/Jareltey/Hearthstone-Deckbuilder