MMR Distribution in Starcraft II

The SC2API package is an R wrapper for Blizzard’s Starcraft II API. As an introduction to the package, it is an interesting and useful exercise to visualize the distribution of Matchmaking Rating (MMR) of the top Starcraft II players. For this exercise, we will limit our scope to the European region and investigate its Grandmaster league (i.e. the top 200 players in the region, although there may be less than 200 at any given time depending on player activity).

To begin, we will load the SC2API package and ggplot2, another package that is extremely useful for plotting and visualization.

library(SC2API)
library(ggplot2)

Authorization

Before using the “get” functions of SC2API, we must first set an OAuth 2.0 access token as an environment variable so that all future API calls may use it for authorization. To receive this token, create a client in the Blizzard Developer Portal and obtain a valid client ID and client secret. For more information on getting started, refer to Blizzard’s Getting Started document.

Once a client ID and client secret has been obtained, we will use set_token to set the token as an environment variable. Of course, you must replace the arguments with your own id and secret.

set_token("YOUR CLIENT ID", "YOUR CLIENT SECRET")

Data Retrieval

To obtain the MMRs of players in the European Grandmaster league, we will first use the get_league_data function to obtain the ladder ID of the Grandmaster ladder in the current season (season #44).

data <- get_league_data(season_id = 43, 
                        queue_id = 201, 
                        team_type = 0, 
                        league_id = 6, 
                        host_region = "eu")

The queue_id argument “201” refers to the fact that we are looking for 1v1 in the current expansion (Legacy of the Void), team_type means that the teams are arranged beforehand (not particularly useful since this only applies to team games), and a league_id refers to the league of Grandmaster. To see other choices for these arguments, see the help documentation in ?get_League_data.

Since leagues are often separated into tiers and further separated into divisions, we must specify which ladder ID we are actually looking for. Of course, there is only one tier and one division for Grandmaster league and so finding the appropriate ladder ID is relatively easy. For other leagues, finding a particular ladder ID may be slightly more difficult but can be completed using list indexing.

ladder_id <- data$tier$division[[1]]$ladder_id
ladder_id
#> [1] 225291

Now that we have a ladder ID, we can supply it as an argument to the get_ladder_data function to discover the MMRs of players within the league. The host_region argument must be set to “eu”" since that is the region with which we are currently concerned.

ladder_data <- get_ladder_data(ladder_id = ladder_id, host_region = "eu")

Then, we can extract players’ MMR from ladder_data.

mmr <- ladder_data$team$rating
head(mmr)
#> [1] 7464 7151 7075 7004 7002 6882

Grandmaster MMR in the Current Season

To retrieve the MMRs of players in past seasons, the above functions are necessary. However, since the grandmaster league is, in some sense, special, there is an alternative function we can use to retrieve the MMRs of players currently in grandmaster (that is, in the current season):

ladder_data_current <- get_gm_leaderboard(2, host_region = "us")
mmrCurrent <- ladder_data_current$mmr
head(mmrCurrent)
#> [1] 7355 7229 6970 6813 6813 6805

Notice that for this function, it doesn’t matter which host region we use, as long as the region_id is set to the region from which we would like to retrieve data. That is,

ladder_data_current2 <- get_gm_leaderboard(2, host_region = "kr")
mmrCurrent2 <- ladder_data_current$mmr

identical(mmrCurrent,mmrCurrent2)
#> [1] TRUE

For this reason, it is important to refer to the documentation. If the region_id argument is not available, it is likely that the host_region affects the data retrieved.

We will continue the rest of the vignette using the MMRs of players from the last season.

Visualization

Overall distribution

Since we now have the MMRs of players from season 43, we can plot the distribution as a histogram using the ggplot2 package.

ggplot() +
  geom_histogram(aes(x=mmr)) +
  theme_light()
#> `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

Although this is useful, it would also be interesting to look at the distribution by player race.

Distribution by race

First, we will obtain the race of each player in the league. This is accomplished using the sapply function. Then, we create a simple data frame with the MMR and races. Since there are so few random players, we will filter these out from the dataset (at the time of writing there is a single player in the Grandmaster league of Europe).

races <- sapply(ladder_data$team$member, 
                function(x) x$played_race_count[[1]]$race$en_US)
df <- data.frame(mmr,races)
df <- df[df$races!='Random',]
head(df)
#>    mmr   races
#> 1 7464    Zerg
#> 2 7151 Protoss
#> 3 7075  Terran
#> 4 7004  Terran
#> 5 7002  Terran
#> 6 6882    Zerg

Now, we will once again plot a histogram and look at the separate race distributions using ggplot2 with some arguments to make the plot a little more visually appealing.

ggplot(df,aes(x = mmr, group = races, fill = races)) +
  geom_histogram() + 
  scale_fill_manual(values = c('Orange','Blue','Red')) +
  facet_wrap(~races) + 
  theme_light()
#> `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

Another way of visualizing these distributions is to plot them on the same graph and use a statistical method to create a smoothed density estimate. The below plot shows a smoothed density estimate using a Gaussian kernel and an adjusted bandwidth.

ggplot(df,aes(x = mmr)) +
  geom_density(aes(y = ..density.., group = races, colour = races), 
               kernel=c("gaussian"),
               adjust = 1.5,
               position = "identity", 
               lwd = 1) +
  scale_color_manual(values = c('Orange','Blue','Red')) +
  theme_light()