Parallel tile rendering

Introduction

For some workloads a single single-core plumber instance might not be enough to achieve the desired performance. One way around this is to use a cluster orchestration tool. The example below discusses docker-compose, but other tools like Docker Swarm or Kubernetes should be able to achieve similar results. Alternatively, local parallelization can be used.

Local parallelization

We first start two tile servers, running on ports 4001 and 4002.

require(callr)
#> Loading required package: callr
startServer <- function(port, tmpGridFile) {
  # read a stars grid
  weatherData <- stars::read_stars(tmpGridFile, proxy = FALSE, sub = "t")
  names(weatherData) <- "t"
  sf::st_crs(weatherData) <- "+proj=longlat"
  colorFunction <- leaflet::colorNumeric("viridis", c(250, 310))
  colorFunctionWithAlpha <- function(x, alpha = 1) {
    # append the alpha value as a hexadecimal suffix to the color
    paste0(colorFunction(x), as.character(as.raw(
      as.numeric(alpha) * 255
    )))
  }
  starsTileServer::starsTileServer$new(weatherData, colorFun = colorFunctionWithAlpha)$run(port = port)
}
rp_list <- lapply(4001:4002, function(port) {
  r_bg(startServer, args = list(port = port, tmpGridFile = tmpGridFile))
})

Now we can use the subdomains argument of addTiles to address both servers: leaflet substitutes each character of the subdomains string for {s} in the URL template, so requests are distributed over ports 4001 and 4002.

require(leaflet)
#> Loading required package: leaflet
require(leaflet.extras)
#> Loading required package: leaflet.extras
map <- leaflet() %>%
  addTiles() %>%
  enableTileCaching() %>%
  addTiles(
    "http://127.0.0.1:400{s}/map/t/{z}/{x}/{y}?level=900&time=2000-04-27 01:00:00&alpha=0.5",
    options = tileOptions(useCache = TRUE, crossOrigin = TRUE, subdomains = '12')
  ) %>%
  setView(zoom = 3, lat = 30, lng = 30)

This map looks as follows:

map

Using lapply we can retrieve the output of both servers and then close them.

lapply(rp_list, function(x)x$read_output())
#> [[1]]
#> [1] "t, \n"
#> 
#> [[2]]
#> [1] "t, \n"
lapply(rp_list, function(x)x$finalize())
#> [[1]]
#> NULL
#> 
#> [[2]]
#> NULL

Using docker

An alternative approach is to use docker (or similar functionality). This allows you to scale much further and is probably more suitable for large-scale permanent deployments.

Building a docker image

The first step is to build a docker image that runs the tile server. A simple example of a possible Dockerfile could look as follows.

FROM rocker/geospatial 
MAINTAINER Bart 
RUN install2.r -n 5 plumber stars; \
    rm -rf /tmp/downloaded_packages
RUN R --quiet -e 'install.packages("starsdata", repos = "http://pebesma.staff.ifgi.de", type = "source")'
RUN R --quiet   -e "remotes::install_gitlab('bartk/starsTileServer')"
EXPOSE 3436
COPY script.R script.R
RUN R --quiet   -e "source('script.R')"
ENTRYPOINT ["R", "--quiet", "-e", "server<-readRDS('server.rds') ;server$run( port=3436, host='0.0.0.0', swagger=T)"]
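
Before moving on to docker-compose, the image can be built and tested on its own. A minimal sketch, assuming the Dockerfile and script.R are in the current directory; the image tag starstileserver is an arbitrary choice:

```shell
# build the image from the Dockerfile in the current directory
docker build -t starstileserver .
# run a single container, forwarding the exposed port to the host
docker run --rm -p 3436:3436 starstileserver
```

The server is then reachable on http://127.0.0.1:3436.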

The following R script is used (script.R):

require(stars)
require(starsTileServer)
s5p <- system.file(
  "sentinel5p/S5P_NRTI_L2__NO2____20180717T120113_20180717T120613_03932_01_010002_20180717T125231.nc",
  package = "starsdata"
)
nit <- read_stars(
  s5p,
  along = NA,
  sub = c(
    "//PRODUCT/SUPPORT_DATA/DETAILED_RESULTS/nitrogendioxide_total_column",
    "//PRODUCT/SUPPORT_DATA/DETAILED_RESULTS/nitrogendioxide_total_column_precision",
    "//PRODUCT/SUPPORT_DATA/DETAILED_RESULTS/nitrogendioxide_total_column_precision_kernel",
    "//PRODUCT/SUPPORT_DATA/DETAILED_RESULTS/number_of_iterations",
    "//PRODUCT/SUPPORT_DATA/DETAILED_RESULTS/number_of_spectral_points_in_retrieval",
    "//PRODUCT/SUPPORT_DATA/DETAILED_RESULTS/oxygen_oxygen_dimer_slant_column_density",
    "//PRODUCT/SUPPORT_DATA/DETAILED_RESULTS/oxygen_oxygen_dimer_slant_column_density_precision",
    "//PRODUCT/SUPPORT_DATA/DETAILED_RESULTS/ozone_slant_column_density",
    "//PRODUCT/SUPPORT_DATA/DETAILED_RESULTS/ozone_slant_column_density_precision",
    "//PRODUCT/SUPPORT_DATA/DETAILED_RESULTS/processing_quality_flags",
    "//PRODUCT/SUPPORT_DATA/DETAILED_RESULTS/ring_coefficient",
    "//PRODUCT/SUPPORT_DATA/DETAILED_RESULTS/ring_coefficient_precision"
  ),
  curvilinear = c("//PRODUCT/longitude", "//PRODUCT/latitude"),
  driver = NULL
)
names(nit) <-
  sub("//PRODUCT/SUPPORT_DATA/DETAILED_RESULTS/", "", names(nit))
for (i in seq(length(names(nit)))) {
  nit[[i]][nit[[i]] > 9e+36] <- NA
}
st_crs(nit) <- 4326

server <- starsTileServer$new(nit)
# we save the server object so that all instances share one version; the
# sampling used when constructing the color scales would otherwise result in
# differently colored tiles between instances
saveRDS(server, "server.rds")

Copies of these files can be found with the following commands:

system.file("compose/Dockerfile", package = "starsTileServer")
#> [1] "/tmp/RtmpyORup4/Rinst2485e61d4de9f/starsTileServer/compose/Dockerfile"
system.file("compose/script.R", package = "starsTileServer")
#> [1] "/tmp/RtmpyORup4/Rinst2485e61d4de9f/starsTileServer/compose/script.R"
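
These files can, for example, be copied into an empty build directory from a shell. A sketch, assuming the package is installed and Rscript is on the PATH:

```shell
# locate the example files inside the installed package and copy them here
cp "$(Rscript -e 'cat(system.file("compose/Dockerfile", package = "starsTileServer"))')" Dockerfile
cp "$(Rscript -e 'cat(system.file("compose/script.R", package = "starsTileServer"))')" script.R
```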

Running multiple instances

With the following docker-compose.yml file we can then start the applications:

version: "2.2"
services:
  tileserver:
    build:
      dockerfile: Dockerfile
      context: .
    scale: 4
    restart: always
  lb:
    container_name: haproxy_tile_loadbalancing
    image: 'dockercloud/haproxy:latest'
    environment:
      - TIMEOUT=connect 4000, client 153000, server 230000
    links:
      - tileserver
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
  varnish:
    image: wodby/varnish
    container_name: varnish_tile_caching
    ports:
      - "80:80"
      - "6081:6081"
      - "8080:8080"
    depends_on:
      - lb
    environment:
      VARNISH_IMPORT_MODULES: cookie,header
      VARNISH_CONFIG_PRESET: drupal
      VARNISH_BACKEND_HOST: lb
      VARNISH_BACKEND_PORT: 80

In this case 4 parallel instances are started. We use haproxy to distribute the load across the containers and varnish to cache tiles that have been rendered before, so that no tile is rendered twice.

With the docker-compose build command the required docker images can be built. Using docker-compose up the cluster can then be started. Now, from a normal R session, we can plot a leaflet map as was done before.

require(leaflet)
leaflet() %>%
  addTiles() %>%
  fitBounds(0, 30, 20, 40) %>%
  addTiles(urlTemplate = "http://127.0.0.1:6081/map/nitrogendioxide_total_column/{z}/{x}/{y}?alpha=.4")
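
To check that varnish is actually serving cached tiles, the same tile can be requested twice from a shell while the stack is running; with varnish's default behaviour a positive Age header on a repeated response indicates a cache hit. The tile coordinates below are an arbitrary example:

```shell
URL="http://127.0.0.1:6081/map/nitrogendioxide_total_column/3/4/3?alpha=.4"
# the first request renders the tile; the second should be answered by varnish
curl -s -D - -o /dev/null "$URL" | grep -i '^age:'
curl -s -D - -o /dev/null "$URL" | grep -i '^age:'
```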