In the Introduction to colorist and Tuning visualizations in colorist vignettes, we provided an overview of
colorist functionality and explored how users might adjust visualizations to address specific narrative goals. Here, we focus on several issues that users may confront when making finished visualizations:
coloristmaps using supplementary spatial data
coloristmaps in other R spatial packages and external GIS software
We start by loading the necessary libraries.
library(colorist) library(ggplot2) library(raster) library(rnaturalearth) library(sf) library(tigris)
It’s easy to imagine users wanting to add context to visualizations generated by
colorist, to help explain the causes of particular spatiotemporal distributions. Here, we focus on showing how Natural Earth and US Census TIGER/Line data can be accessed from within R and incorporated into
Natural Earth is a public domain data set that contains vector and raster data developed for display at several different map scales. Files describing physical and cultural geographies can be accessed through R using functions in the
rnaturalearth package. Here, we download Natural Earth vector data and add it to a map describing the spatiotemporal distribution of Field Sparrow, a small bird that occurs in the eastern United States. We begin by calculating distribution metrics for Field Sparrow with
metrics_distill(), choosing a palette with
palette_timecycle(), and creating a map with
map_single(). Note that we choose to save the
ggplot2 map as an object.
# calculate metrics, choose a palette, make a map <- metrics_distill(fiespa_occ) m_fiespa <- palette_timecycle(fiespa_occ) p_fiespa <- map_single(m_fiespa, p_fiespa)map_fiespa
Then, we use the
ne_download() function from
rnaturalearth to download country, river, and lake spatial data as
sf (simple features) objects, which are easy to incorporate into our existing map. In the process, we project and crop each data set to the same projection and extent as our Field Sparrow data. To avoid annoying topology and projection issues, we create our own ocean polygon by calculating the difference between a polygon describing the extent of the Field Sparrow map and the polygon describing countries.
# download, transform, and crop spatial data <- ne_download(category = "cultural", type = "countries", countries returnclass = "sf", scale = 50) %>% st_transform(crs = st_crs(fiespa_occ)) %>% st_crop(st_bbox(fiespa_occ)) <- ne_download(category = "physical", type = "rivers_lake_centerlines", rivers returnclass = "sf", scale = 50) %>% st_transform(crs = st_crs(fiespa_occ)) %>% st_crop(st_bbox(fiespa_occ)) <- ne_download(category = "physical", type = "lakes", returnclass = "sf", lakes scale = 50) %>% st_transform(crs = st_crs(fiespa_occ)) %>% st_crop(st_bbox(fiespa_occ)) # create polygon describing ocean <- st_as_sfc(st_bbox(fiespa_occ)) %>% ocean st_difference(st_union(countries))
We add the supplementary spatial data to the map by appending
geom_sf() elements to the underlying map and symbolizing each element according to our specifications. We also add a bounding rectangle to our map.
# add supplementary spatial data to map <- map_fiespa + map_fiespa_ne geom_sf(data = ocean, fill = "gray95", color = "black", size = 0.25) + geom_sf(data = rivers, size = 0.25, color = "gray35") + geom_sf(data = lakes, fill = "gray95", size = 0.25, color = "gray35") + geom_sf(data = st_as_sfc(st_bbox(fiespa_occ)), fill = NA, color = "black", size = 0.25) # show the map print(map_fiespa_ne)
Now, we have a clearer perspective on the spatiotemporal distribution of Field Sparrow in relation to a variety of easily recognizable geographic features in North America.
Next, we integrate spatial data from the
tigris package into a map describing Fisher use of a territory in upstate New York over the course of nine nights. The
tigris package enables download of TIGER/Line data generated by the United States Census Bureau. Files detail a wide array of geographic features, area delineations, and place names throughout the United States.
In contrast to the previous example, we start by using the
map_multiples() functions to visualize individual raster layers from the Fisher data set. We adjust
map_multiples() to emphasize areas that were used only briefly by the Fisher.
# calculate metrics, choose a palette, make a series of maps <- metrics_pull(fisher_ud) m_fisher <- palette_timeline(9, start_hue = -40) p_fisher <- map_multiples(m_fisher, p_fisher, lambda_i = -5, map_fisher labels = paste("April", 7:15))
To download spatial data using
tigris we need to know that the Fisher study area is located in Rensselaer County, New York. Then, we can access linear water features, areal water features, and roads for the county using
tigris functions, projecting and cropping our supplementary spatial data along the way.
tigris functions provide data formatted as
# download, transform, and crop spatial data <- linear_water("NY", "Rensselaer") %>% streams st_transform(crs = st_crs(fisher_ud)) %>% st_crop(st_bbox(fisher_ud)) <- area_water("NY", "Rensselaer") %>% ponds st_transform(crs = st_crs(fisher_ud)) %>% st_crop(st_bbox(fisher_ud)) <- roads("NY", "Rensselaer") %>% roads st_transform(crs = st_crs(fisher_ud)) %>% st_crop(st_bbox(fisher_ud))
As in the previous example, we add our spatial data to the underlying
colorist map using
geom_sf() elements and we add an unobtrusive bounding box to more clearly define the facets of our plot.
# add supplementary spatial data to the series of maps + map_fisher geom_sf(data = streams, linetype = 6, color = "lightblue4", size = 0.25) + geom_sf(data = ponds, linetype = 6, color = "lightblue4", fill = "lightblue", size = 0.25) + geom_sf(data = roads, size = 0.25, color = alpha("black", 0.5)) + geom_sf(data = st_as_sfc(st_bbox(fisher_ud)), fill = NA, color = "gray", size = 0.25)
Now we can clearly see how the Fisher moved throughout its territory across nights and in relation to the locations of roads, streams, and ponds.
We recommend trying to keep a “light touch” with edits to
colorist maps to ensure that information about spatiotemporal distributions remains the focus. In addition, it’s important to keep in mind that
colorist visualizations rely on differences in cell opacity to communicate information about differences in intensity; changing the plot background color from white will make the map difficult to interpret.
The legend functions in
ggplot2 plot objects, some of which are fairly complicated. Here, we focus on modifying legends to aid interpretation and combining legends and maps within a
When we make a legend for the Field Sparrow annual cycle map described above, the default settings do not communicate the meanings of intensity, specificity, or layer very clearly.
# create a legend using default settings legend_timecycle(p_fiespa)
Remember, we have distilled distribution metrics from a stack of monthly occurrence probabilities, so intensity values represent maximum occurrence throughout the year, specificity values indicate the degree of seasonality in occurrences, and layer values reveal the month of maximum occurrence. We can edit the legend to clarify these interpretations by making several changes in the
# change labels on legend <- legend_timecycle(p_fiespa, l_fiespa origin_label = "Jan 1", # specificity labels label_s = c("Low\nseasonality", "Moderate\nseasonality", "High\nseasonality"), # intensity label label_i = "Peak occurrence", # layers label label_l = "Month of peak occurrence") # show legend print(l_fiespa)
Now, we need to decide where to position the legend in relation to the map describing the spatiotemporal distribution of Field Sparrow. After playing with several ideas, we place the legend beneath the main map. This requires expanding the spatial extent of the plot using the
coord_sf() function and then inserting the legend into the “empty landscape” using
annotation_custom(). Note that the values in
annotation_custom() are in projected map coordinates.
# position legend below map + map_fiespa_ne coord_sf(ylim = c(-3153281, 1405830)) + annotation_custom(ggplotGrob(l_fiespa), xmin = -1482551, xmax = 1850606, ymin = -3153281, ymax = -1453281)
The strategy for editing details of
legend_timeline() plots is similar to that for
legend_timecycle() plots, despite differences in their layouts. Before changing the legend, however, we should use
map_single() to describe Fisher use of its territory over nine nights in a single map. Because we’re making one larger map, we’re also going to add another layer of spatial data describing building footprints in the area. These features come from the OpenStreeMap Microsoft Building Footprint dataset, a cropped and projected copy of which is available on the GitHub repository for this package.
# calculate distribution metrics <- metrics_distill(fisher_ud) m_fisher_distill # download building footprints <- file.path(tempdir(), "buildings.rds") f_buildings download.file(paste0("https://github.com/mstrimas/colorist/raw/master/", "data-raw/buildings.rds"), f_buildings)<- readRDS(f_buildings) buildings unlink(f_buildings) # make a map <- map_single(m_fisher_distill, p_fisher, map_fisher_distill lambda_i = -5, lambda_s = 10) + geom_sf(data = streams, linetype = 6, color = "lightblue4", size = 0.25) + geom_sf(data = ponds, linetype = 6, color = "lightblue4", fill = "lightblue", size = 0.25) + geom_sf(data = roads, size = 0.25, color = alpha("black", 0.5)) + geom_sf(data = buildings, size = 0.25, color = alpha("black", 0.5)) + geom_sf(data = st_as_sfc(st_bbox(fisher_ud)), fill = NA, color = "black", size = 0.25) # show the map print(map_fisher_distill)
lambda_i = -5 makes rarely used parts of the landscape more conspicuous and setting
lambda_s = 10 emphasizes differences in how consistently areas were used across nights. Spatial data for Rensselaer County are added to the map using
geom_sf() elements, just as in the previous section.
Now, let us generate a default legend for our Fisher data using
# create a legend using default settings legend_timeline(p_fisher)
To make the legend more helpful as a guide to interpreting the spatiotemporal distribution of the Fisher, we need to edit intensity, specificity, and layer labels. So, we modify
legend_timeline() to reflect their meanings with respect to the Fisher data. We also prescribe how the ends of the timeline are labeled using the
# change labels of legend <- legend_timeline(p_fisher, l_fisher time_labels = c("Apr 7", "Apr 15"), # intensity label label_i = "Peak use", # layer label label_l = "Night of peak use", # specificity labels label_s = c("Consistent use", "Occasional use", "Ephemeral use")) # show legend print(l_fisher)
To place our legend next to the map, we expand the extent of our plot by modifying
coord_sf() and we specify its position using the
# position legend to the left of the map + map_fisher_distill coord_sf(xlim = c(-4300, 2150)) + annotation_custom(ggplotGrob(l_fisher), xmin = -4400, xmax = -2500, ymin = 5100266, ymax = 5104666)
Here, we return to data describing the utilization distributions of two African Elephants in Etosha National Park in 2011. The two layers of the RasterStack form an unordered set of distributions, one for each individual. To visualize how the elephants partitioned space, we begin by calculating distribution metrics using
metrics_distill(), creating a palette with
palette_set(), and generating a map with
map_single(). We add visual emphasis to rarely used areas via the
# calculate metrics, choose a palette, make a series of maps <- metrics_distill(elephant_ud) m_elephants <- palette_set(elephant_ud) p_elephants <- map_single(m_elephants, p_elephants, lambda_i = -5)map_elephants
This should all seem familiar.
Next, we build a legend using our palette and the default settings in
# create a legend using default settings legend_set(p_elephants)
The legend is meant to tell us how the two individuals partitioned space in Etosha National Park over the course of a year, allowing viewers to understand locations that were used predominantly by one individual, or the other, and locations that were used equally by both individuals.
To help communicate this information more clearly, we should make some adjustments to the legend. We can label the two facets with individual identifiers using the
group_labels argument in
legend_set() and we can modify
axis_s values to orient viewers.
# change labels of legend <- legend_set(p_elephants, l_elephants group_labels = names(elephant_ud), # intensity label label_i = "Intensity of use", # specificity label label_s = "Share of use", # specificty axis axis_s = c("equal", "sole")) # show legend print(l_elephants)
Now, we have a legend that further reveals the narrative of space use by LA11 and LA14 in 2011 and we can position it within the map using
# position legend within map + map_elephants annotation_custom(ggplotGrob(l_elephants), xmin = -3000, xmax = 37000, ymin = -2347500, ymax = -2305000)
coloristmaps in other software
ggplot2 to construct visualizations because it is an incredibly flexible visualization tool that enables maps and legends to be amended and adjusted in many different ways. However, some users may want to employ other tools for visualizing
colorist outputs, because of their greater familiarity with those tools or because of specific features that are not available in
Many of the R packages that enable plotting of spatial data (and most external GIS software) rely on RGBa descriptions of color and opacity instead of the hexadecimal-alpha descriptions used in
colorist functions. To accommodate display of
colorist maps in these other tools, the
map_single() function can be used to return a
RasterStack object containing layers of red, green, blue, and alpha information. Users can then visualize those data in a variety of other tools.
raster package is a workhorse for handling and analyzing gridded spatial data in R. Here, we return to our Field Sparrow data, but instead of accepting the default
colorist map output, we specify
return_type = "stack" in the
map_single() function. This change directs
map_single() to return an RGBa stack instead of a
ggplot2 plot object.
# ask for map as a raster stack <- map_single(m_fiespa, p_fiespa, return_type = "stack") s_fiesp # show stack information s_fiesp
RasterStack describing our map in RGBa values, we can now opt to visualize the data using plot functions available in the
raster package. The standard
raster::plot() function illustrates each layer of the RasterStack in a standard color palette. In addition to the R, G, B, and alpha layers, note the last layer (i.e.,
n_layers) which describes, in every cell, the number of layers in the original
fiespa_occ data set containing non-NA occurrence values.
# plot layers using raster plot function plot(s_fiesp)
To reconstruct our
colorist map with the appropriate color scheme, we visualize the stack using
raster::plotRGB(), which requires specification of R, G, and B layers. It also requires specification of alpha values as a vector. If all has gone well, the resulting plot should look the same as the
ggplot2 plot generated within
# plot r, g, b, and alpha layers to reconstruct colorist map plotRGB(s_fiesp, 1, 2, 3, alpha = as.vector(s_fiesp[["alpha"]]))
You can then add supplementary spatial data to your raster using other plot functions within
raster and related packages.
There are situations in which you may want to use external GIS software instead of R to create a refined cartographic visualization of
colorist output. Here, we show how to recreate
colorist maps in QGIS, a powerful and free GIS.
Here, we reuse metrics and palette information that we have already generated for two African Elephants to create a map. Instead of accepting the default settings in the
map_single() function, however, we specify
return_type = "stack" and save the output as a
# ask for map as a RasterStack object <- map_single(m_elephants, p_elephants, lambda_i = -5, s_elephants return_type = "stack") # save RasterStack to R working directory writeRaster(s_elephants, "elephants_rgba.grd")
Then, we drag the resulting file into an empty QGIS workspace to begin the visualization process.
The default symbology assigned to the multiband raster by QGIS makes it appear as if our map were created using a box of pastels. To correct the appearance of the map colors, double-click on the elephant_rgba layer to open Layer Properties and ensure that data are being rendered in multiband color, RGB bands are in the correct order, and min and max values for all three bands are 0 and 255, respectively. In addition, be sure that band 4 (i.e., alpha) is specified as the transparency band in the Transparency tab of Layer Properties.
After clicking Apply, the thumbnail at the base of the layer properties interface should correctly reproduce the map we were able to generate in R. Click OK. Avoid making further changes to the symbology of elephant_rgba, otherwise map colors will not match those generated in the
colorist palette and the map will become difficult to interpret. Now, users can add and symbolize spatial data using sophisticated cartographic tools available within QGIS.