The ggip package provides data visualization of IP addresses and networks. It achieves this by mapping one-dimensional IP data onto a two-dimensional grid.
This introductory vignette gives a quickstart guide on how to use ggip functions to generate plots.
library(ggplot2) library(dplyr) library(ipaddress) library(ggip)
The central component of any ggip plot is the coordinate system, as
coord_ip(). It determines exactly how IP data
is mapped to the 2D grid, and ensures this mapping is applied
consistently across all plotted layers.
coord_ip() function must be called once per plot,
and takes arguments:
||The region of IP space visualized by the entire 2D grid. By default, the entire IPv4 space is visualized. This argument allows you to zoom into a region of IP space or to visualize IPv6 space.|
||The network prefix length corresponding to a pixel in the 2D grid. Increasing this argument effectively improves the resolution of the plot.|
||The path taken across the 2D grid while mapping IP data.|
More details about this mapping are found in
Behind the scenes, ggip searches the plotted data sets for any
columns that are
vectors. For each matching column, it replaces the vector with a data
frame containing both the original IP data and the mapped Cartesian
coordinates. This means the plotted data set now contains a nested data
As an example, consider a data set featuring two columns. The
label column is a character vector and the
address column is an
#> # A tibble: 3 × 2 #> label address #> <chr> <ip_addr> #> 1 A 0.0.0.0 #> 2 B 192.168.0.1 #> 3 C 255.255.255.255
This data set is transformed such that the
column is now a data frame. It contains an
ip column with
ip_address() vector, and
y columns with the Cartesian coordinates on the 2D
#> # A tibble: 3 × 2 #> label address$ip $x $y #> <chr> <ip_addr> <int> <int> #> 1 A 0.0.0.0 0 255 #> 2 B 192.168.0.1 214 142 #> 3 C 255.255.255.255 255 255
These transformed data frame columns are available when specifying
aesthetics. The nested columns can be accessed using the usual
$ syntax (see examples below).
Layers from ggplot2 and other external packages don’t know about the
internal data transformation used by ggip. For this reason, these layers
expect their positional aesthetics (e.g.
y) to be specified explicitly. Fortunately, we can extract
the Cartesian coordinates from our data frame columns using the
As an example, we plot an
ip_address() vector as points
accompanied by labels. Note that we’ve specified the
label aesthetics at the top level of the
plot, and then the
geom_label() layers have picked them up later.
tibble(address = ip_address(c("0.0.0.0", "188.8.131.52", "192.168.0.1"))) %>% ggplot(aes(x = address$x, y = address$y, label = address$ip)) + geom_point() + geom_label(nudge_x = c(10, 0, -10), nudge_y = -10) + coord_ip(expand = TRUE) + theme_ip_light()
Similarly, we plot
ip_network() vectors using layers
corresponding to rectangles.
%>% iana_ipv4 ggplot(aes(xmin = network$xmin, ymin = network$ymin, xmax = network$xmax, ymax = network$ymax)) + geom_rect(aes(fill = allocation)) + scale_fill_brewer(palette = "Accent", name = NULL) + coord_ip() + theme_ip_dark()
Note: There are small gaps between the rectangles because networks are mapped onto a 2D grid (i.e. discrete), whereas ggplot2 visualizes the continuous 2D plane. This can be resolved by adding/subtracting 0.5 to the positional aesthetics. However, this gap is often helpful to distinguish networks.
Layers from ggip do know about the internal data
transformation, so they take an
ip aesthetic corresponding
to the data frame column. They can then automatically extract the
relevant positional information. This is easier because the name of the
data frame column is also the name of the original
ip_network() column in the
input data set.
As an example, we plot a heatmap of an
tibble(address = sample_ipv4(10000)) %>% ggplot(aes(ip = address)) + stat_summary_address() + scale_fill_viridis_c(guide = "none") + coord_ip() + theme_ip_dark()