How to work with WHONET data

Matthijs S. Berends

23 June 2019

Import of data

This tutorial assumes you have already imported the WHONET data, for example with the readxl package. In RStudio, this can be done using the menu button ‘Import Dataset’ in the tab ‘Environment’. Choose the option ‘From Excel’ and select your exported file. Make sure date fields are imported as dates and not as text or numbers.

An example syntax could look like this:

library(readxl)
data <- read_excel(path = "path/to/your/file.xlsx")
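
Whether the date fields came through correctly can be checked right after the import. The following is a minimal sketch in base R. The data frame `imported` and its column names are hypothetical stand-ins for your own file, and the text format `%Y-%m-%d` is an assumption that may not match your export.

```r
# Hypothetical stand-in for the result of read_excel();
# the column names and values are assumptions for illustration only
imported <- data.frame(
  `Identification number` = c("p001", "p002"),
  `Specimen date` = c("2019-01-15", "2019-02-03"),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# TRUE for every column that already holds dates or date-times
is_date <- vapply(
  imported,
  function(x) inherits(x, c("Date", "POSIXct")),
  logical(1)
)
print(is_date)

# A date column that arrived as text can be converted explicitly
imported$`Specimen date` <- as.Date(imported$`Specimen date`, format = "%Y-%m-%d")
class(imported$`Specimen date`)
```

If a date column shows up as `FALSE` in `is_date`, it was read as text or numbers and needs a conversion like the one in the last step.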

The AMR package comes with an example data set called WHONET. We will use it for this analysis.

Preparation

First, load the relevant packages if you have not done so yet. I use the tidyverse for all of my analyses. All of them. If you don’t know it yet, I suggest you read about it on its website: https://www.tidyverse.org/.

library(dplyr)   # part of tidyverse
library(ggplot2) # part of tidyverse
library(AMR)     # this package

We will have to transform some variables to simplify and automate the analysis:

# transform variables
data <- WHONET %>%
  # get microbial ID based on given organism
  mutate(mo = as.mo(Organism)) %>% 
  # transform everything from "AMP_ND10" to "CIP_EE" to the new `rsi` class
  mutate_at(vars(AMP_ND10:CIP_EE), as.rsi)

No errors or warnings, so all values were transformed successfully. Let’s check it anyway, with a couple of frequency tables:

# our newly created `mo` variable
data %>% freq(mo, nmax = 10)

Frequency table of mo from data (500 x 54)

Class: mo (character)
Length: 500 (of which NA: 0 = 0.00%)
Unique: 39

Families: 10
Genera: 17
Species: 38

    Item          Count  Percent  Cum. Count  Cum. Percent
 1  B_ESCHR_COL     245    49.0%         245         49.0%
 2  B_STPHY_CNS      74    14.8%         319         63.8%
 3  B_STPHY_EPI      38     7.6%         357         71.4%
 4  B_STRPT_PNE      31     6.2%         388         77.6%
 5  B_STPHY_HOM      21     4.2%         409         81.8%
 6  B_PROTS_MIR       9     1.8%         418         83.6%
 7  B_ENTRC_IUM       8     1.6%         426         85.2%
 8  B_STPHY_CAP       8     1.6%         434         86.8%
 9  B_ENTRB_CLO       5     1.0%         439         87.8%
10  B_ENTRC_COL       4     0.8%         443         88.6%

(omitted 29 entries, n = 57 [11.4%])
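
The cumulative columns and the ‘omitted’ footer follow directly from the counts. As a quick check in base R, using only the numbers printed in the frequency table above (the variable names are made up for this sketch):

```r
# Counts of the ten most frequent microorganisms, copied from the table above
top10 <- c(
  B_ESCHR_COL = 245, B_STPHY_CNS = 74, B_STPHY_EPI = 38,
  B_STRPT_PNE = 31,  B_STPHY_HOM = 21, B_PROTS_MIR = 9,
  B_ENTRC_IUM = 8,   B_STPHY_CAP = 8,  B_ENTRB_CLO = 5,
  B_ENTRC_COL = 4
)
n_total <- 500  # length of the mo column, none of which are NA

# Running totals, as counts and as percentages of all isolates
cum_count   <- cumsum(top10)
cum_percent <- round(100 * cum_count / n_total, 1)

# Isolates hidden by nmax = 10
n_omitted   <- n_total - sum(top10)
pct_omitted <- round(100 * n_omitted / n_total, 1)

print(data.frame(count = top10, cum_count, cum_percent))
```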


# our transformed antibiotic columns
# amoxicillin/clavulanic acid (J01CR02) as an example
data %>% freq(AMC_ND2)

Frequency table of AMC_ND2 from data (500 x 54)

Class: factor > ordered > rsi (numeric)
Length: 500 (of which NA: 19 = 3.80%)
Levels: 3: S < I < R
Unique: 3

Drug: Amoxicillin/clavulanic acid (AMC, J01CR02)
Group: Beta-lactams/penicillins
%SI: 78.59%

   Item  Count  Percent  Cum. Count  Cum. Percent
1  S       356    74.0%         356         74.0%
2  R       103    21.4%         459         95.4%
3  I        22     4.6%         481        100.0%
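
The %SI value in the header is the share of isolates tested as S or I among all isolates that have a result. The 19 missing values are left out of the denominator, which is why the percentages are based on 481 and not 500. The base R sketch below rebuilds a column with the same counts and recomputes these figures. It only mimics the printed output and is not how the AMR package implements its `rsi` class.

```r
# Rebuild a column with the same counts as AMC_ND2:
# 356 S, 22 I, 103 R and 19 missing values
amc <- factor(
  c(rep("S", 356), rep("I", 22), rep("R", 103), rep(NA, 19)),
  levels = c("S", "I", "R"),
  ordered = TRUE
)

# Number of isolates with a test result
tested <- sum(!is.na(amc))

# Percentage per level, in level order S, I, R
percent <- round(100 * table(amc) / tested, 1)

# Share of S and I among tested isolates, and share of missing values overall
pct_si <- round(100 * sum(amc %in% c("S", "I")) / tested, 2)
pct_na <- round(100 * mean(is.na(amc)), 2)

print(percent)
print(pct_si)
```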

Analysis

(more will be available soon)