How to import data from SPSS / SAS / Stata

Matthijs S. Berends

23 June 2019

SPSS / SAS / Stata

SPSS (Statistical Package for the Social Sciences) is probably the best-known software package for statistical analysis. SPSS is easier to learn than R, because in SPSS you only have to click through menus to run parts of your analysis. Because of its user-friendliness, it is taught at universities and is particularly useful for students who are new to statistics. From my experience, I would guess that pretty much all (bio)medical students know it by the time they graduate. SAS and Stata are comparable statistical packages that are popular in large industries.

Compared to R

As said, SPSS is easier to learn than R. But SPSS, SAS and Stata come with major downsides when compared with R:

If you sometimes write syntax files in SPSS to run a complete analysis or to ‘automate’ some of your work, you should perhaps do this in R. You will notice that writing such scripts in R is a lot more concise and flexible than in SPSS. Still, as with any statistical package, you need to know what you are doing (statistically) and what you want to accomplish.

To demonstrate the first point:

# load the AMR package, which provides the functions used below:
library(AMR)

# not all values are valid MIC values:
as.mic(0.125)
# Class 'mic'
# [1] 0.125
as.mic("testvalue")
# Class 'mic'
# [1] <NA>

# the Gram stain is available for all bacteria:
mo_gramstain("E. coli")
# [1] "Gram-negative"

# Klebsiella is intrinsically resistant to amoxicillin, according to EUCAST:
klebsiella_test <- data.frame(mo = "klebsiella", 
                              amox = "S",
                              stringsAsFactors = FALSE)
klebsiella_test
#           mo amox
# 1 klebsiella    S
eucast_rules(klebsiella_test, info = FALSE)
#           mo amox
# 1 klebsiella    R

# hundreds of trade names can be translated to a name, trade name or an ATC code:
ab_name("floxapen")
# [1] "Flucloxacillin"
ab_tradenames("floxapen")
#  [1] "Floxacillin"          "FLOXACILLIN"          "Floxapen"            
#  [4] "Floxapen sodium salt" "Fluclox"              "Flucloxacilina"      
#  [7] "Flucloxacillin"       "Flucloxacilline"      "Flucloxacillinum"    
# [10] "Fluorochloroxacillin"
ab_atc("floxapen")
# Class 'atc'
# [1] J01CF05

Import data from SPSS/SAS/Stata

RStudio

To work with R, probably the best option is to use RStudio. It is a free and open-source desktop environment that not only lets you run R code, but also supports project management, version control, package management and convenient import menus for working with other data sources. You can also install RStudio Server on a private or corporate server, which gives you the complete RStudio software as a website (at home or at work).

To import a data file, just click Import Dataset in the Environment tab.

If additional packages are needed, RStudio will ask you whether they should be installed first.

In the window that opens, you can define all options (parameters) that should be used for the import, and you’re ready to go.

If you want labelled variables to be imported as factors, so that the data resembles SPSS more closely, use as_factor() from the haven package.

The difference is this:
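As a minimal sketch of that difference, the example below builds a small data set by hand instead of importing a real file. The data frame `patients` and its `sex` column are made up for illustration; `labelled()`, `zap_labels()` and `as_factor()` come from the haven package.

```r
library(haven)

# Hypothetical data: 'sex' is stored as numeric codes with value labels,
# which is how SPSS, SAS and Stata usually store categorical variables.
patients <- data.frame(id = 1:4)
patients$sex <- labelled(c(1, 2, 2, 1), labels = c(Male = 1, Female = 2))

# Without as_factor(): the column keeps its numeric codes,
# and the labels are only attached as metadata
class(patients$sex)
zap_labels(patients$sex)

# With as_factor(): the value labels become the factor levels
patients_f <- as_factor(patients)
class(patients_f$sex)
levels(patients_f$sex)
```

Calling `as_factor()` on a whole data frame only converts the labelled columns, so ordinary numeric columns such as `id` are left untouched.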

Base R

To import data from SPSS, SAS or Stata, you can use the great haven package yourself:
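One way to set this up is sketched below. It assumes you have internet access and a CRAN mirror configured; the check with `requireNamespace()` is only there to avoid reinstalling a package that is already present.

```r
# install the haven package from CRAN, but only if it is not available yet
if (!requireNamespace("haven", quietly = TRUE)) {
  install.packages("haven")
}

# load the package so its import functions can be used directly
library(haven)
```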

You can now import files as follows:
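The sketch below illustrates the import functions that haven provides. To keep it self-contained, it first writes a small made-up data set (`example_data`) to temporary files and then reads those files back in; with real data you would skip the writing step and pass the path to your own file instead.

```r
library(haven)

# A small, made-up data set, written to temporary files so that this
# sketch can run without any existing SPSS, SAS or Stata file
example_data <- data.frame(id = 1:3, age = c(34, 51, 28))

spss_file  <- tempfile(fileext = ".sav")
stata_file <- tempfile(fileext = ".dta")
sas_file   <- tempfile(fileext = ".xpt")

write_sav(example_data, spss_file)   # SPSS
write_dta(example_data, stata_file)  # Stata
write_xpt(example_data, sas_file)    # SAS transport format

# Importing: one function per file type
from_spss  <- read_sav(spss_file)
from_stata <- read_dta(stata_file)
from_sas   <- read_xpt(sas_file)

# Native SAS data files (.sas7bdat) are read with read_sas();
# the path below is a placeholder, not a real file:
# from_sas7bdat <- read_sas("path/to/your_file.sas7bdat")
```

Each of these functions returns a tibble, which can be used like a regular data frame in the rest of your analysis.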