Contrasts

In a previous vignette, we introduced the “marginal effect” as a partial derivative. Since derivatives are only properly defined for continuous variables, we cannot use them to interpret the effects of changes in categorical variables. For this, we turn to contrasts between adjusted predictions. In the context of this package, a “contrast” is defined as:

The difference between two adjusted predictions, calculated for meaningfully different regressor values (e.g., College graduates vs. Others).

Simple contrasts

Consider a simple model with a logical and a factor variable:

library(marginaleffects)

tmp <- mtcars
tmp$am <- as.logical(tmp$am)
mod <- lm(mpg ~ am + factor(cyl), tmp)

The marginaleffects function automatically computes contrasts for each level of the categorical variables, relative to the baseline category (FALSE for logicals, and the reference level for factors), while holding all other values at their mode or mean:

mfx <- marginaleffects(mod)
summary(mfx)
#> Average marginal effects
#>   Term     Contrast  Effect Std. Error z value   Pr(>|z|)     2.5 % 97.5 %
#> 1   am TRUE - FALSE   2.560      1.298   1.973    0.04851   0.01675  5.103
#> 2  cyl        6 - 4  -6.156      1.536  -4.009 6.1077e-05  -9.16608 -3.146
#> 3  cyl        8 - 4 -10.068      1.452  -6.933 4.1147e-12 -12.91359 -7.222
#>
#> Model type:  lm
#> Prediction type:  response

The summary printed above says that moving from the reference category 4 to the level 6 on the cyl factor variable is associated with a change of -6.156 in the adjusted prediction. Similarly, the contrast from FALSE to TRUE on the am variable is equal to 2.560.
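To make the definition concrete, the two contrasts discussed above can be reproduced by hand as differences between adjusted predictions. This is only a minimal sketch in base R, not how marginaleffects computes them internally; the object names `nd`, `pred`, `contrast_cyl_6_4`, and `contrast_am` are hypothetical and chosen for illustration.

```r
# Refit the additive model so that the example is self-contained
tmp <- mtcars
tmp$am <- as.logical(tmp$am)
mod <- lm(mpg ~ am + factor(cyl), data = tmp)

# Three hypothetical profiles:
#   row 1: am = FALSE, cyl = 4  (baseline)
#   row 2: am = FALSE, cyl = 6  (differs from row 1 only in cyl)
#   row 3: am = TRUE,  cyl = 4  (differs from row 1 only in am)
nd <- data.frame(am = c(FALSE, FALSE, TRUE), cyl = c(4, 6, 4))
pred <- unname(predict(mod, newdata = nd))

# A contrast is the difference between two adjusted predictions
contrast_cyl_6_4 <- pred[2] - pred[1]
contrast_am <- pred[3] - pred[1]

contrast_cyl_6_4
contrast_am
```

Because this model is linear and has no interaction, each difference is identical to the corresponding regression coefficient, and it does not matter at which values the other regressor is held.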

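We can obtain the same estimates and standard errors using the emmeans package. The p-values differ because emmeans reports t tests with 28 degrees of freedom and, for cyl, applies a Tukey adjustment, whereas the summary above reports z tests: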

library(emmeans)
emm <- emmeans(mod, specs = "cyl")
contrast(emm, method = "revpairwise")
#>  contrast estimate   SE df t.ratio p.value
#>  6 - 4       -6.16 1.54 28  -4.009  0.0012
#>  8 - 4      -10.07 1.45 28  -6.933  <.0001
#>  8 - 6       -3.91 1.47 28  -2.660  0.0331
#>
#> Results are averaged over the levels of: am
#> P value adjustment: tukey method for comparing a family of 3 estimates

emm <- emmeans(mod, specs = "am")
contrast(emm, method = "revpairwise")
#>  contrast     estimate  SE df t.ratio p.value
#>  TRUE - FALSE     2.56 1.3 28   1.973  0.0585
#>
#> Results are averaged over the levels of: cyl
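The estimate and standard error of the am contrast agree across the two packages, but the reported p-values do not (0.04851 versus 0.0585). A plausible reading of the column headers is that one output uses a normal (z) reference distribution and the other a t distribution with 28 degrees of freedom. The base R sketch below checks that reading; the names `est`, `se`, `stat`, `p_z`, and `p_t` are hypothetical.

```r
# Refit the additive model so that the example is self-contained
tmp <- mtcars
tmp$am <- as.logical(tmp$am)
mod <- lm(mpg ~ am + factor(cyl), data = tmp)

# In this additive linear model, the am contrast is the amTRUE coefficient
est <- coef(summary(mod))["amTRUE", "Estimate"]
se <- coef(summary(mod))["amTRUE", "Std. Error"]
stat <- est / se

# Two-sided p-value under a standard normal reference distribution
p_z <- 2 * pnorm(-abs(stat))

# Two-sided p-value under a t distribution with residual degrees of freedom
p_t <- 2 * pt(-abs(stat), df = df.residual(mod))

c(z = p_z, t = p_t)
```

The same test statistic therefore crosses the conventional 0.05 threshold under one reference distribution and not under the other, which is worth keeping in mind with only 32 observations.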

Contrasts with interactions

In models with multiplicative interactions, the contrasts of a categorical variable will depend on the values of the interacted variable:

mod_int <- lm(mpg ~ am * factor(cyl), tmp)

We can now use the newdata argument of the marginaleffects function to compute contrasts for different values of the other regressors. As in the marginal effects vignette, the datagrid function can be handy. Since we only care about the contrast for the logical variable am, we use the variables argument to indicate the subset of results to report:

marginaleffects(mod_int, newdata = datagrid(cyl = tmp$cyl), variables = "am")
#>   rowid     type term     contrast     dydx std.error    am cyl
#> 1     1 response   am TRUE - FALSE 1.441667  2.315925 FALSE   6
#> 2     2 response   am TRUE - FALSE 5.175000  2.052848 FALSE   4
#> 3     3 response   am TRUE - FALSE 0.350000  2.315925 FALSE   8

Once again, we obtain the same results with emmeans:

emm <- emmeans(mod_int, specs = "am", by = "cyl")
contrast(emm, method = "revpairwise")
#> cyl = 4:
#>  contrast     estimate   SE df t.ratio p.value
#>  TRUE - FALSE     5.17 2.05 26   2.521  0.0182
#>
#> cyl = 6:
#>  contrast     estimate   SE df t.ratio p.value
#>  TRUE - FALSE     1.44 2.32 26   0.623  0.5390
#>
#> cyl = 8:
#>  contrast     estimate   SE df t.ratio p.value
#>  TRUE - FALSE     0.35 2.32 26   0.151  0.8810
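The cyl-specific contrasts above can also be reproduced by hand, which shows why they vary with cyl. The following is a minimal base R sketch, not the package's internal method: it predicts over a grid of am and cyl values and takes the TRUE minus FALSE difference within each cyl level. The names `grid` and `manual` are hypothetical.

```r
# Refit the interaction model so that the example is self-contained
tmp <- mtcars
tmp$am <- as.logical(tmp$am)
mod_int <- lm(mpg ~ am * factor(cyl), data = tmp)

# All combinations of am and cyl; expand.grid varies am fastest,
# so rows come in (FALSE, TRUE) pairs for cyl = 4, 6, 8
grid <- expand.grid(am = c(FALSE, TRUE), cyl = c(4, 6, 8))
grid$pred <- unname(predict(mod_int, newdata = grid))

# Contrast for am within each level of cyl
manual <- data.frame(
  cyl = grid$cyl[grid$am],
  contrast = grid$pred[grid$am] - grid$pred[!grid$am]
)
manual
```

Since this model includes every am by cyl combination, each adjusted prediction is simply the mean of mpg in the corresponding cell, and each contrast is a difference between two cell means.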

Complex queries

The marginaleffects package includes only limited support for computing contrasts. Users who require more powerful features are encouraged to consider alternative packages such as emmeans, modelbased, or ggeffects. These packages offer useful features such as automatic back-transformations, p-value correction for multiple comparisons, and more.
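As a rough illustration of what a multiple-comparison correction does, the three unadjusted p-values from the cyl-specific am contrasts reported by emmeans above can be adjusted with `p.adjust` from base R. This sketch uses the Holm method as an arbitrary choice; it is not the Tukey adjustment that emmeans applies in its pairwise output, and the names `p_raw` and `p_holm` are hypothetical.

```r
# Unadjusted p-values copied from the emmeans output for the
# TRUE - FALSE contrast at cyl = 4, 6, and 8
p_raw <- c(cyl4 = 0.0182, cyl6 = 0.5390, cyl8 = 0.8810)

# Holm correction for a family of three tests
p_holm <- p.adjust(p_raw, method = "holm")

p_holm
```

After adjustment, the cyl = 4 contrast no longer falls below the conventional 0.05 threshold, which is the kind of shift that a dedicated package reports automatically.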