Accumulated local effects (ALE) was developed by Daniel Apley and Jingyu
Zhu as a global explanation approach for interpretable machine
learning (IML). However, the `ale` package aims to extend it for statistical inference, among other
extensions. This vignette presents the initial effort at extending ALE
for statistical inference. In particular, we present some effect size
measures specific to ALE. We introduce these statistics in detail in a
working paper: Okoli, Chitu. 2023. “Statistical Inference Using Machine
Learning and Classical Techniques Based on Accumulated Local Effects
(ALE).” arXiv. https://doi.org/10.48550/arXiv.2310.09877. Please note
that they might be further refined after peer review.

We begin by loading the necessary libraries.

```
library(mgcv) # for datasets and the gam function
#> Loading required package: nlme
#>
#> Attaching package: 'nlme'
#> The following object is masked from 'package:dplyr':
#>
#> collapse
#> This is mgcv 1.9-1. For overview type 'help("mgcv-package")'.
library(dplyr) # for data manipulation
library(ale)
```

We will demonstrate ALE statistics using a dataset composed and
transformed from the `mgcv` package. This package is required
to create the generalized additive model (GAM) that we will use for this
demonstration. (Strictly speaking, the source datasets are in the
`nlme` package, which is loaded automatically when we load
the `mgcv` package.) Here is the code to generate the data
that we will work with:

```
# Create and prepare the data
# Specific seed chosen to illustrate the spuriousness of the random variable
set.seed(6)
math <-
  # Start with math achievement scores per student
  MathAchieve |>
  as_tibble() |>
  mutate(
    school = School |> as.character() |> as.integer(),
    minority = Minority == 'Yes',
    female = Sex == 'Female'
  ) |>
  # summarize the scores to give per-school values
  summarize(
    .by = school,
    minority_ratio = mean(minority),
    female_ratio = mean(female),
    math_avg = mean(MathAch),
  ) |>
  # merge the summarized student data with the school data
  inner_join(
    MathAchSchool |>
      mutate(school = School |> as.character() |> as.integer()),
    by = c('school' = 'school')
  ) |>
  mutate(
    public = Sector == 'Public',
    high_minority = HIMINTY == 1,
  ) |>
  select(-School, -Sector, -HIMINTY) |>
  rename(
    size = Size,
    academic_ratio = PRACAD,
    discrim = DISCLIM,
    mean_ses = MEANSES,
  ) |>
  # Remove ID column for analysis
  select(-school) |>
  select(
    math_avg, size, public, academic_ratio,
    female_ratio, mean_ses, minority_ratio, high_minority, discrim,
    everything()
  ) |>
  mutate(
    rand_norm = rnorm(nrow(MathAchSchool))
  )
glimpse(math)
#> Rows: 160
#> Columns: 10
#> $ math_avg <dbl> 9.715447, 13.510800, 7.635958, 16.255500, 13.177687, 11…
#> $ size <dbl> 842, 1855, 1719, 716, 455, 1430, 2400, 899, 185, 1672, …
#> $ public <lgl> TRUE, TRUE, TRUE, FALSE, FALSE, TRUE, TRUE, FALSE, FALS…
#> $ academic_ratio <dbl> 0.35, 0.27, 0.32, 0.96, 0.95, 0.25, 0.50, 0.96, 1.00, 0…
#> $ female_ratio <dbl> 0.5957447, 0.4400000, 0.6458333, 0.0000000, 1.0000000, …
#> $ mean_ses <dbl> -0.428, 0.128, -0.420, 0.534, 0.351, -0.014, -0.007, 0.…
#> $ minority_ratio <dbl> 0.08510638, 0.12000000, 0.97916667, 0.40000000, 0.72916…
#> $ high_minority <lgl> FALSE, FALSE, TRUE, FALSE, TRUE, FALSE, FALSE, FALSE, F…
#> $ discrim <dbl> 1.597, 0.174, -0.137, -0.622, -1.694, 1.535, 2.016, -0.…
#> $ rand_norm <dbl> 0.21951995, 0.11687621, -0.51209711, 1.00068833, -2.580…
```

The structure has 160 rows, each of which refers to a school whose
students have taken a mathematics achievement test. We describe the data
here based on documentation from the `nlme` package, but many
details are not quite clear:

variable | format | description |
---|---|---|
math_avg | double | average mathematics achievement scores of all students in the school |
size | double | the number of students in the school |
public | logical | TRUE if the school is in the public sector; FALSE if in the Catholic sector |
academic_ratio | double | the percentage of students on the academic track |
female_ratio | double | percentage of students in the school that are female |
mean_ses | double | mean socioeconomic status for the students in the school (measurement is not quite clear) |
minority_ratio | double | percentage of students that are members of a minority racial group |
high_minority | logical | TRUE if the school has a high ratio of students of minority racial groups (unclear, but perhaps relative to the location of the school) |
discrim | double | the “discrimination climate” (perhaps an indication of extent of racial discrimination in the school?) |
rand_norm | double | a completely random variable |

Of particular note is the variable `rand_norm`. We have
added this completely random variable (with a normal distribution) to
demonstrate what randomness looks like in our analysis. (However, we
selected the specific random seed of 6 because it highlights some
particularly interesting points.)

The outcome variable that is the focus of our analysis is
`math_avg`, the average mathematics achievement scores of all
students in each school. Here are its descriptive statistics:

Before starting, we recommend that you enable progress bars to see how long procedures will take. Simply run the following code at the beginning of your R session:

```
# Run this in an R console; it will not work directly within an R Markdown or Quarto block
progressr::handlers(global = TRUE)
progressr::handlers('cli')
```

If you forget to do that, the `{ale}` package will do it
automatically for you with a notification message.

Now we create a model and compute statistics on it. Because this is a
relatively small dataset, we will carry out full model bootstrapping
using the `model_bootstrap()` function.

First, we create a generalized additive model (GAM) so that we can capture non-linear relationships in the data.

```
gam_math <- gam(
  math_avg ~ public + high_minority +
    s(size) + s(academic_ratio) + s(female_ratio) + s(mean_ses) +
    s(minority_ratio) + s(discrim) + s(rand_norm),
  data = math
)
gam_math
#>
#> Family: gaussian
#> Link function: identity
#>
#> Formula:
#> math_avg ~ public + high_minority + s(size) + s(academic_ratio) +
#> s(female_ratio) + s(mean_ses) + s(minority_ratio) + s(discrim) +
#> s(rand_norm)
#>
#> Estimated degrees of freedom:
#> 1.00 5.53 2.81 2.93 5.15 1.00 1.00
#> total = 22.41
#>
#> GCV score: 2.238298
```

Before we bootstrap the model to create ALE and other data, there is an important preliminary step when our goal is to analyze ALE statistics. As with any statistics calculated on a dataset, there is some randomness to the statistic values that the procedure will give us. To quantify this randomness, we want to obtain p-values for these statistics. A p-value is a number from 0 to 1 that indicates the probability of obtaining a statistic value at least as extreme as the one observed if only random chance were at work. So, high p-values mean that the statistic value could easily have arisen by chance, whereas low p-values (typically lower than 0.05 by convention) mean that the statistic probably represents a reliable value, not obtained merely by chance. (This is not to be confused with effect sizes, which we come to later in this article.)

P-values for most standard statistics are based on the assumption
that these statistics fit some distribution or another (e.g., Student’s
*t*, \(\chi^2\), etc.). With
these distributional assumptions, p-values can be calculated very
quickly. However, a key characteristic of ALE is that there are no
distributional assumptions: ALE data is a description of a model’s
characterization of the data given to it. Accordingly, ALE statistics do
not assume any distribution, either.

The implication of this for p-values is that the distribution of the data must be discovered by simulation rather than calculated based on distributional assumptions. The procedure for calculating p-values is the following:

- A random variable is added to the dataset.
- The model is retrained with all variables including this new random variable.
- ALE statistics are calculated for the random variable.
- This procedure is repeated 1,000 times or so to get 1,000 statistic values from the 1,000 random variables.
- p-values are calculated based on the frequency of times that the random variables obtain specific statistic values.
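
The steps above can be sketched in a few lines of base R. This is only an illustration under loud assumptions: a linear model stands in for the GAM, a simple stand-in statistic (the range of a variable's fitted linear effect) replaces the ALE statistics, the data are simulated, and the names `sim`, `effect_range()`, `rand_stats`, and `p_value()` are hypothetical. It is not how `create_p_funs()` is implemented.

```r
# Sketch only: a linear model and a stand-in statistic replace the GAM
# and the ALE statistics used in this vignette.
set.seed(1)
n <- 200
sim <- data.frame(x1 = rnorm(n))
sim$y <- 2 * sim$x1 + rnorm(n)

# Stand-in effect statistic: the range of a variable's fitted linear effect
effect_range <- function(model, data, var) {
  abs(coef(model)[[var]]) * diff(range(data[[var]]))
}

# Steps 1 to 4: add a random variable, retrain, record its statistic, repeat
rand_it <- 200  # hypothetical count; the procedure above uses about 1,000
rand_stats <- replicate(rand_it, {
  d <- sim
  d$random_var <- rnorm(n)
  m <- lm(y ~ x1 + random_var, data = d)
  effect_range(m, d, "random_var")
})

# Step 5: the p-value of an observed statistic is the share of random
# variables whose statistic was at least as large
p_value <- function(observed) mean(rand_stats >= observed)

observed_x1 <- effect_range(lm(y ~ x1, data = sim), sim, "x1")
p_value(observed_x1)
```

The design point is that no distribution is assumed: the reference distribution is whatever the retrained models produce for purely random predictors.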

As you can imagine, this procedure is very slow: it involves retraining the entire model on the full dataset 1,000 times. The {ale} package speeds up the process significantly through parallel processing (implemented by default), but it still involves the cost of retraining the model hundreds of times.

To avoid having to repeat this procedure several times (as would be
the case when you are doing exploratory analyses), the
`create_p_funs()` function generates a `p_funs` object that can be run
once for a given model-dataset pair. The `p_funs` object contains
functions that can generate p-values based on statistics for any
variable from the same model-dataset pair. It generates these p-values
when passed to the `ale()` or `model_bootstrap()` functions. (For very
large datasets, the process of generating the `p_funs` object could be
sped up by only using a subset of the data and by running fewer than
1,000 random iterations by setting the `rand_it` argument. However, the
`create_p_funs()` function will not allow fewer than 100 iterations;
otherwise, the p-values thus generated would be meaningless.)

We now demonstrate how to create the `p_funs` object for
our case.

```
# # To generate the code, uncomment the following lines.
# # But it is slow because it retrains the model 1000 times,
# # so this vignette loads a pre-created p-values object.
# gam_math_p_funs <- create_p_funs(
#   math,
#   gam_math
# )
# saveRDS(gam_math_p_funs, file.choose())
gam_math_p_funs <- url('https://github.com/tripartio/ale/raw/main/download/gam_math_p_funs.rds') |>
  readRDS()
```

We can now proceed to bootstrap the model for ALE analysis.

By default, `model_bootstrap()` runs 100 bootstrap
iterations; this can be controlled with the `boot_it`
argument. Bootstrapping is usually rather slow, even on small datasets,
since the entire process is repeated that many times. The
`model_bootstrap()` function speeds up the process
significantly through parallel processing (implemented by default), but
it still involves retraining the entire model dozens of times. The
default of 100 should be sufficiently stable for model building, when
you would want to run the bootstrapped algorithm several times and you
do not want it to be too slow each time. For definitive conclusions, you
could run 1,000 bootstraps or more to confirm the results of 100
bootstraps.

```
mb_gam_math <- model_bootstrap(
  math,
  gam_math,
  # Pass the p_funs object so that p-values will be generated
  ale_options = list(p_values = gam_math_p_funs),
  # For the GAM model coefficients, show details of all variables, parametric or not
  tidy_options = list(parametric = TRUE),
  # tidy_options = list(parametric = NULL),
  boot_it = 40, # 100 by default but reduced here for a faster demonstration
  parallel = 2 # CRAN limit (delete this line on your own computer for faster speed)
)
```

We can see the bootstrapped values of various overall model
statistics by printing the `model_stats` element of the model
bootstrap object:

```
mb_gam_math$model_stats
#> # A tibble: 5 × 7
#> name estimate conf.low mean median conf.high sd
#> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 df 37.3 25.8 37.3 36.5 46.8 7.31
#> 2 df.residual 123. 113. 123. 123. 134. 7.31
#> 3 nobs 160 160 160 160 160 0
#> 4 adj.r.squared 0.879 0.825 0.879 0.886 0.916 0.0292
#> 5 npar 66 66 66 66 66 0
```

The names of the columns follow the `broom` package
conventions:

- `name` is the specific overall model statistic described in the row.
- `estimate` is the bootstrapped estimate of the statistic. It is the same as the bootstrap `mean` by default, though it can be set to the `median` with the `boot_centre` argument of `model_bootstrap()`. Regardless, both the `mean` and `median` estimates are always returned. The `estimate` column is provided for convenience since that is a standard name in the `broom` package.
- `conf.low` and `conf.high` are the lower and upper confidence intervals respectively. `model_bootstrap()` defaults to a 95% confidence interval; this can be changed by setting the `boot_alpha` argument (the default is 0.05 for a 95% confidence interval).
- `sd` is the standard deviation of the bootstrapped estimate.
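
To make these columns concrete, here is a minimal base R sketch of how such a summary could be derived from bootstrap replicates of a single statistic. It assumes simple percentile intervals and made-up data; the names `boot_vals` and `boot_summary` are hypothetical, and the `ale` package may compute its intervals differently.

```r
# Sketch only: summarizing bootstrap replicates of one statistic.
set.seed(1)
y <- rnorm(100, mean = 10, sd = 2)  # made-up sample
boot_it <- 200                       # hypothetical number of bootstrap iterations
boot_alpha <- 0.05                   # 0.05 gives a 95% confidence interval

# Recompute the statistic (here, the sample mean) on each resample
boot_vals <- replicate(boot_it, mean(sample(y, replace = TRUE)))

# Percentile-based summary mirroring the column names described above
boot_summary <- c(
  conf.low  = unname(quantile(boot_vals, boot_alpha / 2)),
  mean      = mean(boot_vals),
  median    = median(boot_vals),
  conf.high = unname(quantile(boot_vals, 1 - boot_alpha / 2)),
  sd        = sd(boot_vals)
)
boot_summary
```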

Our focus in this vignette, however, is on the effects of individual
variables. These are available in the `model_coefs` element
of the model bootstrap object:

```
mb_gam_math$model_coefs
#> # A tibble: 3 × 7
#> term estimate conf.low mean median conf.high std.error
#> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 (Intercept) 12.8 12.2 12.8 12.7 13.5 0.361
#> 2 publicTRUE -0.728 -1.55 -0.728 -0.731 0.0843 0.446
#> 3 high_minorityTRUE 1.21 0.0412 1.21 1.22 2.30 0.631
```

In this vignette, we cannot go into the details of how GAM models
work (you can learn more with Noam Ross’s excellent tutorial). However,
for our model illustration here, the estimates for the parametric
variables (the non-numeric ones in our model) are interpreted as regular
statistical regression coefficients whereas the estimates for the
non-parametric smoothed variables (those whose variable names are
encapsulated by the smooth `s()` function) are actually
estimates for effective degrees of freedom (EDF in GAM). The smooth
function `s()` lets GAM model these numeric variables as
flexible curves that fit the data better than a straight line. The
`estimate` values for the smooth variables are not so
straightforward to interpret, but suffice it to say that they are
completely different from regular regression coefficients.

The `ale` package uses bootstrap-based confidence
intervals, not p-values that assume predetermined distributions, to
determine statistical significance. Although they are not quite as
simple to interpret as counting the number of stars next to a p-value,
they are not that complicated, either. Based on the default 95%
confidence intervals, a coefficient is statistically significant if
`conf.low` and `conf.high` are both positive or
both negative. We can filter the results on this criterion:

```
mb_gam_math$model_coefs |>
  # filter is TRUE if conf.low and conf.high are both positive or both negative because
  # multiplying two numbers of the same sign results in a positive number.
  filter((conf.low * conf.high) > 0)
#> # A tibble: 2 × 7
#> term estimate conf.low mean median conf.high std.error
#> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 (Intercept) 12.8 12.2 12.8 12.7 13.5 0.361
#> 2 high_minorityTRUE 1.21 0.0412 1.21 1.22 2.30 0.631
```

The statistical significance of the `estimate` (EDF) of
the smooth terms is meaningless here because EDF cannot go below 1.0.
Thus, even the random term `s(rand_norm)` appears to be
“statistically significant”. Only the values for the non-smooth
(parametric) terms `public` and `high_minority`
should be considered here. So, we find that the coefficient estimate of
`public` is not statistically significantly different from zero, whereas
that of `high_minority` is, though only barely: its lower confidence
limit of 0.0412 is just above zero. (The intercept is not conceptually
meaningful here; it is a statistical artifact.)

This initial analysis highlights two limitations of classical hypothesis-testing analysis. First, it might work suitably well when we use models that have traditional linear regression coefficients. But once we use more advanced models like GAM that flexibly fit the data, we cannot interpret coefficients meaningfully and so it is not so clear how to reach inferential conclusions. Second, a basic challenge with models that are based on the general linear model (including GAM and almost all other statistical analyses) is that their coefficient significance compares the estimates with the null hypothesis that there is no effect. However, even if there is an effect, it might not be practically meaningful. As we will see, ALE-based statistics are explicitly tailored to emphasize practical implications beyond the notion of “statistical significance”.

ALE was developed to graphically display the relationship between predictor variables in a model and the outcome regardless of the nature of the model. Thus, before we proceed to describe our extension of effect size measures based on ALE, let us first briefly examine the ALE plots for each variable.

We can see that most variables seem to have some sort of mean effect across various values. However, for statistical inference, our focus must be on the bootstrap intervals. Crucial to our interpretation is the middle grey band that indicates the median ± 5% of random values. Below, we will explain what exactly the ALE range (ALER) means, but for now, we can say this:

- The approximate middle of the grey band is the median of the y outcome variables in the dataset (`math_avg`, in our case). The middle tick on the right y axis indicates the exact median. (The `model_bootstrap()` function lets you centre the data on the mean or on zero if you prefer with the `relative_y` argument.)
- We call this grey band the “ALER band”. 95% of random variables had ALE values that fully lay within the ALER band.
- The dashed lines above and below the ALER band expand the boundaries to where 99% of the random variables were constrained. These boundaries could be considered as demarcating an extended or outward ALER band.

The idea is that if the ALE values of any predictor variable fall
fully within the ALER band, then it has no greater effect than 95% of
purely random variables. Moreover, to consider any effect in the ALE
plot to be statistically significant (that is, non-random), there should
be no overlap between the bootstrapped confidence regions of a predictor
variable and the ALER band. (For the threshold p-values, we use the
conventional defaults of 0.05 for 95% confidence and 0.01 for 99%
confidence, but the value can be changed with the `p_alpha` argument.)

For categorical variables (`public` and `high_minority` above), the
confidence interval bars for all categories overlap the ALER band. The
confidence interval bars give us two useful pieces of information. When
we compare them to the ALER band, their overlap or lack thereof tells us
about the practical significance of the category. When we compare the
confidence bars of one category with those of others, we can assess
whether the category has a statistically significant effect that is
different from that of the other categories; this is equivalent to the
regular interpretation of coefficients for GAM and other GLM models. In
both cases, the confidence interval bars of the TRUE and FALSE
categories overlap each other, indicating that there is no statistically
significant difference between categories. Whereas the coefficient table
above based on classic statistics indicated this conclusion for
`public`, it indicated that `high_minority` had a statistically
significant effect; our ALE analysis indicates that `high_minority` does
not. In addition, each confidence interval band overlaps the ALER band,
indicating that none of the effects is meaningfully different from
random results, either.

For numeric variables, the confidence regions overlap the ALER band
for most of the domains of the predictor variables except for some
regions that we will examine. The extreme points of each variable
(except for `discrim` and `female_ratio`) are
usually either slightly below or slightly above the ALER band,
indicating that extreme values have the most extreme effects: math
achievement increases with increasing school size, academic track ratio,
and mean socioeconomic status, whereas it decreases with increasing
minority ratio. The ratio of females and the discrimination climate both
overlap the ALER band for the entirety of their domains, so any apparent
trends are not supported by the data.

Of particular interest is the random variable `rand_norm`,
whose average ALE appears to show some sort of pattern. However, we note
that the 95% confidence intervals we use mean that if we were to retry
the analysis for twenty different random seeds, we would expect about
one of the random variables to partially escape the bounds of the ALER
band. We will return below to the implications of random variables in
ALE analysis.
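
The twenty-seeds reasoning can be checked with a quick calculation. This sketch assumes that the twenty analyses are independent and that each random variable escapes the ALER band with probability 0.05; the variable names are illustrative only.

```r
alpha <- 0.05   # threshold behind the 95% ALER band
n_seeds <- 20   # number of hypothetical reruns with different seeds

# Expected number of random variables that escape the band
expected_escapes <- n_seeds * alpha

# Probability that at least one of the twenty escapes the band (about 0.64)
p_at_least_one <- 1 - (1 - alpha)^n_seeds

c(expected_escapes = expected_escapes, p_at_least_one = p_at_least_one)
```

So a single escaping random variable, as with the seed chosen here, is unremarkable on its own.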

Before we continue, let us take a brief detour to see what we get if
we run `model_bootstrap()` without passing it a
`p_funs` object. This might be because we forget to do so, or
because we want to see quick results without the slow process of first
generating a `p_funs` object. Let us run
`model_bootstrap()` again, but this time, without
p-values.

```
mb_gam_no_p <- model_bootstrap(
  math,
  gam_math,
  # For the GAM model coefficients, show details of all variables, parametric or not
  tidy_options = list(parametric = TRUE),
  # tidy_options = list(parametric = NULL),
  boot_it = 40, # 100 by default but reduced here for a faster demonstration
  parallel = 2 # CRAN limit (delete this line on your own computer)
)
mb_gam_no_p$ale$plots |>
  patchwork::wrap_plots(ncol = 2)
```

In the absence of p-values, the {ale} package uses alternate
visualizations to offer meaningful results, but with somewhat different
interpretations of the middle grey band. Without p-values, we do not
have any point of reference for the ALER statistics, so we use
percentiles around the median as the reference. The middle grey band
here indicates the median ± 2.5%, that is, the middle 5% of all average
mathematics achievement score (`math_avg`) values in the
dataset. We call this the “median band”. The idea is that if any
predictor can do no better than influencing `math_avg` to
fall within this middle median band, then it only has a minimal effect.
For an effect to be considered statistically significant, there should
be no overlap between the confidence regions of a predictor variable and
the median band. (We use 5% around the median by default, but the value
can be changed with the `median_band_pct` argument.) For
further reference, the outer dashed lines indicate the interquartile
range of the outcome values, that is, the 25th and 75th percentiles.
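
As a rough illustration of what the median band and the outer dashed lines represent, the following base R sketch computes them as plain quantiles of an outcome vector. The data are made up (not the real `math_avg`), the names `band_pct`, `median_band`, and `iqr_lines` are hypothetical, and the package's own computation may differ in details.

```r
# Sketch only: median band and interquartile lines as quantiles of the outcome.
set.seed(1)
y <- rnorm(160, mean = 12.6, sd = 3)  # made-up outcome values
band_pct <- 0.05                       # middle 5% around the median

# Median band: the 47.5th to the 52.5th percentile of the outcome
median_band <- quantile(y, c(0.5 - band_pct / 2, 0.5 + band_pct / 2))

# Outer dashed lines: the 25th and 75th percentiles
iqr_lines <- quantile(y, c(0.25, 0.75))

median_band
iqr_lines
```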

We can see that in this case, the 5% median band is much narrower than the 5% ALER band when p-values are calculated, though they might be more similar for a different dataset. This should give us pause about skipping the calculation of p-values, since we might be overly lax in interpreting apparent relationships as meaningful whereas the ALER band indicates that they might not be that different from what random variables might produce.

For most of the rest of this article, we will only analyze the results with ALER bands generated from p-values, though we will briefly revisit median bands without p-values.

Although ALE plots allow rapid and intuitive conclusions for
statistical inference, it is often helpful to have summary numbers that
quantify the average strengths of the effects of a variable. Thus, we
have developed a collection of effect size measures based on ALE
tailored for intuitive interpretation. To understand the intuition
underlying the various ALE effect size measures, it is useful to first
examine the **ALE effects plot**, which graphically
summarizes the effect sizes of all the variables in the ALE analysis.
This is generated when `ale` is executed and both statistics
and plots are requested (which is the case by default) and is accessible
with the `ale$stats$effects_plot` element:

This plot is unusual, so it requires some explanation:

- The y (vertical) axis displays the x variables, rather than the x axis. This is consistent with most effect size plots because they list the full names of variables. It is more readable to list them as labels on the y axis than the other way around.
- The x (horizontal) axis thus displays the y (outcome) variable. But there are two representations of this same axis, one at the bottom and one at the top.
  - On the bottom is a more typical axis of the outcome variable, in our case, `math_avg`. It is scaled as expected. In our case, the axis breaks default to five units each from 5 to 20, evenly spaced.
  - On the top, the outcome variable is expressed as percentiles ranging from 0 (the minimum outcome value in the dataset) to 100 (the maximum). It is divided into 10 deciles of 10% each. Because percentiles are usually not evenly distributed in a dataset, the decile breaks are not evenly spaced.
- Thus, this plot has two x axes, the lower one in units of the outcome variable and the upper one in percentiles of the outcome variable. To reduce the confusion, the major vertical gridlines that are slightly darker align with the units of the outcome (lower axis) and the minor vertical gridlines that are slightly lighter align with the percentiles (upper axis).
- The vertical grey band in the middle is the NALED band. Its width is the 0.05 p-value of the NALED (explained below). That is, 95% of random variables had a NALED equal to or smaller than that width.
- The variables on the vertical axis are sorted by decreasing ALED and then NALED value (explained below).

Although it is somewhat confusing to have two axes, the percentiles are a direct transformation of the raw outcome values. The first two base ALE effect size measures below are in units of the outcome variable while their normalized versions are in percentiles of the outcome. Thus, the same plot can display the two kinds of measures simultaneously. Referring to this plot can help understand each of the measures, which we proceed to explain in detail.

Before we explain these measures in detail, we must reiterate the
timeless reminder that correlation is not causation. So, none of the
scores necessarily means that an x variable *causes* a certain
effect on the y outcome; we can only say that the ALE effect size
measures indicate associated or related variations between the two
variables.

The easiest ALE statistic to understand is the ALE range (ALER), so
we begin there. It is simply the range from the minimum to the maximum
of any `ale_y` value for that variable. Mathematically, that is

\[\mathrm{ALER}(\mathrm{ale\_y}) = \{
\min(\mathrm{ale\_y}), \max(\mathrm{ale\_y}) \}\]

where \(\mathrm{ale\_y}\) is the vector
of ALE y values for a variable.

All the ALE effect size measures are centred on zero so that they are consistent regardless of whether the user chooses to centre their plots on zero, the median, or the mean. Specifically,

- `aler_min`: minimum of any `ale_y` value for the variable.
- `aler_max`: maximum of any `ale_y` value for the variable.
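
As a tiny worked example of the definition, here is a base R sketch with made-up, zero-centred ALE y values; the vector `ale_y` and the result `aler` are illustrative only, not output from the model above.

```r
# Made-up zero-centred ALE y values for one variable
ale_y <- c(-0.8, -0.3, 0.1, 0.6, 1.2)

# ALER is simply the pair of extreme values
aler <- c(aler_min = min(ale_y), aler_max = max(ale_y))
aler  # aler_min = -0.8, aler_max = 1.2
```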

ALER shows the extreme values of a variable’s effect on the outcome.
In the effects plot above, it is indicated by the extreme ends of the
horizontal bars for each variable. We can access ALE effect size
measures through the `ale$stats` element of the bootstrap
result object, with multiple views. To focus on all the measures for a
specific variable, we can access the `ale$stats$by_term` element.

Let’s focus on `public`. Here is its ALE plot:

Here are the effect size measures for the categorical
`public`:

```
mb_gam_math$ale$stats$by_term$public
#> # A tibble: 6 × 7
#> statistic estimate p.value conf.low median mean conf.high
#> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 aled 0.378 0 0.129 0.361 0.378 0.771
#> 2 aler_min -0.342 0.352 -0.747 -0.311 -0.342 -0.111
#> 3 aler_max 0.427 0.241 0.137 0.410 0.427 0.803
#> 4 naled 5.48 0.00100 1.12 5.29 5.48 9.89
#> 5 naler_min -4.90 0.563 -9.66 -5 -4.90 0
#> 6 naler_max 6.24 0.405 1.81 6.21 6.24 13.8
```

We see there that `public` has an ALER of [-0.34, 0.43].
When we consider that the median math score in the dataset is 12.9, this
ALER indicates that the minimum of any ALE y value for
`public` (when `public == TRUE`) is 0.34 below
the median. This is shown at the 12.6 mark in the plot above. The
maximum (`public == FALSE`) is 0.43 above the median, shown
at the 13.3 point above.

The unit for ALER is the same unit as the outcome variable; in our
case, that is `math_avg`, ranging from 2 to 20. No matter what
the average ALE values might be, the ALER quickly shows the minimum and
maximum effects of any value of the x variable on the y variable.

For contrast, let us look at a numeric variable,
`academic_ratio`:

Here are its ALE effect size measures:

```
mb_gam_math$ale$stats$by_term$academic_ratio
#> # A tibble: 6 × 7
#> statistic estimate p.value conf.low median mean conf.high
#> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 aled 0.671 0 0.222 0.650 0.671 1.20
#> 2 aler_min -3.55 0 -7.13 -3.14 -3.55 -0.412
#> 3 aler_max 2.02 0 0.840 2.12 2.02 3.07
#> 4 naled 9.01 0 3.96 8.60 9.01 14.7
#> 5 naler_min -30.1 0 -48.8 -31.6 -30.1 -4.95
#> 6 naler_max 28.0 0 9.23 29.4 28.0 38.8
```

The ALER for `academic_ratio` is considerably broader, with
3.55 below and 2.02 above the median.

While the ALE range shows the most extreme effects a variable might have on the outcome, the ALE deviation indicates its average effect over its full domain of values. With the zero-centred ALE values, it is conceptually similar to the weighted mean absolute error (MAE) of the ALE y values. Mathematically, it is

\[ \mathrm{ALED}(\mathrm{ale\_y}, \mathrm{ale\_n}) = \frac{\sum_{i=1}^{k} \left| \mathrm{ale\_y}_i \times \mathrm{ale\_n}_i \right|}{\sum_{i=1}^{k} \mathrm{ale\_n}_i} \]

where \(i\) is the index of \(k\) ALE x intervals for the variable (for a categorical variable, this is the number of distinct categories), \(\mathrm{ale\_y}_i\) is the ALE y value for the \(i\)th ALE x interval, and \(\mathrm{ale\_n}_i\) is the number of rows of data in the \(i\)th ALE x interval.
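
The ALED formula can be traced with a small base R sketch. The vectors `ale_y` and `ale_n` below are made-up values for five hypothetical ALE x intervals, not results from the model above.

```r
# Made-up zero-centred ALE y values and row counts for five ALE x intervals
ale_y <- c(-0.8, -0.3, 0.1, 0.6, 1.2)
ale_n <- c(10, 30, 40, 15, 5)

# ALED: absolute ALE y values weighted by the number of rows in each interval
aled <- sum(abs(ale_y * ale_n)) / sum(ale_n)
aled  # (8 + 9 + 4 + 9 + 6) / 100 = 0.36

# Equivalent formulation as a weighted mean of absolute values
weighted.mean(abs(ale_y), ale_n)
```

Note how the sparsely populated extreme interval (1.2, with only 5 rows) contributes little, so ALED stays well inside the ALER of [-0.8, 1.2].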

Based on its ALED, we can say that the average effect on math scores of whether a school is in the public or Catholic sector is 0.38 (again, out of a range from 2 to 20). In the effects plot above, the ALED is indicated by a white box bounded by parentheses ( and ). As it is centred on the median, we can readily see that the average effect of school sector barely exceeds the limits of the ALER band, indicating that it barely exceeds our threshold of practical relevance. The average effect for ratio of academic track students is slightly higher at 0.67. We can see on the plot that it slightly exceeds the ALER band on both sides, indicating its slightly stronger effect. We will comment on the values of other variables when we discuss the normalized versions of these scores, to which we proceed next.

Since ALER and ALED scores are scaled on the range of y for a given dataset, these scores cannot be compared across datasets. Thus, we present normalized versions of each with intuitive, comparable values. For intuitive interpretation, we normalize the scores on the minimum, median, and maximum of any dataset. In principle, we divide the zero-centred y values in a dataset into two halves: the lower half from the 0th to the 50th percentile (the median) and the upper half from the 50th to the 100th percentile. (Note that the median is included in both halves). With zero-centred ALE y values, all negative and zero values are converted to their percentile score relative to the lower half of the original y values while all positive ALE y values are converted to their percentile score relative to the upper half. (Technically, this percentile assignment is called the empirical cumulative distribution function (ECDF) of each half.) Each half is then divided by two to scale them from 0 to 50 so that together they can represent 100 percentiles. (Note: when a centred ALE y value of exactly 0 occurs, we choose to include the score of zero ALE y in the lower half because it is analogous to the 50th percentile of all values, which more intuitively belongs in the lower half of 100 percentiles.) The transformed maximum ALE y is then scaled as a percentile from 0 to 100%.

There is a notable complication. This normalization smoothly distributes ALE y values when there are many distinct values, but when there are only a few distinct ALE y values, then even a minimal ALE y deviation can have a relatively large percentile difference. If any ALE y value is less than the difference between the median in the data and the value either immediately below or above the median, we consider that it has virtually no effect. Thus, the normalization sets such minimal ALE y values as zero.

Its formula is:

\[
norm\_ale\_y = 100 \times \begin{cases}
0 & \text{if } \max(centred\_y < 0) \leq ale\_y \leq \min(centred\_y > 0), \\
\frac{-ECDF_{y_{\leq 0}}(ale\_y)}{2} & \text{if } ale\_y < 0, \\
\frac{ECDF_{y_{\geq 0}}(ale\_y)}{2} & \text{if } ale\_y > 0
\end{cases}
\]

where

- \(centred\_y\) is the vector of `y` values centred on the median (that is, the median is subtracted from all values).
- \(ECDF_{y_{\geq 0}}\) is the ECDF of the non-negative values in `y`.
- \(-ECDF_{y_{\leq 0}}\) is the ECDF of the negative values in `y` after they have been inverted (multiplied by -1).

Of course, the formula could be simplified by multiplying by 50 instead of by 100 and not dividing the ECDFs by two each. But we prefer the form we have given because it is explicit that each ECDF represents only half the percentile range and that the result is scored to 100 percentiles.
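The normalization can be sketched in a few lines of R. This is only an illustration of the formula above under our own assumptions, not the code that the `ale` package actually runs: the function name `normalize_ale_y` is hypothetical, `ale_y` is assumed to be already zero-centred, and a centred value of exactly zero is assumed to count in both halves.

```r
# Illustrative sketch of the normalization formula; NOT the ale package's
# internal code. `normalize_ale_y` is a hypothetical name.
normalize_ale_y <- function(ale_y, y) {
  centred_y <- y - median(y)

  # ECDF of the upper half (assumption: zero counts in both halves)
  ecdf_upper <- ecdf(centred_y[centred_y >= 0])
  # ECDF of the lower half after inverting it (multiplying by -1)
  ecdf_lower <- ecdf(-centred_y[centred_y <= 0])

  # Centred y values immediately below and above the median
  closest_below <- max(centred_y[centred_y < 0])
  closest_above <- min(centred_y[centred_y > 0])

  norm <- ifelse(
    ale_y < 0,
    -ecdf_lower(-ale_y) / 2,
    ecdf_upper(ale_y) / 2
  )
  # Minimal deviations around the median are treated as no effect
  norm[ale_y >= closest_below & ale_y <= closest_above] <- 0

  100 * norm
}

# Toy outcome with median 5, so the centred values run from -4 to 4
y <- 1:9
normalize_ale_y(c(-4, -2, -0.5, 0, 0.5, 2, 4), y)
# expected: -50 -30 0 0 0 30 50
```

In this toy example, the three ALE y values between -1 and 1 fall within one step of the median and are therefore set to zero, while the extremes map to -50 and +50.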

Based on this normalization, we first have the normalized ALER (NALER), which scales the minimum and maximum ALE y values from -50% to +50%, centred on 0%, which represents the median:

\[ \mathrm{NALER}(y, \mathrm{ale\_y}) = \{\min(\mathrm{norm\_ale\_y}), \max(\mathrm{norm\_ale\_y}) \} \]

where \(y\) is the full vector of y values in the original dataset, required to calculate \(\mathrm{norm\_ale\_y}\).

ALER shows the extreme values of a variable’s effect on the outcome. In the effects plot above, it is indicated by the extreme ends of the horizontal bars for each variable. We see there that `public` has an ALER of -0.34, 0.43. When we consider that the median math score in the dataset is 12.9, this ALER indicates that the minimum of any ALE y value for `public` (when `public == TRUE`) is 0.34 below the median. This is shown at the 12.6 mark in the plot above. The maximum (`public == FALSE`) is 0.43 above the median, shown at the 13.3 point above. The ALER for `academic_ratio` is considerably broader, ranging from 3.55 below to 2.02 above the median.

The result of this transformation is that NALER values can be interpreted as percentile effects of y below or above the median, which is centred at 0%. Their numbers represent the limits of the effect of the x variable with units in percentile scores of y. In the effects plot above, because the percentile scale on the top corresponds exactly to the raw scale below, the NALER limits are represented by exactly the same points as the ALER limits; only the scale changes. The scale for ALER and ALED is the lower scale of the raw outcomes; the scale for NALER and NALED is the upper scale of percentiles.

So, with a NALER of -4.9, 6.24, the minimum of any ALE value for `public` (`public == TRUE`) shifts math scores by -5 percentile points whereas the maximum (`public == FALSE`) shifts math scores by 6 percentile points. Academic track ratio has a NALER of -30.08, 27.95, ranging from -30 to 28 percentile points of math scores.

The normalization of ALED scores applies the same ALED formula as before but on the normalized ALE values instead of on the original ALE y values:

\[ \mathrm{NALED}(y, \mathrm{ale\_y}, \mathrm{ale\_n}) = \mathrm{ALED}(\mathrm{norm\_ale\_y}, \mathrm{ale\_n}) \]

NALED produces a score that ranges from 0 to 100%. It is essentially the ALED expressed in percentiles, that is, the average effect of a variable over its full domain of values. So, the NALED of public school status of 5.5 indicates that its average effect on math scores spans the middle 5.5 percent of scores. Academic ratio has an average effect expressed in NALED of 9% of scores.
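As a minimal numeric sketch of these two statistics, the snippet below computes them from made-up normalized ALE values. It assumes, as in the ALED definition given earlier in this vignette, that the deviation is the `ale_n`-weighted mean of the absolute ALE values, and it reports the range on the -50% to +50% scale used for the values quoted in this section. The input vectors are hypothetical and are not taken from the math dataset.

```r
# Hypothetical normalized ALE y values (percentile points) and the number
# of rows in each ALE interval; these are NOT values from the math dataset
norm_ale_y <- c(-10, -2, 0, 4, 8)
ale_n      <- c(10, 20, 40, 20, 10)

# NALED: ale_n-weighted mean of the absolute normalized ALE y values
naled <- sum(abs(norm_ale_y) * ale_n) / sum(ale_n)

# NALER: minimum and maximum of the normalized ALE y values
naler <- range(norm_ale_y)

naled  # 3
naler  # -10 8
```

Here the variable would have an average effect of 3 percentile points, with extreme effects of 10 points below and 8 points above the median.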

When we do not have p-values, the NALED is particularly helpful for comparing the practical relevance of variables against our median band threshold, by which we consider that a variable needs to shift the outcome on average by more than 5% of the median values. This threshold is on the same scale as the NALED. So, we can tell that public school status, with its NALED of 5.5, just barely crosses our threshold.

It is particularly striking to note the ALE effect size measures for the random variable `rand_norm`:

```
mb_gam_math$ale$stats$by_term$rand_norm
#> # A tibble: 6 × 7
#> statistic estimate p.value conf.low median mean conf.high
#> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 aled 0.143 0.182 0.00708 0.111 0.143 0.345
#> 2 aler_min -0.734 0.044 -2.59 -0.316 -0.734 -0.0228
#> 3 aler_max 0.444 0.223 0.0243 0.326 0.444 1.13
#> 4 naled 2.23 0.357 0 1.85 2.23 5.09
#> 5 naler_min -8.27 0.142 -27.6 -5.28 -8.27 0
#> 6 naler_max 6.55 0.354 0 5 6.55 15.7
```

`rand_norm` has a NALED of 2.2. It might be surprising that a purely random value has any “effect size” to speak of, but statistically, it must have some numeric value or other. However, by setting our default value for the median band at 5%, we effectively exclude `rand_norm` from serious consideration. In informal tests with several different random seeds, the random variables never exceeded this 5% threshold. Setting the median band too low at a value like 1% would not have excluded the random variable, but 5% seems like a nice balance. Thus, the effect of a variable like the discrimination climate score (`discrim`, 3.3) should probably not be considered practically meaningful.
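The comparison against the median band can be made explicit with a short check. The sketch below simply re-enters the estimates printed in the `rand_norm` table above into a small data frame so that it runs on its own; the object names `rand_norm_stats` and `median_band_pct` are ours, chosen for illustration.

```r
# Estimates copied from the printed `rand_norm` statistics above
rand_norm_stats <- data.frame(
  statistic = c('aled', 'aler_min', 'aler_max', 'naled', 'naler_min', 'naler_max'),
  estimate  = c(0.143, -0.734, 0.444, 2.23, -8.27, 6.55),
  stringsAsFactors = FALSE
)

# The 5% median band threshold discussed in the text
median_band_pct <- 5

naled_rand <- rand_norm_stats$estimate[rand_norm_stats$statistic == 'naled']
exceeds_band <- naled_rand > median_band_pct
exceeds_band  # FALSE
```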

We realize that 5% as a threshold for the median band is rather arbitrary, inspired by traditional \(\alpha\) = 0.05 for statistical significance and confidence intervals. A proper analysis should use p-values, as most of this article does. However, our initial analyses here show that 5% seems to be an effective choice for excluding a purely random variable from consideration, even for quick initial analyses.

We return to using p-values for the rest of this article.

Here we summarize some general principles for interpreting normalized ALE effect sizes.

- **Normalized ALE deviation (NALED):** this is the average variation of ALE effect of the input variable.
  - Values range from 0 to 100%:
    - 0% means no effect at all.
    - 100% means the maximum possible effect any variable could have: for a binary variable, one value (50% of the data) sets the outcome at its minimum value and the other value (the other 50% of the data) sets the outcome at its maximum value.
  - Larger NALED means stronger effects.
- **Normalized ALE range (NALER):** these are the minimum and maximum effects of any value of the input variable.
  - NALER minimum ranges from –50% to 0%; NALER maximum ranges from 0% to +50%:
    - 0% means no effect at all. It indicates that the only effect of the input variable is to keep the outcome at the median of its range of values.
    - NALER minimum of *n* means that, regardless of the effect size of NALED, the minimum effect of any input value shifts the outcome by *n* percentile points of the outcome range. Lower values (closer to –50%) mean a stronger extreme effect.
    - NALER maximum of *x* means that, regardless of the effect size of NALED, the maximum effect of any input value shifts the outcome by *x* percentile points of the outcome range. Greater values (closer to +50%) mean a stronger extreme effect.

In general, regardless of the values of ALE statistics, we should always visually inspect the ALE plots to identify and interpret patterns of relationships between inputs and the outcome.

A common question for interpreting effect sizes is, “How strong does an effect need to be to be considered ‘strong’ or ‘weak’?” On one hand, we refuse to offer general guidelines for how “strong” is “strong”. The simple answer is that it depends entirely on the applied context. It is not meaningful to try to propose numerical values for statistics that are supposed to be useful for all applied contexts.

On the other hand, we do consider it very important to delineate the threshold between random effects and non-random effects. It is always important to distinguish a weak but real effect from one that is just a statistical artifact due to random chance. For that, we can offer some general guidelines based on whether or not we have p-values.

When we have p-values for ALE statistics, then the boundaries of ALER should generally be used to determine the acceptable risk of considering a statistic to be meaningful. Statistically significant ALE effects are those that are less than the 0.05 p-value ALER minimum of a random variable or greater than the 0.05 p-value ALER maximum of a random variable. As we explained above when introducing the ALER band, this is precisely what the `{ale}` package does, especially in the plots that highlight the ALER band and the confidence region tables that use the specified ALER p-value threshold.

In the absence of p-values, we suggest that NALED can be a general guide for non-random values. In our informal tests, we find that NALED values below 5% have the same average effect as a random variable. That is, the average effect is not reliable; it might be random. However, regardless of the average effect indicated by NALED, large NALER effects indicate that the ALE plot should be inspected to interpret the exceptional cases. This caveat is very important; unlike GLM coefficients, ALE analysis is sensitive to exceptions to the overall trend. This is precisely what makes it valuable for detecting non-linear effects.

In general, if NALED < 5%, NALER minimum > –5%, and NALER maximum < +5%, the input variable has no meaningful effect. In all other cases, the ALE plots are worth inspecting for careful interpretation:

- NALED > 5% means a meaningful average effect.
- NALER minimum < –5% means that there might be at least one input value that significantly lowers the outcome values.
- NALER maximum > +5% means that there might be at least one input value that significantly increases the outcome values.
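These rules of thumb can be written as a small screening helper. The function below is a hypothetical sketch (`screen_effect` is not part of the `ale` package); it is applied to the NALED and NALER values reported in this article for `public` and `rand_norm`.

```r
# Hypothetical helper that applies the 5% rules of thumb described above;
# it is not a function from the ale package
screen_effect <- function(naled, naler_min, naler_max, band = 5) {
  c(
    meaningful_average = naled > band,
    lowers_outcome     = naler_min < -band,
    raises_outcome     = naler_max > band
  )
}

# Values reported in this article
public_flags <- screen_effect(naled = 5.5,  naler_min = -4.9,  naler_max = 6.24)
rand_flags   <- screen_effect(naled = 2.23, naler_min = -8.27, naler_max = 6.55)

public_flags
rand_flags
```

Note that `rand_norm` fails the NALED rule but its NALER limits exceed 5 percentile points on both sides. Under these rules, that only means that its ALE plot deserves a look; it does not by itself establish a real effect.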

Although effect sizes are valuable in summarizing the global effects of each variable, they mask much nuance since each variable varies in its effect along its domain of values. Thus, ALE is particularly powerful in its ability to make fine-grained inferences of a variable’s effect depending on its specific value.

To understand how bootstrapped ALE can be used for statistical inference, we must understand the structure of ALE data. Let’s begin simply with a binary variable with just two categories, `public`:

```
mb_gam_math$ale$data$public
#> # A tibble: 2 × 7
#> ale_x ale_n ale_y ale_y_lo ale_y_mean ale_y_median ale_y_hi
#> <ord> <int> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 FALSE 70 13.4 12.5 13.4 13.5 14.1
#> 2 TRUE 90 12.7 11.8 12.7 12.7 13.1
```

Here is the meaning of each column of `ale$data` for a categorical variable:

- `ale_x`: the different categories that exist in the categorical variable.
- `ale_n`: the number of rows for that category in the dataset provided to the function.
- `ale_y`: the ALE function value calculated for that category. For bootstrapped ALE, this is the same as `ale_y_mean` by default or `ale_y_median` if the `boot_centre = 'median'` argument is specified.
- `ale_y_lo` and `ale_y_hi`: the lower and upper limits of the confidence interval for the bootstrapped `ale_y` value.

By default, the `ale` package centres ALE values on the median of the outcome variable; in our dataset, the median of all the schools’ average mathematics achievement scores is 12.9. With ALE centred on the median, the weighted sum of ALE y values (weighted on `ale_n`) above the median is approximately equal to the weighted sum of those below the median. So, in the ALE plots above, when you consider the number of instances indicated by the rug plots and category percentages, the average weighted ALE y approximately equals the median.
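As a rough check of this property, we can take the `ale_n`-weighted mean of the two rounded `ale_y` values printed for `public` above. The snippet re-enters those printed values by hand, so the result is only as precise as the rounding in the table.

```r
# ale_n and ale_y as printed (rounded) in the `public` ALE data above
ale_n <- c(70, 90)      # FALSE, TRUE
ale_y <- c(13.4, 12.7)  # FALSE, TRUE

# Weighted mean of ALE y, weighted on the number of rows per category
weighted_ale_y <- weighted.mean(ale_y, w = ale_n)
weighted_ale_y
```

The result is about 13.0, which is close to, though not exactly equal to, the median of 12.9.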

Here is the ALE data structure for a numeric variable, `academic_ratio`:

```
mb_gam_math$ale$data$academic_ratio
#> # A tibble: 65 × 7
#> ale_x ale_n ale_y ale_y_lo ale_y_mean ale_y_median ale_y_hi
#> <dbl> <int> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 0 1 8.75 5.61 8.75 8.10 12.3
#> 2 0.05 2 10.9 8.71 10.9 10.8 12.8
#> 3 0.09 1 11.8 10.9 11.8 12.0 12.6
#> 4 0.1 2 11.9 11.0 11.9 12.0 12.9
#> 5 0.13 1 12.5 11.6 12.5 12.5 13.7
#> 6 0.14 2 12.3 11.3 12.3 12.3 13.4
#> 7 0.17 1 12.5 11.6 12.5 12.4 13.4
#> 8 0.18 4 12.4 11.6 12.4 12.4 13.3
#> 9 0.19 3 12.4 11.6 12.4 12.5 13.2
#> 10 0.2 3 12.4 11.6 12.4 12.5 13.2
#> # ℹ 55 more rows
```

The columns are the same as with a categorical variable, but the meaning of `ale_x` is different since there are no categories. To calculate ALE for numeric variables, the range of x values is divided into fixed intervals (by default 100, customizable with the `x_intervals` argument). If the x values have fewer than 100 distinct values in the data, then each distinct value becomes an `ale_x` interval. (This is often the case with smaller datasets like ours; here `academic_ratio` has only 65 distinct values.) If there are more than 100 distinct values, then the range is divided into 100 percentile groups. So, `ale_x` represents each of these x-variable intervals. The other columns mean the same thing as with categorical variables: `ale_n` is the number of rows of data in each `ale_x` interval and `ale_y` is the calculated ALE for that `ale_x` value.
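The interval logic just described can be sketched as follows. This is an illustration of the idea only, not the package's implementation: `make_ale_x` is a hypothetical name, the treatment of exactly 100 distinct values is our assumption, and the quantile method is simply R's default.

```r
# Hypothetical sketch of how ale_x points might be chosen for a numeric x;
# NOT the ale package's internal code
make_ale_x <- function(x, x_intervals = 100) {
  distinct_x <- sort(unique(x))

  # Few distinct values: each distinct value becomes its own ale_x point
  # (assumption: exactly `x_intervals` distinct values is handled here too)
  if (length(distinct_x) <= x_intervals) {
    return(distinct_x)
  }

  # Many distinct values: use the boundaries of `x_intervals` percentile groups
  probs <- seq(0, 1, length.out = x_intervals + 1)
  unique(quantile(x, probs = probs, names = FALSE))
}

make_ale_x(c(0.1, 0, 0.05, 0.05))                  # three distinct values
length(make_ale_x(seq(0, 1, length.out = 1000)))   # boundaries of 100 groups
```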

In a bootstrapped ALE plot, values within the confidence intervals
are statistically significant; values outside of the ALER band can be
considered at least somewhat meaningful. Thus, **the essence of
ALE-based statistical inference is that only effects that are
simultaneously within the confidence intervals AND outside of the ALER
band should be considered conceptually meaningful.**

We can see this, for example, with the plot of `mean_ses`:

It might not always be easy to tell from a plot which regions are relevant, so the results of statistical significance are summarized in the `ale$conf_regions` element, which can be accessed for each variable from its `by_term` element:

```
mb_gam_math$ale$conf_regions$by_term$mean_ses
#> # A tibble: 10 × 9
#> start_x end_x x_span n n_pct start_y end_y trend relative_to_mid
#> <dbl> <dbl> <dbl> <int> <dbl> <dbl> <dbl> <dbl> <ord>
#> 1 -1.19 -1.04 0.0718 2 0.0125 6.39 7.66 1.19 below
#> 2 -0.792 -0.588 0.101 14 0.0875 10.5 11.2 0.510 overlap
#> 3 -0.517 -0.517 0 2 0.0125 11.3 11.3 0 below
#> 4 -0.511 0.569 0.535 130 0.812 11.4 14.6 0.405 overlap
#> 5 0.617 0.617 0 1 0.00625 14.8 14.8 0 above
#> 6 0.633 0.657 0.0119 3 0.0188 14.8 15.0 1.41 overlap
#> 7 0.666 0.688 0.0109 4 0.025 15.0 15.2 1.08 above
#> 8 0.718 0.718 0 1 0.00625 15.0 15.0 0 overlap
#> 9 0.759 0.759 0 2 0.0125 15.2 15.2 0 above
#> 10 0.831 0.831 0 1 0.00625 14.5 14.5 0 overlap
```

For numeric variables, the confidence regions summary has one row for each consecutive sequence of x values that have the same status: all values in the region are below the middle irrelevance band, they overlap the band, or they are all above the band. Here are the summary components:

- `start_x` is the first and `end_x` is the last x value in the sequence. `start_y` is the y value that corresponds to `start_x` while `end_y` corresponds to `end_x`.
- `n` is the number of data elements in the sequence; `n_pct` is the percentage of total data elements out of the total number.
- `x_span` is the length of x of the sequence that has the same confidence status. However, so that it may be comparable across variables with different units of x, `x_span` is expressed as a percentage of the full domain of x values.
- `trend` is the average slope from the point `(start_x, start_y)` to `(end_x, end_y)`. Because only the start and end points are used to calculate `trend`, it does not reflect any ups and downs that might occur between those two points. Since the various x values in a dataset are on different scales, the scales of the x and y values in calculating the `trend` are normalized on a scale of 100 each so that the trends for all variables are directly comparable. A positive `trend` means that, on average, y increases with x; a negative `trend` means that, on average, y decreases with x; a zero `trend` means that y has the same value at its start and end points; this is always the case if there is only one point in the indicated sequence.
- `relative_to_mid` is the key information here. It indicates if all the values in sequence from `start_x` to `end_x` are below, overlapping, or above the ALER band:
  - below: the higher limit of the confidence interval of ALE y (`ale_y_hi`) is below the lower limit of the ALER band.
  - above: the lower limit of the confidence interval of ALE y (`ale_y_lo`) is above the higher limit of the ALER band.
  - overlap: neither of the first two conditions holds; that is, the confidence region from `ale_y_lo` to `ale_y_hi` at least partially overlaps the ALER band.
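The `relative_to_mid` rules and the grouping of consecutive x values into regions can be sketched with base R. The helper name `classify_relative_to_mid`, the band limits, and the confidence limits below are all made up for illustration; the package's own procedure may differ in its details.

```r
# Hypothetical helper: classify each ALE confidence interval against a band
classify_relative_to_mid <- function(ale_y_lo, ale_y_hi, band_lo, band_hi) {
  status <- ifelse(
    ale_y_hi < band_lo, 'below',
    ifelse(ale_y_lo > band_hi, 'above', 'overlap')
  )
  factor(status, levels = c('below', 'overlap', 'above'), ordered = TRUE)
}

# Made-up confidence limits for five consecutive x values
ale_y_lo <- c(5.6, 8.7, 11.6, 12.0, 14.2)
ale_y_hi <- c(9.0, 11.0, 13.4, 13.9, 15.5)

status <- classify_relative_to_mid(
  ale_y_lo, ale_y_hi,
  band_lo = 12.2, band_hi = 13.6  # made-up band limits
)

# Consecutive x values with the same status form one confidence region
regions <- rle(as.character(status))
data.frame(relative_to_mid = regions$values, n_x_values = regions$lengths)
```

With these inputs, the first two x values form a "below" region, the next two an "overlap" region, and the last one an "above" region.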

These results tell us that, for `mean_ses`, from -1.19 to -1.04, ALE is below the median band from 6.39 to 7.66. From -0.792 to -0.588, ALE overlaps the median band from 10.5 to 11.2. From -0.517 to -0.517, ALE is below the median band from 11.3 to 11.3. From -0.511 to 0.569, ALE overlaps the median band from 11.4 to 14.6. A few other regions briefly exceeded the ALER band.

Interestingly, most of the text of the previous paragraph was generated automatically by an internal (unexported) function, `ale:::summarize_conf_regions_in_words`. (Since the function is not exported, you must use `ale:::` with three colons, not just two, if you want to access it.)

```
ale:::summarize_conf_regions_in_words(mb_gam_math$ale$conf_regions$by_term$mean_ses)
#> [1] "From -1.19 to -1.04, ALE is below the median band from 6.39 to 7.66. From -0.792 to -0.588, ALE overlaps the median band from 10.5 to 11.2. From -0.517 to -0.517, ALE is below the median band from 11.3 to 11.3. From -0.511 to 0.569, ALE overlaps the median band from 11.4 to 14.6. From 0.617 to 0.617, ALE is above the median band from 14.8 to 14.8. From 0.633 to 0.657, ALE overlaps the median band from 14.8 to 15. From 0.666 to 0.688, ALE is above the median band from 15 to 15.2. From 0.718 to 0.718, ALE overlaps the median band from 15 to 15. From 0.759 to 0.759, ALE is above the median band from 15.2 to 15.2. From 0.831 to 0.831, ALE overlaps the median band from 14.5 to 14.5."
```

While the wording is rather mechanical, it nonetheless illustrates the potential value of being able to summarize the inferentially relevant conclusions in tabular form.
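To show how mechanical such a summary is, here is a minimal sketch that produces similar sentences from a confidence regions table. It is our own simplified stand-in, not the source of `ale:::summarize_conf_regions_in_words`; the name `describe_regions` is hypothetical, and the input re-enters the first two rows of the `mean_ses` table printed above.

```r
# Hypothetical, simplified stand-in for the package's internal function
describe_regions <- function(regions) {
  verb <- ifelse(
    regions$relative_to_mid == 'overlap',
    'overlaps',
    paste('is', regions$relative_to_mid)
  )
  paste0(
    'From ', regions$start_x, ' to ', regions$end_x,
    ', ALE ', verb, ' the median band from ',
    regions$start_y, ' to ', regions$end_y, '.',
    collapse = ' '
  )
}

# First two rows of the mean_ses confidence regions printed above
regions <- data.frame(
  start_x = c(-1.19, -0.792),
  end_x   = c(-1.04, -0.588),
  start_y = c(6.39, 10.5),
  end_y   = c(7.66, 11.2),
  relative_to_mid = c('below', 'overlap'),
  stringsAsFactors = FALSE
)

describe_regions(regions)
```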

Confidence region summary tables are available not only for numeric but also for categorical variables, as we see with `public`. Here is its ALE plot again:

And here is its confidence regions summary table:

```
mb_gam_math$ale$conf_regions$by_term$public
#> # A tibble: 2 × 5
#> x n n_pct y relative_to_mid
#> <ord> <int> <dbl> <dbl> <ord>
#> 1 FALSE 70 0.438 13.4 overlap
#> 2 TRUE 90 0.562 12.7 overlap
```

Since we have categories here, there are no start or end positions and there is no trend. We instead have each `x` category and its single ALE `y` value, with the `n` and `n_pct` of the respective category and `relative_to_mid` as before to indicate whether the indicated category is below, overlaps with, or is above the ALER band.

Again with the help of `ale:::summarize_conf_regions_in_words`, these results tell us that, for `public`, for FALSE, the ALE of 13.4 overlaps the ALER band. For TRUE, the ALE of 12.7 overlaps the ALER band.

Again, our random variable `rand_norm` is particularly interesting. Here is its ALE plot:

And here is its confidence regions summary table:

```
mb_gam_math$ale$conf_regions$by_term$rand_norm
#> # A tibble: 1 × 9
#> start_x end_x x_span n n_pct start_y end_y trend relative_to_mid
#> <dbl> <dbl> <dbl> <int> <dbl> <dbl> <dbl> <dbl> <ord>
#> 1 -2.58 2.26 1 160 1 12.7 12.3 -0.0268 overlap
```

Despite any apparent pattern, we see that from -2.58 to 2.26, ALE overlaps the median band from 12.7 to 12.3. So, despite the random highs and lows in the bootstrap confidence interval, there is no reason to suppose that the random variable has any effect anywhere in its domain.

We can conveniently summarize all the confidence regions from all variables that are statistically significant or meaningful by accessing the `conf_regions$significant` element:

```
mb_gam_math$ale$conf_regions$significant
#> # A tibble: 11 × 12
#> term x start_x end_x x_span n n_pct y start_y end_y trend
#> <chr> <chr> <dbl> <dbl> <dbl> <int> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 mean_s… <NA> -1.19 -1.04 0.0718 2 0.0125 NA 6.39 7.66 1.19
#> 2 mean_s… <NA> -0.517 -0.517 0 2 0.0125 NA 11.3 11.3 0
#> 3 mean_s… <NA> 0.617 0.617 0 1 0.00625 NA 14.8 14.8 0
#> 4 mean_s… <NA> 0.666 0.688 0.0109 4 0.025 NA 15.0 15.2 1.08
#> 5 mean_s… <NA> 0.759 0.759 0 2 0.0125 NA 15.2 15.2 0
#> 6 minori… <NA> 0 0 0 20 0.125 NA 14.6 14.6 0
#> 7 minori… <NA> 0.0213 0.0213 0 1 0.00625 NA 14.3 14.3 0
#> 8 minori… <NA> 0.0238 0.0238 0 1 0.00625 NA 14.3 14.3 0
#> 9 minori… <NA> 0.434 0.441 0.00721 3 0.0188 NA 11.4 11.1 -2.49
#> 10 minori… <NA> 0.511 0.511 0 2 0.0125 NA 10.3 10.3 0
#> 11 minori… <NA> 0.979 1 0.0213 8 0.05 NA 10.4 9.61 -2.39
#> # ℹ 1 more variable: relative_to_mid <ord>
```

This summary focuses only on the x variables that have meaningful ALE regions anywhere in their domain. We can also conveniently isolate which variables have any such meaningful region by extracting the unique values in the `term` column:
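A minimal sketch of that extraction follows. So that it runs on its own, it uses a small stand-in data frame with hypothetical term names (`var_a`, `var_b`); with the object created in this article, the same `unique()` call would presumably be applied to `mb_gam_math$ale$conf_regions$significant$term`.

```r
# Stand-in for the `significant` confidence regions table, with
# hypothetical term names, so that this snippet runs on its own
significant <- data.frame(
  term = c('var_a', 'var_a', 'var_b', 'var_b', 'var_b'),
  stringsAsFactors = FALSE
)

# Unique terms that have at least one meaningful region
meaningful_terms <- unique(significant$term)
meaningful_terms
```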

This is especially useful for analyses with dozens of variables; we can thus quickly isolate and focus on the most meaningful ones.