This vignette describes a new feature to **BGGM** (`2.0.0`

) that allows for computing network predictability for binary and ordinal data. Currently the available option is Bayesian \(R^2\) (Gelman et al. 2019).

The first example looks at Binary data, consisting of 1190 observations and 6 variables. The data are called `women_math`

and the variable descriptions are provided in **BGGM**.

The model is estimated with

and then predictability is computed

```
r2 <- predictability(fit)
# print
r2
#> BGGM: Bayesian Gaussian Graphical Models
#> ---
#> Metric: Bayes R2
#> Type: binary
#> ---
#> Estimates:
#>
#> Node Post.mean Post.sd Cred.lb Cred.ub
#> 1 0.016 0.012 0.002 0.046
#> 2 0.103 0.023 0.064 0.150
#> 3 0.155 0.030 0.092 0.210
#> 4 0.160 0.021 0.118 0.201
#> 5 0.162 0.022 0.118 0.202
#> 6 0.157 0.028 0.097 0.208
#> ---
```

There are then two options for plotting. The first is with error bars, denoting the credible interval (i.e., `cred`

),

and the second is with a ridgeline plot

In the following, the `ptsd`

data is used (5-level Likert). The variable descriptions are provided in **BGGM**. This is based on the polychoric partial correlations, with \(R^2\) computed from the corresponding correlations (due to the correspondence between the correlation matrix and multiple regression).

The only change is switching type from `"binary`

to `ordinal`

. One important point is the `+ 1`

. This is required because for the ordinal approach the first category must be 1 (in `ptsd`

the first category is coded as 0).

```
r2 <- predictability(fit)
# print
r2
#> BGGM: Bayesian Gaussian Graphical Models
#> ---
#> Metric: Bayes R2
#> Type: ordinal
#> ---
#> Estimates:
#>
#> Node Post.mean Post.sd Cred.lb Cred.ub
#> 1 0.487 0.049 0.394 0.585
#> 2 0.497 0.047 0.412 0.592
#> 3 0.509 0.047 0.423 0.605
#> 4 0.524 0.049 0.441 0.633
#> 5 0.495 0.047 0.409 0.583
#> 6 0.297 0.043 0.217 0.379
#> 7 0.395 0.045 0.314 0.491
#> 8 0.250 0.042 0.173 0.336
#> 9 0.440 0.048 0.358 0.545
#> 10 0.417 0.044 0.337 0.508
#> 11 0.549 0.048 0.463 0.648
#> 12 0.508 0.048 0.423 0.607
#> 13 0.504 0.047 0.421 0.600
#> 14 0.485 0.043 0.411 0.568
#> 15 0.442 0.045 0.355 0.528
#> 16 0.332 0.039 0.257 0.414
#> 17 0.331 0.045 0.259 0.436
#> 18 0.423 0.044 0.345 0.510
#> 19 0.438 0.044 0.354 0.525
#> 20 0.362 0.043 0.285 0.454
#> ---
```

Here is the `error_bar`

plot.

Note that the plot object is a `ggplot`

which allows for further customization (e.g,. adding the variable names, a title, etc.).

It is quite common to compute predictability assuming that the data are Gaussian. In the context of Bayesian GGMs, this was introduced in (Williams 2018). This can also be implemented in **BGGM**.

`type`

is missing which indicates that `continuous`

is the default.

\(R^2\) for binary and ordinal data is computed for the underlying latent variables. This is also the case when `type = "mixed`

(a semi-parametric copula). In future releases, there will be support for predicting the variables on the observed scale.

Gelman, Andrew, Ben Goodrich, Jonah Gabry, and Aki Vehtari. 2019. “R-squared for Bayesian Regression Models.” *American Statistician* 73 (3): 307–9. https://doi.org/10.1080/00031305.2018.1549100.

Williams, Donald R. 2018. “Bayesian Estimation for Gaussian Graphical Models: Structure Learning, Predictability, and Network Comparisons.” *arXiv*. https://doi.org/10.31234/OSF.IO/X8DPR.