The `matrixNormal`

package in the Comprehensive R Archive
Network (CRAN) [1] consists of two different types of
functions: distribution and matrix ones. First, one can compute
densities, probabilities and quantiles of the Matrix Normal
Distribution. Second, one can perform useful but simple matrix
operations like creating identity matrices and matrices of ones,
calculating the trace of a matrix, and implementing the matrix operator
`vec`

().

You can install the released version of `matrixNormal`

from CRAN[1] with
`install.packages("matrixNormal")`

and can load the package
by:

```
> library(matrixNormal)
: 'matrixNormal'
Attaching package'package:base':
The following object is masked from
I
```

The matrix normal distribution is a generalization of the
multivariate normal distribution to matrix-valued random values. The
parameters consist of a *n x p* mean matrix **M**
and two positive-definite covariance matrices, one for rows
**U** and another for columns **V**. Suppose
that any *n x p* matrix \(A \sim
MatNorm_{n,p}(M, U, V)\). The mean of **A** is
**M**, and the variance of `vec(A)`

is the
Kronecker product of **V** and **U**.

The matrix normal distribution is a conjugate prior of the
coefficients used in multivariate regression. Suppose there are
*p* predictors for *k* dependent variables. In univariate
regression, only one dependent variable exists, so the conjugate prior
for the *k* regression coefficients is the multivariate normal
distribution. For multivariate regression, the extension of the
conjugate prior of the coefficient matrix is the matrix normal
distribution. This package also contains the PDF, CDF, and random number
generation for the matrix normal distribution (*matnorm). See
matrixNormal_Distribution file for more information.

For instance, the USArrests dataset in the dataset package examines statistics in arrests per 100,000 residents for assault, murder, and rape in each of 50 states since 1973. Suppose that a researcher wants to determine whether the outcomes of assault and murder rates are associated with urban population and rape. The researcher then decides to use a Bayesian multiple linear regression and makes the assumption that the states are independent. However, the covariance between assault and murder is nonzero and needs to be taken into account. In fact, it has a correlation of 0.411 as given below.

```
> library(datasets)
> data(USArrests)
> X <- cbind(USArrests$Assault, USArrests$Murder)
> Y <- cbind(USArrests$UrbanPop, USArrests$Rape)
> cor(Y)
1] [,2]
[,1,] 1.0000000 0.4113412
[2,] 0.4113412 1.0000000 [
```

We can assume that the outcome **Y** follows a matrix
normal distribution with mean matrix *M*, which is the matrix
product of the coefficient matrix, \(\Psi\), times **X**. The
covariance across states is assumed to be independent, and the
covariance across the predictors be \(\Sigma\). Or succinctly, with \(k=2\) outcomes, \[ Y \sim MatNorm_{nx2}( M = X\Psi, U = I(n), V =
\Sigma_2) \]

of coefficient matrix times **X**, and covariance across
the predictors \(\Phi\). The overall
covariance matrix of **Y**, \(\Sigma\), is a block diagonal matrix of
\(\Phi\). For instance, suppose that Y
has the following distribution, and if we know the parameters, we can
calculate its density using `dmatnorm()`

.

```
> # Y is n = 50 x p = 2 that follows a matrix normal with mean matrix M, which is product
> # of
> M <- (100 * toeplitz(50:1))[, 1:2]
> dim(M)
1] 50 2
[> head(M)
1] [,2]
[,1,] 5000 4900
[2,] 4900 5000
[3,] 4800 4900
[4,] 4700 4800
[5,] 4600 4700
[6,] 4500 4600
[> U <- I(50) # Covariance across states: Assumed to be independent
> U[1:5, 1:5]
1 2 3 4 5
1 1 0 0 0 0
2 0 1 0 0 0
3 0 0 1 0 0
4 0 0 0 1 0
5 0 0 0 0 1
> V <- cov(X) # Covariance across predictors
> V
1] [,2]
[,1,] 6945.1657 291.06237
[2,] 291.0624 18.97047
[> # Find the density if Y has the density with these arguments.
> matrixNormal::dmatnorm(Y, M, U, V)
1] -30446783 [
```

The coefficient matrix, \(Psi\), for
urban population and rape has dimensions *2 x 3* for the k = 2
outcomes and the p = 3 predictors. A semi-conjugate prior can be
constructed on \(\Psi\) to be a Matrix
Normal distribution with mean \(\Psi_0\) as a matrix of ones, covariance
across the predictors as \(X'X\),
and with no covariance across states (due to independence). The
conjugate prior for the covariance between the outcomes \(\Sigma\) is the inverse-Wishart
distribution with mean covariance \(\Sigma_0\) and degrees of freedom \(\nu\). After defining the parameters, one
random matrix is generated from this prior.

```
> # Generate a random matrix from this prior. The prior mean of regression matrix
> J(2, 3)
1 2 3
1 1 1 1
2 1 1 1
> # The prior variance between rape and population
> t(X) %*% X
1] [,2]
[,1,] 1798262 80756.0
[2,] 80756 3962.2
[> # The prior variance between regression parameters
> I(3)
1 2 3
1 1 0 0
2 0 1 0
3 0 0 1
> # Random draw for prior would have these values
> A <- matrixNormal::rmatnorm(M = J(2, 3), U = t(X) %*% X, V = I(3))
> A
1 2 3
1,] 1035.447098 588.58263 927.25828
[2,] -3.775848 45.20069 11.69251
[> # Predicted Counts for y can be given as:
> ceiling(rowSums(X %*% A))
1] 602806 671520 750509 485213 704634 520883 280817 607520 855500 539247
[11] 117641 306293 635824 288678 142989 293717 278606 636089 211869 765987
[21] 380376 651222 183837 661639 454608 278410 260460 643573 145535 406048
[31] 727723 648617 860475 114851 306543 385596 405916 270772 444105 712575
[41] 219613 480344 513484 306325 122579 398453 370150 206958 135357 411119 [
```

However, the predicted counts do not have much meaning because the
prior is uninformed from the data. We should use the posterior
distribution to predict **Y**. Iranmanesh 2010 [2] shows the posterior distributions to
be \[ \Psi | \Sigma, Y, X \sim MatNorm_{kxp}(
\frac{\hat{\Psi} +\Psi_0}{2}, (2X^TX)^{-1}, \Sigma)\] and \[ \Sigma | Y, X \sim W^{-1}_p(S^*,
2\nu+n)\] where:

* \(\hat{\Psi}\) is the maximum
likelihood estimate (MLE) for the coefficient \(\Psi\) matrix. * *S* is the MLE for
the covariance matrix \(\Sigma\): \((Y-X\cdot\hat{\Psi})^T*(Y-X\cdot\hat{\Psi})\)
* \(S^{*}\) is the data-adjusted matrix
of inverse Wishart, \[ S^{*} = 2\Sigma_0 + S
+ (\hat{\Psi}-\Psi_0)'(2X^TX)^{-1}(\hat{\Psi}-\Psi_0) \] *
*c* is the dimension of \(\Sigma\) * *n* is the sample size,
the number of rows in **Y** and **X**.

Additional details can be found in Pocuca et al. 2019 [3]. At any rate, the
`matrixNormal`

package can be used in Bayesian Multivariate
Linear Regression.

In the `matrixNormal`

package, useful but simple matrix
operations have been coded:

Filename | Description |
---|---|

is.symmetric.matrix | Is a matrix square? symmetric? positive definite? or positive semi-definite? A tolerance is included here. |

Special_matrices | Creates the identity matrix I and matrix of 1’s
J. |

tr | Calculates the usual trace of a matrix |

vec | Stacks a matrix using matrix operator vec() and has option to keep names. |

vech | Stacks elements of a numeric symmetric matrix A in lower triangular only (using half vectorization, vech()). |

For example, these functions can be applied to the following matrices.

```
> # Make a 3 x 3 Identity matrix
> I(3)
1 2 3
1 1 0 0
2 0 1 0
3 0 0 1
> # Make a 3 x 4 J matrix
> J(3, 4)
1 2 3 4
1 1 1 1 1
2 1 1 1 1
3 1 1 1 1
> # Make a 3 x 3 J matrix
> J(3, 3)
1 2 3
1 1 1 1
2 1 1 1
3 1 1 1
> # Calculate the trace of a J matrix
> tr(J(3, 3)) # Should be 3
1] 3
[> # Stack a matrix (used in distribution functions)
> A <- matrix(c(1:4), nrow = 2, dimnames = list(NULL, c("A", "B")))
> A
A B1,] 1 3
[2,] 2 4
[> vec(A)
1:A 2:A 1:B 2:B
1 2 3 4
> # Test if matrix is symmetric (used in distribution function)
> is.symmetric.matrix(A)
1] "A is not symmetric. Top of the matrix: "
[
A B1,] 1 3
[2,] 2 4
[1] FALSE [
```

Although other packages on CRAN have some matrixNormal functionality,
this package provides a general approach in randomly sampling a matrix
normal random variate. The `MBSP::matrix_normal`

,
`matrixsampling::rmatrixnormal`

, and
`LaplacesDemon::rmatrixnorm`

functions also randomly sample
from the matrix normal distribution [4–6]. The function in `MBSP`

uses Cholesky decomposition of individual matrices **U**
and **V** [4]. Similarly, the function in matrixsampling
uses the Spectral decomposition of the individual matrices
**U** and **V** [5]. Comparatively, the new function,
`rmatnorm()`

, in `matrixNormal`

package is
flexible in the decomposition of the covariance matrix, which is
Kronecker product of **U** and **V**. While
you can simulate many samples using the `rmatrixnormal()`

function from `matrixsampling`

package, the new
`rmatnorm`

() function only generates one variate, but can
generate many samples by placing `rmatnorm`

() function inside
a for-loop. The `LaplacesDemon`

package also has a density function, `dmatrixnorm`

(), which
calculates the log determinant of Cholesky decomposition of a positive
definite matrix [6]. However, the random matrix
**A** that follows a matrix normal distribution does not
need to be positive definite [2]. There is no such restriction on the
random matrix in `matrixNormal`

package.

In conclusion, the `matrixNormal`

package collects all
forms of the Matrix Normal Distribution in one place: calculating the
PDF and CDF of the Matrix Normal distribution and simulating a random
variate from this distribution. The package allows the users to be
flexible in finding the random variate. Its main application in using
this package is the Bayesian multivariate regression.

This vignette is successfully processed using the following.

```
-- Session info ---------------------------------------------------
setting value
version R version 4.2.1 (2022-06-23)
os macOS Big Sur ... 10.16
system x86_64, darwin17.0
ui X11
language (EN)
collate C
ctype en_US.UTF-8
tz America/New_York
date 2022-09-15
pandoc 2.18 @ /Applications/RStudio.app/Contents/MacOS/quarto/bin/tools/ (via rmarkdown)
-- Packages -------------------------------------------------------
package * version date (UTC) lib source
LaplacesDemon 16.1.6 2021-07-09 [3] CRAN (R 4.2.0)
matrixcalc 1.0-6 2022-09-14 [3] CRAN (R 4.2.1)
matrixsampling 2.0.0 2019-08-24 [3] CRAN (R 4.2.0)
MBSP 3.0 2022-08-11 [3] CRAN (R 4.2.0)
[1] /private/var/folders/55/n1q58r751yd7x4vqzdc8zk_w0000gn/T/RtmpahjkD8/Rinstbfa639caf962
[2] /private/var/folders/55/n1q58r751yd7x4vqzdc8zk_w0000gn/T/Rtmp5FThaP/temp_libpatha97e5e713adf
[3] /Library/Frameworks/R.framework/Versions/4.2/Resources/library
```

1.

R
Core Team *R: A
Language and Environment for Statistical
Computing*; R Foundation for Statistical
Computing: Vienna, Austria, 2018;

2.

Iranmanesh, A.; Arashi, M.; Tabatabaey, S.M.M.
On Conditional
Applications of Matrix Variate Normal
Distribution. *IJMSI* **2010**,
*5*, 33–43.

3.

Pocuca, N.; Gallaugher, M.P.B.; Clark, K.M.;
McNicholas, P.D. Assessing and visualizing matrix variate normality.
*arXiv: Methodology* **2019**.

4.

Bai,
R.; Ghosh, M. *MBSP:
Multivariate Bayesian Model with Shrinkage
Priors*; 2018;

5.

Laurent, S. *Matrixsampling:
Simulations of Matrix Variate
Distributions*; 2018;

6.

Statisticat; LLC. *LaplacesDemon:
Complete Environment for Bayesian
Inference*; Bayesian-Inference.com,
2018;