2. Simultaneous Non-negative Matrix Factorization (siNMF and jNMF)

Koki Tsuyuzaki

Laboratory for Bioinformatics Research, RIKEN Center for Biosystems Dynamics Research
k.t.the-answer@hotmail.co.jp

Itoshi Nikaido

Laboratory for Bioinformatics Research, RIKEN Center for Biosystems Dynamics Research

2023-06-22

Introduction

In this vignette, we consider approximating multiple non-negative matrices as products of non-negative low-rank matrices (also known as factor matrices).

Test data are available from toyModel. Here, we prepare two datasets to simulate different situations.

library("nnTensor")
X_Easy <- toyModel("siNMF_Easy")
X_Hard <- toyModel("siNMF_Hard")

In the easy case, you will see that there are three blocks in each matrix. The data are set up so that the blocks with large values occupy common row positions across the matrices.

layout(t(1:3))
image(X_Easy$X1, main="X1")
image(X_Easy$X2, main="X2")
image(X_Easy$X3, main="X3")

In the difficult case, you will see that there are three blocks in each matrix, but there are also some smaller blocks that model specific patterns not shared across the data matrices.

layout(t(1:3))
image(X_Hard$X1, main="X1")
image(X_Hard$X2, main="X2")
image(X_Hard$X3, main="X3")

siNMF

To decompose \(K\) non-negative matrices (\(X_{1}, X_{2}, \ldots, X_{K}\)) simultaneously, simultaneous Non-negative Matrix Factorization (siNMF (Badea 2008; Zhang 2012; Yilmaz 2010)) can be applied. siNMF approximates the \(k\)-th non-negative data matrix \(X_{k}\) (\(N \times M_{k}\)) as the matrix product of \(W\) (\(N \times J\)) and \(H_{k}\) (\(J \times M_{k}\)) as follows.

\[ X_{k} \approx W H_{k} \ \mathrm{s.t.}\ W \geq 0, H_{k} \geq 0\ (k=1 \ldots K) \]

Note that \(W\) is shared by the \(K\) data matrices, whereas each \(H_{k}\) is specific to the \(k\)-th data matrix.
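As a quick sanity check on the model, the following base-R sketch (independent of nnTensor; all names and sizes are illustrative) builds noiseless data that follow the siNMF model exactly and confirms that a single shared \(W\) reconstructs every \(X_{k}\):

```r
set.seed(1)
N <- 10; J <- 3; M <- c(8, 6)  # two data matrices sharing N rows

W <- matrix(runif(N * J), nrow = N)  # shared factor matrix (N x J)
H <- lapply(M, function(m) matrix(runif(J * m), nrow = J))  # specific (J x M_k)

# Noiseless data generated from the siNMF model X_k = W H_k
X <- lapply(H, function(Hk) W %*% Hk)

# Each X_k is exactly reproduced by the shared W and its own H_k
max(abs(X[[1]] - W %*% H[[1]]))  # 0
sapply(X, dim)                   # 10 x 8 and 10 x 6
```

Note that in the nnTensor output below, each `H[[k]]` is stored as an \(M_{k} \times J\) matrix, which is why the later reconstruction code multiplies by `t(H[[k]])`.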

siNMF can be performed as follows.

set.seed(123456)
out_siNMF_Easy <- siNMF(X_Easy, algorithm="KL", J=3)
str(out_siNMF_Easy, 2)
## List of 6
##  $ W            : num [1:100, 1:3] 0.0015 0.0016 0.00177 0.00164 0.00156 ...
##  $ H            :List of 3
##   ..$ : num [1:300, 1:3] 4.24 4.4 4.7 5.9 4.2 ...
##   ..$ : num [1:200, 1:3] 2.38e-07 4.08e-01 1.31 1.02 1.48 ...
##   ..$ : num [1:150, 1:3] 548 532 538 551 544 ...
##  $ RecError     : Named num [1:101] 1.00e-09 2.63e+04 1.04e+04 9.48e+03 8.26e+03 ...
##   ..- attr(*, "names")= chr [1:101] "offset" "1" "2" "3" ...
##  $ TrainRecError: Named num [1:101] 1.00e-09 2.63e+04 1.04e+04 9.48e+03 8.26e+03 ...
##   ..- attr(*, "names")= chr [1:101] "offset" "1" "2" "3" ...
##  $ TestRecError : Named num [1:101] 1e-09 0e+00 0e+00 0e+00 0e+00 0e+00 0e+00 0e+00 0e+00 0e+00 ...
##   ..- attr(*, "names")= chr [1:101] "offset" "1" "2" "3" ...
##  $ RelChange    : Named num [1:101] 1.00e-09 5.00e-01 1.53 9.60e-02 1.47e-01 ...
##   ..- attr(*, "names")= chr [1:101] "offset" "1" "2" "3" ...

As with NMF, the reconstruction error and the relative change indicate that the optimization has converged.

layout(t(1:2))
plot(log10(out_siNMF_Easy$RecError[-1]), type="b", main="Reconstruction Error")
plot(log10(out_siNMF_Easy$RelChange[-1]), type="b", main="Relative Change")
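The relative change plotted above can be reproduced from the error sequence itself. The sketch below uses illustrative values in place of `RecError` and one common definition of relative change, \(|e_{t-1} - e_{t}| / e_{t}\); nnTensor's internal definition may differ in details:

```r
# Toy decreasing error sequence standing in for RecError (illustrative values)
rec_error <- c(26300, 10400, 9480, 8260, 8100)

# Relative change between consecutive errors: |e[t-1] - e[t]| / e[t]
rel_change <- abs(head(rec_error, -1) - tail(rec_error, -1)) / tail(rec_error, -1)
round(rel_change, 3)  # 1.529 0.097 0.148 0.020

# A converged run shows a monotonically decreasing error
all(diff(rec_error) < 0)  # TRUE
```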

From the common factor matrix \(W\) and specific matrices \(H_{1}\), \(H_{2}\), and \(H_{3}\), each matrix can be reconstructed as follows.

recX1 <- out_siNMF_Easy$W %*% t(out_siNMF_Easy$H[[1]])
recX2 <- out_siNMF_Easy$W %*% t(out_siNMF_Easy$H[[2]])
recX3 <- out_siNMF_Easy$W %*% t(out_siNMF_Easy$H[[3]])

layout(rbind(1:3, 4:6))
image(X_Easy$X1, main="Original Data\n(X1)")
image(X_Easy$X2, main="Original Data\n(X2)")
image(X_Easy$X3, main="Original Data\n(X3)")
image(recX1, main="Reconstructed Data\n(X1, siNMF)")
image(recX2, main="Reconstructed Data\n(X2, siNMF)")
image(recX3, main="Reconstructed Data\n(X3, siNMF)")
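The visual comparison above can be complemented by a simple numeric summary. The helper below (an illustrative sketch, base R only; `rel_frob_error` is not part of nnTensor) computes the relative Frobenius error between an original and a reconstructed matrix, which could be applied to each `X_Easy$Xk` and `recXk` pair:

```r
# Relative Frobenius reconstruction error: ||X - recX||_F / ||X||_F
rel_frob_error <- function(X, recX) {
  sqrt(sum((X - recX)^2)) / sqrt(sum(X^2))
}

# Toy example: a rank-1 matrix reconstructed exactly has zero error
X    <- outer(1:4, 1:5)
recX <- matrix(1:4, ncol = 1) %*% matrix(1:5, nrow = 1)
rel_frob_error(X, recX)  # 0
```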

siNMF implicitly assumes that all data matrices contain the same common patterns. In real data, however, this assumption is often too strong. For example, when siNMF is applied to the difficult case above, the detection of common patterns is hampered by the influence of the dataset-specific patterns.

set.seed(123456)
out_siNMF_Hard <- siNMF(X_Hard, algorithm="KL", J=3)
str(out_siNMF_Hard, 2)
## List of 6
##  $ W            : num [1:100, 1:3] 0.00116 0.00106 0.00138 0.0013 0.00116 ...
##  $ H            :List of 3
##   ..$ : num [1:300, 1:3] 3.25 3.41 3.64 4.48 3.16 ...
##   ..$ : num [1:200, 1:3] 6.04e-10 1.31e-07 1.32 4.78e-01 8.62e-01 ...
##   ..$ : num [1:150, 1:3] 516 529 536 520 522 ...
##  $ RecError     : Named num [1:101] 1.00e-09 3.38e+04 1.01e+04 9.69e+03 9.04e+03 ...
##   ..- attr(*, "names")= chr [1:101] "offset" "1" "2" "3" ...
##  $ TrainRecError: Named num [1:101] 1.00e-09 3.38e+04 1.01e+04 9.69e+03 9.04e+03 ...
##   ..- attr(*, "names")= chr [1:101] "offset" "1" "2" "3" ...
##  $ TestRecError : Named num [1:101] 1e-09 0e+00 0e+00 0e+00 0e+00 0e+00 0e+00 0e+00 0e+00 0e+00 ...
##   ..- attr(*, "names")= chr [1:101] "offset" "1" "2" "3" ...
##  $ RelChange    : Named num [1:101] 1.00e-09 5.86e-01 2.35 4.12e-02 7.19e-02 ...
##   ..- attr(*, "names")= chr [1:101] "offset" "1" "2" "3" ...
recX1 <- out_siNMF_Hard$W %*% t(out_siNMF_Hard$H[[1]])
recX2 <- out_siNMF_Hard$W %*% t(out_siNMF_Hard$H[[2]])
recX3 <- out_siNMF_Hard$W %*% t(out_siNMF_Hard$H[[3]])
layout(rbind(1:3, 4:6))
image(X_Hard$X1, main="Original Data\n(X1)")
image(X_Hard$X2, main="Original Data\n(X2)")
image(X_Hard$X3, main="Original Data\n(X3)")
image(recX1, main="Reconstructed Data\n(X1, siNMF)")
image(recX2, main="Reconstructed Data\n(X2, siNMF)")
image(recX3, main="Reconstructed Data\n(X3, siNMF)")

jNMF

To overcome the above weakness of siNMF, we next introduce joint NMF (jNMF (Yang 2016)). In jNMF, a common factor matrix \(W\), dataset-specific factor matrices \(V_{1}, V_{2}, \ldots, V_{K}\) (where \(K\) is the number of matrices), and coefficient matrices \(H_{1}, H_{2}, \ldots, H_{K}\) are estimated simultaneously.

\[ X_{k} \approx (W + V_{k})\ H_{k} \ \mathrm{s.t.}\ W \geq 0, V_{k} \geq 0, H_{k} \geq 0\ (k=1 \ldots K) \]
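Because matrix multiplication distributes over the sum, the jNMF model splits each reconstruction into a common part \(W H_{k}\) and a specific part \(V_{k} H_{k}\); the common/specific plots later in this vignette rely on exactly this split. A minimal base-R check (all names and sizes are illustrative):

```r
set.seed(2)
N <- 10; J <- 3; Mk <- 7

W  <- matrix(runif(N * J), nrow = N)   # common factor matrix
Vk <- matrix(runif(N * J), nrow = N)   # specific factor matrix for dataset k
Hk <- matrix(runif(J * Mk), nrow = J)

full     <- (W + Vk) %*% Hk            # jNMF model for X_k
common   <- W %*% Hk                   # common pattern
specific <- Vk %*% Hk                  # dataset-specific pattern

# The full reconstruction is the sum of the two parts
max(abs(full - (common + specific)))   # ~0 (up to floating-point error)
```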

jNMF can be performed as follows.

out_jNMF_Hard <- jNMF(X_Hard, algorithm="KL", J=3)
str(out_jNMF_Hard, 2)
## List of 7
##  $ W            : num [1:100, 1:3] 0.00297 0.00297 0.0035 0.00364 0.00325 ...
##  $ V            :List of 3
##   ..$ : num [1:100, 1:3] 0.0599 0.0584 0.0591 0.057 0.0601 ...
##   ..$ : num [1:100, 1:3] 4.97e-53 1.93e-33 3.08e-24 5.15e-59 3.21e-54 ...
##   ..$ : num [1:100, 1:3] 1.50e-49 5.50e-53 4.88e-42 1.23e-54 1.39e-46 ...
##  $ H            :List of 3
##   ..$ : num [1:300, 1:3] 10.34 12.12 7.08 11.53 10.8 ...
##   ..$ : num [1:200, 1:3] 539 539 537 545 535 ...
##   ..$ : num [1:150, 1:3] 2.38 11.86 10.39 7.77 5.85 ...
##  $ RecError     : Named num [1:101] 1.00e-09 4.76e+07 1.86e+04 1.33e+04 1.09e+04 ...
##   ..- attr(*, "names")= chr [1:101] "offset" "1" "2" "3" ...
##  $ TrainRecError: Named num [1:101] 1.00e-09 4.76e+07 1.86e+04 1.33e+04 1.09e+04 ...
##   ..- attr(*, "names")= chr [1:101] "offset" "1" "2" "3" ...
##  $ TestRecError : Named num [1:101] 1e-09 0e+00 0e+00 0e+00 0e+00 0e+00 0e+00 0e+00 0e+00 0e+00 ...
##   ..- attr(*, "names")= chr [1:101] "offset" "1" "2" "3" ...
##  $ RelChange    : Named num [1:101] 1.00e-09 1.00 2.56e+03 3.95e-01 2.20e-01 ...
##   ..- attr(*, "names")= chr [1:101] "offset" "1" "2" "3" ...

Compared to siNMF, jNMF successfully avoids the influence of the specific patterns and focuses only on the common patterns.

recX1_common <- out_jNMF_Hard$W %*% t(out_jNMF_Hard$H[[1]])
recX2_common <- out_jNMF_Hard$W %*% t(out_jNMF_Hard$H[[2]])
recX3_common <- out_jNMF_Hard$W %*% t(out_jNMF_Hard$H[[3]])

recX1_specific <- out_jNMF_Hard$V[[1]] %*% t(out_jNMF_Hard$H[[1]])
recX2_specific <- out_jNMF_Hard$V[[2]] %*% t(out_jNMF_Hard$H[[2]])
recX3_specific <- out_jNMF_Hard$V[[3]] %*% t(out_jNMF_Hard$H[[3]])

layout(rbind(1:3, 4:6, 7:9))
image(X_Hard$X1, main="Original Data\n(X1)")
image(X_Hard$X2, main="Original Data\n(X2)")
image(X_Hard$X3, main="Original Data\n(X3)")
image(recX1_common, main="Reconstructed Data\n(Common Pattern in X1, jNMF)")
image(recX2_common, main="Reconstructed Data\n(Common Pattern in X2, jNMF)")
image(recX3_common, main="Reconstructed Data\n(Common Pattern in X3, jNMF)")
image(recX1_specific, main="Reconstructed Data\n(Specific Pattern in X1, jNMF)")
image(recX2_specific, main="Reconstructed Data\n(Specific Pattern in X2, jNMF)")
image(recX3_specific, main="Reconstructed Data\n(Specific Pattern in X3, jNMF)")

Session Information

## R version 4.3.0 (2023-04-21)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 22.04.2 LTS
## 
## Matrix products: default
## BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3 
## LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.20.so;  LAPACK version 3.10.0
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=C              
##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
## 
## time zone: Etc/UTC
## tzcode source: system (glibc)
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
## [1] nnTensor_1.2.0
## 
## loaded via a namespace (and not attached):
##  [1] viridis_0.6.3      sass_0.4.6         utf8_1.2.3         generics_0.1.3    
##  [5] tcltk_4.3.0        digest_0.6.31      magrittr_2.0.3     evaluate_0.21     
##  [9] grid_4.3.0         RColorBrewer_1.1-3 fastmap_1.1.1      maps_3.4.1        
## [13] jsonlite_1.8.5     misc3d_0.9-1       gridExtra_2.3      fansi_1.0.4       
## [17] spam_2.9-1         viridisLite_0.4.2  scales_1.2.1       jquerylib_0.1.4   
## [21] cli_3.6.1          rlang_1.1.1        munsell_0.5.0      withr_2.5.0       
## [25] cachem_1.0.8       yaml_2.3.7         tools_4.3.0        dplyr_1.1.2       
## [29] colorspace_2.1-0   ggplot2_3.4.2      vctrs_0.6.2        R6_2.5.1          
## [33] lifecycle_1.0.3    plot3D_1.4         MASS_7.3-60        pkgconfig_2.0.3   
## [37] rTensor_1.4.8      pillar_1.9.0       bslib_0.5.0        gtable_0.3.3      
## [41] glue_1.6.2         Rcpp_1.0.10        fields_14.1        xfun_0.39         
## [45] tibble_3.2.1       tidyselect_1.2.0   highr_0.10         knitr_1.43        
## [49] farver_2.1.1       htmltools_0.5.5    rmarkdown_2.22     labeling_0.4.2    
## [53] tagcloud_0.6       dotCall64_1.0-2    compiler_4.3.0

References

Badea, L. 2008. “Extracting Gene Expression Profiles Common to Colon and Pancreatic Adenocarcinoma Using Simultaneous Nonnegative Matrix Factorization.” Pacific Symposium on Biocomputing, 279–90.
Yang, Z. et al. 2016. “A Non-Negative Matrix Factorization Method for Detecting Modules in Heterogeneous Omics Multi-Modal Data.” Bioinformatics 32(1): 1–8.
Yilmaz, Y. K. et al. 2010. “Probabilistic Latent Tensor Factorization.” LVA/ICA 2010, 346–53.
Zhang, S. et al. 2012. “Discovery of Multi-Dimensional Modules by Integrative Analysis of Cancer Genomic Data.” Nucleic Acids Research 40(19): 9379–91.