1 Welcome

Welcome to the SRTsim project! It is composed of an R package and a web application.

The web application allows you to design spatial patterns and generate SRT data with the patterns of interest.

2 Install SRTsim

R is an open-source statistical environment whose functionality can be extended through packages. SRTsim is an R package available via CRAN. R itself can be installed on any operating system from CRAN, after which you can install SRTsim by running the following command in your R session:

 install.packages("SRTsim")
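If the installation step is part of a script that runs repeatedly, it may be convenient to install only when the package is absent. The following is a minimal sketch; install_if_missing is a hypothetical helper, not part of SRTsim, and the stub installer is used only so the example can run without touching the network.

```r
# Hypothetical helper (not part of SRTsim): install packages only when absent.
install_if_missing <- function(pkgs, installer = utils::install.packages) {
  # requireNamespace() returns FALSE (quietly) for packages that are not installed
  present <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  absent  <- pkgs[!present]
  if (length(absent) > 0) installer(absent)
  invisible(absent)
}

# Dry run: the stub installer only reports what would be installed
install_if_missing(
  c("stats", "SRTsim"),
  installer = function(p) message("would install: ", paste(p, collapse = ", "))
)
```

Calling install_if_missing("SRTsim") with the default installer performs the real installation from CRAN.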

3 Run Reference-Based Simulation

To get started, please load the SRTsim package.

library("SRTsim")

Once the package is loaded, we can perform a reference-based tissue-wise simulation with the example data.

## explore example SRT data 
str(exampleLIBD)
#> List of 2
#>  $ count:Formal class 'dgCMatrix' [package "Matrix"] with 6 slots
#>   .. ..@ i       : int [1:241030] 1 2 8 9 10 11 13 14 15 16 ...
#>   .. ..@ p       : int [1:3612] 0 67 122 182 252 322 392 462 534 609 ...
#>   .. ..@ Dim     : int [1:2] 80 3611
#>   .. ..@ Dimnames:List of 2
#>   .. .. ..$ : chr [1:80] "ENSG00000175130" "ENSG00000159176" "ENSG00000168314" "ENSG00000080822" ...
#>   .. .. ..$ : chr [1:3611] "AAACAAGTATCTCCCA-1" "AAACAATCTACTAGCA-1" "AAACACCAATAACTGC-1" "AAACAGAGCGACTCCT-1" ...
#>   .. ..@ x       : num [1:241030] 1 1 1 7 10 1 5 2 1 1 ...
#>   .. ..@ factors : list()
#>  $ info :'data.frame':   3611 obs. of  6 variables:
#>   ..$ row     : int [1:3611] 50 3 59 14 43 47 73 61 45 42 ...
#>   ..$ col     : int [1:3611] 102 43 19 94 9 13 43 97 115 28 ...
#>   ..$ imagerow: num [1:3611] 381 126 428 187 341 ...
#>   ..$ imagecol: num [1:3611] 441 260 183 417 153 ...
#>   ..$ tissue  : Factor w/ 2 levels "0","1": 2 2 2 2 2 2 2 2 2 2 ...
#>   ..$ layer   : chr [1:3611] "Layer3" "Layer1" "WM" "Layer3" ...
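The count slot above is a dgCMatrix, a column-compressed sparse matrix from the Matrix package. Its Dim is 80 x 3611 (genes by spots), p has 3612 entries (one more than the number of columns), and i and x each have 241030 entries, one per non-zero count. The toy matrix below is an illustration only, not part of the example data; it shows how the three slots encode the non-zero entries.

```r
library(Matrix)

# Toy 3 x 3 matrix with three non-zero entries: (1,1)=5, (3,1)=2, (2,3)=7
m <- sparseMatrix(i = c(1, 3, 2), j = c(1, 1, 3), x = c(5, 2, 7), dims = c(3, 3))

m@i        # 0-based row index of each non-zero entry: 0 2 1
m@p        # column pointers into i and x: 0 2 2 3
m@x        # the non-zero values, column by column: 5 2 7
diff(m@p)  # number of non-zero entries per column: 2 0 1

# Fraction of entries that are non-zero
length(m@x) / prod(dim(m))
```

Applied to the str output above, the same ratio is 241030 / (80 * 3611), so roughly 83% of the entries in the example count matrix are non-zero.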

example_count   <- exampleLIBD$count
example_loc     <- exampleLIBD$info[,c("imagecol","imagerow","layer")]
colnames(example_loc) <- c("x","y","label")

## create a SRT object
simSRT  <- createSRT(count_in=example_count,loc_in =example_loc)
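Before building an object from your own data, it can help to confirm that the two inputs describe the same spots. The sketch below uses toy data and a hypothetical helper, check_srt_inputs, which is not part of SRTsim. It assumes that the location table needs x and y columns and that its row names should match the count matrix column names in the same order; whether createSRT enforces this itself is not stated here.

```r
# Hypothetical helper (not part of SRTsim): basic consistency checks on inputs.
check_srt_inputs <- function(count_in, loc_in) {
  stopifnot(
    all(c("x", "y") %in% colnames(loc_in)),          # coordinates are present
    ncol(count_in) == nrow(loc_in),                  # one location per spot
    identical(colnames(count_in), rownames(loc_in))  # same spots, same order
  )
  invisible(TRUE)
}

# Toy inputs: 2 genes x 3 spots, with matching spot names
toy_count <- matrix(
  c(0, 3, 1, 2, 0, 5), nrow = 2,
  dimnames = list(c("geneA", "geneB"), c("spot1", "spot2", "spot3"))
)
toy_loc <- data.frame(
  x = c(1, 2, 3), y = c(1, 1, 2), label = c("L1", "L1", "WM"),
  row.names = c("spot1", "spot2", "spot3")
)

check_srt_inputs(toy_count, toy_loc)
```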


## Set a seed for reproducible simulation
set.seed(1)

## Estimate model parameters for data generation
simSRT1 <- srtsim_fit(simSRT, sim_scheme = "tissue")

## Generate synthetic data with estimated parameters
simSRT1 <- srtsim_count(simSRT1)

## Explore the synthetic data
simCounts(simSRT1)[1:5,1:5]
#> 5 x 5 sparse Matrix of class "dgCMatrix"
#>                 AAACAAGTATCTCCCA-1 AAACAATCTACTAGCA-1 AAACACCAATAACTGC-1
#> ENSG00000175130                  .                  .                 10
#> ENSG00000159176                  1                  3                  5
#> ENSG00000168314                  1                  .                  6
#> ENSG00000080822                  .                  .                  3
#> ENSG00000091513                  .                  .                  5
#>                 AAACAGAGCGACTCCT-1 AAACAGCTTTCAGAAG-1
#> ENSG00000175130                  .                  2
#> ENSG00000159176                  .                  1
#> ENSG00000168314                  2                  1
#> ENSG00000080822                  1                  .
#> ENSG00000091513                  1                  3
simcolData(simSRT1)
#> DataFrame with 3611 rows and 3 columns
#>                            x         y       label
#>                    <numeric> <numeric> <character>
#> AAACAAGTATCTCCCA-1   440.639   381.098      Layer3
#> AAACAATCTACTAGCA-1   259.631   126.328      Layer1
#> AAACACCAATAACTGC-1   183.078   427.768          WM
#> AAACAGAGCGACTCCT-1   417.237   186.814      Layer3
#> AAACAGCTTTCAGAAG-1   152.700   341.269      Layer5
#> ...                      ...       ...         ...
#> TTGTTTCACATCCAGG-1   254.410   422.862          WM
#> TTGTTTCATTAGTCTA-1   217.147   433.393          WM
#> TTGTTTCCATACAACT-1   208.416   352.430      Layer6
#> TTGTTTGTATTACACG-1   250.720   503.735          WM
#> TTGTTTGTGTAAATTC-1   284.293   148.110      Layer2

We can also perform a reference-based domain-specific simulation with the same example data.


## Set a seed for reproducible simulation
set.seed(1)

## Estimate model parameters for data generation
simSRT2 <- srtsim_fit(simSRT,sim_scheme='domain')

## Generate synthetic data with estimated parameters
simSRT2 <- srtsim_count(simSRT2)

## Explore the synthetic data
simCounts(simSRT2)[1:5,1:5]
#> 5 x 5 sparse Matrix of class "dgCMatrix"
#>                 AAACAAGTATCTCCCA-1 AAACAATCTACTAGCA-1 AAACACCAATAACTGC-1
#> ENSG00000175130                  .                  .                 11
#> ENSG00000159176                  1                  2                  7
#> ENSG00000168314                  1                  .                  7
#> ENSG00000080822                  .                  .                  3
#> ENSG00000091513                  .                  .                  6
#>                 AAACAGAGCGACTCCT-1 AAACAGCTTTCAGAAG-1
#> ENSG00000175130                  .                  2
#> ENSG00000159176                  .                  1
#> ENSG00000168314                  2                  1
#> ENSG00000080822                  1                  .
#> ENSG00000091513                  2                  3
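The two schemes differ in how information is pooled across spots. The base R sketch below is a simplified illustration under stated assumptions, not SRTsim's internal model: it assumes that the "tissue" scheme estimates parameters for each gene from all spots together, that the "domain" scheme estimates them separately within each label, and it uses a plain Poisson mean as a stand-in for whatever count model the package actually fits. The toy gene and the domain names A and B are made up.

```r
set.seed(1)

# Toy reference: one gene measured at six spots in two hypothetical domains
ref   <- c(1, 2, 1, 9, 11, 10)
label <- c("A", "A", "A", "B", "B", "B")

# Tissue-wise: a single mean estimated from all spots
lambda_tissue <- mean(ref)
sim_tissue    <- rpois(length(ref), lambda_tissue)

# Domain-specific: one mean per domain, then sample each spot from its own domain
lambda_domain <- tapply(ref, label, mean)
sim_domain    <- rpois(length(ref), lambda_domain[label])

lambda_tissue  # pooled mean, 34 / 6
lambda_domain  # per-domain means: A = 4 / 3, B = 10
```

In this toy setting the pooled mean sits between the two domains, so tissue-wise sampling blurs the contrast between A and B, while domain-specific sampling keeps it.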

4 Comparison Between Reference Data and Synthetic Data

4.1 Summarized Metrics

After data generation, we can compare summary metrics of the reference data and the synthetic data.


## Compute metrics 
simSRT1   <- compareSRT(simSRT1)

## Visualize Metrics
visualize_metrics(simSRT1)
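The exact metrics computed by compareSRT are not listed here. As a rough illustration of what a reference-versus-synthetic comparison involves, the base R sketch below computes three common gene-wise summaries (mean, variance, fraction of zeros) on two toy count matrices. The helper gene_metrics and the toy data are hypothetical and do not reproduce compareSRT.

```r
set.seed(1)

# Toy "reference" and "synthetic" counts: 4 genes x 10 spots
dn  <- list(paste0("gene", 1:4), paste0("spot", 1:10))
ref <- matrix(rpois(40, lambda = 3), nrow = 4, dimnames = dn)
sim <- matrix(rpois(40, lambda = 3), nrow = 4, dimnames = dn)

# Hypothetical helper: per-gene summaries of a genes x spots count matrix
gene_metrics <- function(counts) {
  data.frame(
    mean      = rowMeans(counts),
    var       = apply(counts, 1, var),
    zero_frac = rowMeans(counts == 0)
  )
}

ref_m <- gene_metrics(ref)
sim_m <- gene_metrics(sim)

# Side-by-side gene means; close agreement suggests the simulation
# preserves this property of the reference
cbind(ref_mean = ref_m$mean, sim_mean = sim_m$mean)
```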

4.2 Expression Patterns For Genes of Interest

visualize_gene(simsrt=simSRT1,plotgn = "ENSG00000183036",rev_y=TRUE)

visualize_gene(simsrt=simSRT2,plotgn = "ENSG00000168314",rev_y=TRUE)

This work was done by Jiaqiang Zhu, Lulu Shang and Xiang Zhou.

5 Reproducibility

The SRTsim package was made possible thanks to:

Code for creating the vignette

## Create the vignette
library("rmarkdown")
system.time(render("SRTsim.Rmd"))

## Extract the R code
library("knitr")
knit("SRTsim.Rmd", tangle = TRUE)

Date the vignette was generated.

#> [1] "2023-01-02 18:00:00 EST"

Wallclock time spent generating the vignette.

#> Time difference of 18.865 secs
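A wall-clock figure like the one above can be obtained by recording the time at the start and end of a run. This is a minimal sketch; the variable names are assumptions, and Sys.sleep stands in for the actual work of rendering the vignette.

```r
# Record the start time, do the work, then take the difference
startTime <- Sys.time()
Sys.sleep(0.1)  # placeholder for the real computation
totalTime <- diff(c(startTime, Sys.time()))

# Prints a "Time difference of ... secs" line
round(totalTime, digits = 3)
```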

R session information.

#> ─ Session info ───────────────────────────────────────────────────────────────────────────────────────────────────────
#>  setting  value
#>  version  R version 4.2.0 (2022-04-22)
#>  os       macOS Big Sur/Monterey 10.16
#>  system   x86_64, darwin17.0
#>  ui       X11
#>  language (EN)
#>  collate  C
#>  ctype    en_US.UTF-8
#>  tz       America/New_York
#>  date     2023-01-02
#>  pandoc   2.18 @ /usr/local/bin/ (via rmarkdown)
#> 
#> ─ Packages ───────────────────────────────────────────────────────────────────────────────────────────────────────────
#>  package         * version   date (UTC) lib source
#>  abind             1.4-5     2016-07-21 [2] CRAN (R 4.2.0)
#>  assertthat        0.2.1     2019-03-21 [2] CRAN (R 4.2.0)
#>  backports         1.4.1     2021-12-13 [2] CRAN (R 4.2.0)
#>  bezier            1.1.2     2018-12-14 [2] CRAN (R 4.2.0)
#>  BiocGenerics      0.42.0    2022-04-26 [2] Bioconductor
#>  BiocManager       1.30.18   2022-05-18 [2] CRAN (R 4.2.0)
#>  BiocStyle       * 2.24.0    2022-04-26 [2] Bioconductor
#>  bookdown          0.26      2022-04-15 [2] CRAN (R 4.2.0)
#>  broom             0.8.0     2022-04-13 [2] CRAN (R 4.2.0)
#>  bslib             0.3.1     2021-10-06 [2] CRAN (R 4.2.0)
#>  car               3.0-13    2022-05-02 [2] CRAN (R 4.2.0)
#>  carData           3.0-5     2022-01-06 [2] CRAN (R 4.2.0)
#>  class             7.3-20    2022-01-16 [2] CRAN (R 4.2.0)
#>  classInt          0.4-3     2020-04-07 [2] CRAN (R 4.2.0)
#>  cli               3.4.1     2022-09-23 [2] CRAN (R 4.2.0)
#>  codetools         0.2-18    2020-11-04 [2] CRAN (R 4.2.0)
#>  colorRamps        2.3.1     2022-05-02 [2] CRAN (R 4.2.0)
#>  colorspace        2.0-3     2022-02-21 [2] CRAN (R 4.2.0)
#>  concaveman        1.1.0     2020-05-11 [2] CRAN (R 4.2.0)
#>  cowplot           1.1.1     2020-12-30 [2] CRAN (R 4.2.0)
#>  crayon            1.5.1     2022-03-26 [2] CRAN (R 4.2.0)
#>  dashboardthemes   1.1.5     2021-08-21 [2] CRAN (R 4.2.0)
#>  data.table        1.14.2    2021-09-27 [2] CRAN (R 4.2.0)
#>  DBI               1.1.2     2021-12-20 [2] CRAN (R 4.2.0)
#>  deldir            1.0-6     2021-10-23 [2] CRAN (R 4.2.0)
#>  digest            0.6.29    2021-12-01 [2] CRAN (R 4.2.0)
#>  doParallel        1.0.17    2022-02-07 [2] CRAN (R 4.2.0)
#>  dplyr             1.0.9     2022-04-28 [2] CRAN (R 4.2.0)
#>  e1071             1.7-9     2021-09-16 [2] CRAN (R 4.2.0)
#>  ellipsis          0.3.2     2021-04-29 [2] CRAN (R 4.2.0)
#>  evaluate          0.15      2022-02-18 [2] CRAN (R 4.2.0)
#>  fansi             1.0.3     2022-03-24 [2] CRAN (R 4.2.0)
#>  farver            2.1.0     2021-02-28 [2] CRAN (R 4.2.0)
#>  fastmap           1.1.0     2021-01-25 [2] CRAN (R 4.2.0)
#>  FNN               1.1.3.1   2022-05-23 [2] CRAN (R 4.2.0)
#>  foreach           1.5.2     2022-02-02 [2] CRAN (R 4.2.0)
#>  generics          0.1.2     2022-01-31 [2] CRAN (R 4.2.0)
#>  ggplot2           3.4.0     2022-11-04 [2] CRAN (R 4.2.0)
#>  ggpubr            0.4.0     2020-06-27 [2] CRAN (R 4.2.0)
#>  ggsignif          0.6.3     2021-09-09 [2] CRAN (R 4.2.0)
#>  glue              1.6.2     2022-02-24 [2] CRAN (R 4.2.0)
#>  gridExtra         2.3       2017-09-09 [2] CRAN (R 4.2.0)
#>  gtable            0.3.0     2019-03-25 [2] CRAN (R 4.2.0)
#>  highr             0.9       2021-04-16 [2] CRAN (R 4.2.0)
#>  htmltools         0.5.2     2021-08-25 [2] CRAN (R 4.2.0)
#>  htmlwidgets       1.5.4     2021-09-08 [2] CRAN (R 4.2.0)
#>  httpuv            1.6.5     2022-01-05 [2] CRAN (R 4.2.0)
#>  httr              1.4.3     2022-05-04 [2] CRAN (R 4.2.0)
#>  iterators         1.0.14    2022-02-05 [2] CRAN (R 4.2.0)
#>  jquerylib         0.1.4     2021-04-26 [2] CRAN (R 4.2.0)
#>  jsonlite          1.8.0     2022-02-22 [2] CRAN (R 4.2.0)
#>  KernSmooth        2.23-20   2021-05-03 [2] CRAN (R 4.2.0)
#>  knitr             1.39      2022-04-26 [2] CRAN (R 4.2.0)
#>  labeling          0.4.2     2020-10-20 [2] CRAN (R 4.2.0)
#>  later             1.3.0     2021-08-18 [2] CRAN (R 4.2.0)
#>  lattice           0.20-45   2021-09-22 [2] CRAN (R 4.2.0)
#>  lazyeval          0.2.2     2019-03-15 [2] CRAN (R 4.2.0)
#>  lifecycle         1.0.3     2022-10-07 [2] CRAN (R 4.2.0)
#>  lubridate         1.8.0     2021-10-07 [2] CRAN (R 4.2.0)
#>  magick            2.7.3     2021-08-18 [2] CRAN (R 4.2.0)
#>  magrittr          2.0.3     2022-03-30 [2] CRAN (R 4.2.0)
#>  MASS              7.3-57    2022-04-22 [2] CRAN (R 4.2.0)
#>  Matrix            1.4-1     2022-03-23 [2] CRAN (R 4.2.0)
#>  matrixStats       0.62.0    2022-04-19 [2] CRAN (R 4.2.0)
#>  mime              0.12      2021-09-28 [2] CRAN (R 4.2.0)
#>  Morpho            2.9       2021-09-09 [2] CRAN (R 4.2.0)
#>  munsell           0.5.0     2018-06-12 [2] CRAN (R 4.2.0)
#>  pdist             1.2.1     2022-05-02 [2] CRAN (R 4.2.0)
#>  pillar            1.7.0     2022-02-01 [2] CRAN (R 4.2.0)
#>  pkgconfig         2.0.3     2019-09-22 [2] CRAN (R 4.2.0)
#>  plotly            4.10.0    2021-10-09 [2] CRAN (R 4.2.0)
#>  plyr              1.8.7     2022-03-24 [2] CRAN (R 4.2.0)
#>  polyclip          1.10-0    2019-03-14 [2] CRAN (R 4.2.0)
#>  promises          1.2.0.1   2021-02-11 [2] CRAN (R 4.2.0)
#>  proxy             0.4-26    2021-06-07 [2] CRAN (R 4.2.0)
#>  purrr             0.3.4     2020-04-17 [2] CRAN (R 4.2.0)
#>  R6                2.5.1     2021-08-19 [2] CRAN (R 4.2.0)
#>  RColorBrewer      1.1-3     2022-04-03 [2] CRAN (R 4.2.0)
#>  Rcpp              1.0.8.3   2022-03-17 [2] CRAN (R 4.2.0)
#>  RefManageR      * 1.3.0     2020-11-13 [2] CRAN (R 4.2.0)
#>  rgl               0.108.3.2 2022-05-16 [2] CRAN (R 4.2.0)
#>  rlang             1.0.6     2022-09-24 [2] CRAN (R 4.2.0)
#>  rmarkdown         2.14      2022-04-25 [2] CRAN (R 4.2.0)
#>  rstatix           0.7.0     2021-02-13 [2] CRAN (R 4.2.0)
#>  Rvcg              0.21      2022-03-18 [2] CRAN (R 4.2.0)
#>  S4Vectors         0.34.0    2022-04-26 [2] Bioconductor
#>  sass              0.4.1     2022-03-23 [2] CRAN (R 4.2.0)
#>  scales            1.2.0     2022-04-13 [2] CRAN (R 4.2.0)
#>  sessioninfo     * 1.2.2     2021-12-06 [2] CRAN (R 4.2.0)
#>  sf                1.0-7     2022-03-07 [2] CRAN (R 4.2.0)
#>  shiny             1.7.1     2021-10-02 [2] CRAN (R 4.2.0)
#>  shinyBS           0.61.1    2022-04-17 [2] CRAN (R 4.2.0)
#>  shinydashboard    0.7.2     2021-09-30 [2] CRAN (R 4.2.0)
#>  sp                1.4-7     2022-04-20 [2] CRAN (R 4.2.0)
#>  spatstat.data     2.2-0     2022-04-18 [2] CRAN (R 4.2.0)
#>  spatstat.geom     2.4-0     2022-03-29 [2] CRAN (R 4.2.0)
#>  spatstat.random   2.2-0     2022-03-30 [2] CRAN (R 4.2.0)
#>  spatstat.utils    2.3-1     2022-05-06 [2] CRAN (R 4.2.0)
#>  SRTsim          * 0.99.6    2023-01-02 [1] local
#>  stringi           1.7.6     2021-11-29 [2] CRAN (R 4.2.0)
#>  stringr           1.4.0     2019-02-10 [2] CRAN (R 4.2.0)
#>  tibble            3.1.7     2022-05-03 [2] CRAN (R 4.2.0)
#>  tidyr             1.2.0     2022-02-01 [2] CRAN (R 4.2.0)
#>  tidyselect        1.1.2     2022-02-21 [2] CRAN (R 4.2.0)
#>  units             0.8-0     2022-02-05 [2] CRAN (R 4.2.0)
#>  utf8              1.2.2     2021-07-24 [2] CRAN (R 4.2.0)
#>  vctrs             0.5.0     2022-10-22 [2] CRAN (R 4.2.0)
#>  viridis           0.6.2     2021-10-13 [2] CRAN (R 4.2.0)
#>  viridisLite       0.4.0     2021-04-13 [2] CRAN (R 4.2.0)
#>  withr             2.5.0     2022-03-03 [2] CRAN (R 4.2.0)
#>  xfun              0.31      2022-05-10 [2] CRAN (R 4.2.0)
#>  xml2              1.3.3     2021-11-30 [2] CRAN (R 4.2.0)
#>  xtable            1.8-4     2019-04-21 [2] CRAN (R 4.2.0)
#>  yaml              2.3.5     2022-02-21 [2] CRAN (R 4.2.0)
#> 
#>  [1] /private/var/folders/my/31z8ld9s2qd53tvdmv_3w51h0000gn/T/RtmpSS0fzn/Rinst55333efbd326
#>  [2] /Library/Frameworks/R.framework/Versions/4.2/Resources/library
#> 
#> ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────

6 Bibliography

This vignette was generated using BiocStyle (Oleś, 2022), knitr (Xie, 2014) and rmarkdown (Allaire, Xie, McPherson, et al., 2022) running behind the scenes.

Citations made with RefManageR (McLean, 2017).

[1] J. Allaire, Y. Xie, J. McPherson, et al. rmarkdown: Dynamic Documents for R. R package version 2.14. 2022. URL: https://github.com/rstudio/rmarkdown.

[2] D. Bates, M. Maechler, and M. Jagan. Matrix: Sparse and Dense Matrix Classes and Methods. R package version 1.4-1. 2022. URL: https://CRAN.R-project.org/package=Matrix.

[3] M. W. McLean. “RefManageR: Import and Manage BibTeX and BibLaTeX References in R”. In: The Journal of Open Source Software (2017). DOI: 10.21105/joss.00338.

[4] A. Oleś. BiocStyle: Standard styles for vignettes and other Bioconductor documents. R package version 2.24.0. 2022. URL: https://github.com/Bioconductor/BiocStyle.

[5] H. Pagès, M. Lawrence, and P. Aboyoun. S4Vectors: Foundation of vector-like and list-like containers in Bioconductor. R package version 0.34.0. 2022. URL: https://bioconductor.org/packages/S4Vectors.

[6] R Core Team. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing. Vienna, Austria, 2022. URL: https://www.R-project.org/.

[7] H. Wickham, W. Chang, R. Flight, et al. sessioninfo: R Session Information. R package version 1.2.2. 2021. URL: https://CRAN.R-project.org/package=sessioninfo.

[8] Y. Xie. “knitr: A Comprehensive Tool for Reproducible Research in R”. In: Implementing Reproducible Computational Research. Ed. by V. Stodden, F. Leisch and R. D. Peng. Chapman and Hall/CRC, 2014. ISBN: 978-1466561595. URL: https://www.routledge.com/Implementing-Reproducible-Research/Stodden-Leisch-Peng/p/book/9781466561595.