mtb: Summary from Multiple Tables

Y. Hsu

2022-10-20

library(mtb)

Background

Assume that for each month, items purchased in each grocery store visit are recorded in a table. At the end of a year, we may want to generate a summary table that shows how many times each item being purchased over the year and also list some summary statistics.

How to use

This is a basic example which shows you how to summarize item frequency from multiple tables.

library(mtb)
head(exdt[[1]])
#>    id     name category1 category2 store quantity unitprice
#> 1:  7  spinach vegetable     fresh     1        1 0.9810773
#> 2:  5     pear     fruit     fresh     1        4 0.8543127
#> 3:  5     pear     fruit     fresh     1        2 1.1581097
#> 4:  9  pumpkin vegetable    canned     2        4 1.2131444
#> 5:  1      egg   protein     fresh     2        2 1.4947858
#> 6:  6 broccoli vegetable     fresh     2        1 0.9342286

This is a basic example which shows you how to create a cross-count table:


head(bill_cross_count(exdt[1:6], id='name', gp=c('category1'), type = 'count' ) )
#>    category1     name tbl_id:1 tbl_id:2 tbl_id:3 tbl_id:4 tbl_id:5 tbl_id:6
#> 1:     dairy     milk        3        4        2        7        5        5
#> 2:     fruit    apple        2        1        5        3        4        8
#> 3:     fruit   orange        1        4        4        3        3        2
#> 4:     fruit     pear        4        1        2        5        6        3
#> 5:   protein      egg        1        2        1        2        3        3
#> 6: vegetable broccoli        1        1        6       NA        6        4

This is a basic example which shows you how to create a cross-count table with conditions:


head(bill_cross_count(exdt[1:6], id='name', gp=c('category1'), type = 'cond', condstr='store==2' ) )
#>    category1     name tbl_id:1 tbl_id:2 tbl_id:3 tbl_id:4 tbl_id:5 tbl_id:6
#> 1:     dairy     milk        1        2        1        3        0        2
#> 2:     fruit    apple        0        1        4        1        3        4
#> 3:     fruit   orange        0        2        1        1        0        1
#> 4:     fruit     pear        1        1        0        1        4        2
#> 5:   protein      egg        1        1        1        1        1        2
#> 6: vegetable broccoli        1        1        3       NA        3        3

This is a basic example which shows you how to create a cross-count table with conditions and total:


head(bill_cross_count(exdt[1:6], id='name', gp=c('category1'), type = 'condwt', condstr='store==1' ) )
#>    category1     name tbl_id:1 tbl_id:2 tbl_id:3 tbl_id:4 tbl_id:5 tbl_id:6
#> 1:     dairy     milk     2(3)     2(4)     1(2)     4(7)     5(5)     3(5)
#> 2:     fruit    apple     2(2)     0(1)     1(5)     2(3)     1(4)     4(8)
#> 3:     fruit   orange     1(1)     2(4)     3(4)     2(3)     3(3)     1(2)
#> 4:     fruit     pear     3(4)     0(1)     2(2)     4(5)     2(6)     1(3)
#> 5:   protein      egg     0(1)     1(2)     0(1)     1(2)     2(3)     1(3)
#> 6: vegetable broccoli     0(1)     0(1)     3(6)     <NA>     3(6)     1(4)

This is a basic example which shows you how to cross-check differences in two table:


head(bill_cross_check(exdt[[1]], exdt[[2]], id=c('category1', 'name','store') ) )
#>    category1   name store tbl_id:1 tbl_id:2  same
#> 1:     dairy   milk     1        2        2  TRUE
#> 2:     dairy   milk     2        1        2 FALSE
#> 3:     fruit  apple     1        2       NA    NA
#> 4:     fruit  apple     2       NA        1    NA
#> 5:     fruit orange     1        1        2 FALSE
#> 6:     fruit orange     2       NA        2    NA