BLE_SSRS

library(BayesSampling)

Application of the BLE to the Stratified Simple Random Sample design

(From Section 2.3.2 of Gonçalves, Moura and Migon, “Bayes linear estimation for finite population with emphasis on categorical data”)

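In a simple model with no auxiliary variable, where a Stratified Simple Random Sample was taken from the population, we can calculate the Bayes Linear Estimator for the individuals of each stratum with the BLE_SSRS() function, which receives the following parameters:

  * ys - vector of sample observations, or the sample mean of each stratum (in which case sigma is required);
  * h - vector with the number of sampled observations in each stratum;
  * N - vector with the total size of each stratum;
  * m - vector with the prior mean of each stratum;
  * v - vector with the prior variance of an element from each stratum (larger than sigma^2 for that stratum);
  * sigma - vector with the prior estimate of the standard deviation within each stratum.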

Examples

  1. We will use the BigCity dataset from TeachingSampling for this example (we first draw a subsample of size \(10000\) from it so that R can perform the calculations). Suppose we want to estimate the mean or the total Expenditure of this population, knowing that mean expenditure differs between rural and urban individuals. After taking a stratified simple random sample of 30 individuals from each area, we want to estimate the true expenditure means by combining the sample information with an expert's prior expectation (the prior mean is \(280\) for the rural area and \(420\) for the urban area).
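data(BigCity)
# Row indices of the full dataset
end <- dim(BigCity)[1]
s <- seq(from = 1, to = end, by = 1)

# Draw a reproducible subsample of 10000 rows to act as the population
set.seed(3)
samp <- sample(s, size = 10000, replace = FALSE)
ordered_samp <- sort(samp)
BigCity_red <- BigCity[ordered_samp,]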

Rural <- BigCity_red[which(BigCity_red$Zone == "Rural"),]
Rural_Exp <- Rural$Expenditure
length(Rural_Exp)
#> [1] 4757

Rural_ys <- sample(Rural_Exp, size = 30, replace = FALSE)

Urban <- BigCity_red[which(BigCity_red$Zone == "Urban"),]
Urban_Exp <- Urban$Expenditure
length(Urban_Exp)
#> [1] 5243

Urban_ys <- sample(Urban_Exp, size = 30, replace = FALSE)

The true expenditure means are the values we want to estimate. In this example we know them:

mean(Rural_Exp)
#> [1] 291.978
mean(Urban_Exp)
#> [1] 449.0023

Our design-based estimator for the mean will be the sample mean of each stratum:

mean(Rural_ys)
#> [1] 302.5523
mean(Urban_ys)
#> [1] 477.8243
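
If a single population-level figure is needed, the stratum sample means can be combined using the stratum sizes as weights, which is the usual stratified design-based estimator. The following is a minimal base-R sketch that uses the rounded values printed above; the variable names (N_h, ybar_h, total_hat, mean_hat) are illustrative and not part of BayesSampling.

```r
# Stratum sizes and sample means, copied from the output above (rounded)
N_h    <- c(Rural = 4757, Urban = 5243)
ybar_h <- c(Rural = 302.5523, Urban = 477.8243)

# Stratified design-based estimates of the population total and mean:
# each stratum mean is weighted by the size of its stratum
total_hat <- sum(N_h * ybar_h)
mean_hat  <- total_hat / sum(N_h)

mean_hat
```

Because it is built only from the two sample means, this combined estimate inherits their sampling error.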

Applying the prior information about the population, we can get a better estimate, especially when only a small sample is available:

ys <- c(Rural_ys, Urban_ys)
h <- c(30,30)
N <- c(length(Rural_Exp), length(Urban_Exp))
m <- c(280, 420)
v <- c(4*(10.1^4), 10.1^5)
sigma <- c(sqrt(4*10^4), sqrt(10^5))

Estimator <- BLE_SSRS(ys, h, N, m, v, sigma)

Our Bayes Linear Estimator for the mean expenditure of each stratum:

Estimator$est.beta
#>       Beta
#> 1 292.3850
#> 2 454.9716
Estimator$Vest.beta
#>         V1       V2
#> 1 732.2238    0.000
#> 2   0.0000 2015.967
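
These estimates can be checked by hand. The sketch below assumes that the estimator for each stratum is a weighted average of the prior mean and the sample mean, with v - sigma^2 acting as the prior variance of the stratum mean. This reading is inferred from the printed output rather than taken from the package source. The sample means are the rounded values shown earlier, and the helper names (prior_var, samp_var, w) are illustrative.

```r
ybar  <- c(302.5523, 477.8243)           # sample means printed above (rounded)
n     <- c(30, 30)                       # sample size per stratum
m     <- c(280, 420)                     # prior means
v     <- c(4 * 10.1^4, 10.1^5)           # prior variance of an element
sigma <- c(sqrt(4 * 10^4), sqrt(10^5))   # prior within-stratum sd

prior_var <- v - sigma^2                 # assumed prior variance of each stratum mean
samp_var  <- sigma^2 / n                 # variance of each sample mean
w         <- prior_var / (prior_var + samp_var)   # weight given to the sample mean

est  <- w * ybar + (1 - w) * m           # compare with Estimator$est.beta
Vest <- w * samp_var                     # compare with diag(Estimator$Vest.beta)

round(est, 3)
round(Vest, 2)
```

Under this reading the sample mean receives a weight of roughly 0.55 in the rural stratum and 0.60 in the urban one, so each estimate lies between the prior mean and the sample mean.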
  2. Example from the help page
ys <- c(2,-1,1.5, 6,10, 8,8)
h <- c(3,2,2)
N <- c(5,5,3)
m <- c(0,9,8)
v <- c(3,8,1)
sigma <- c(1,2,0.5)

Estimator <- BLE_SSRS(ys, h, N, m, v, sigma)
Estimator
#> $est.beta
#>        Beta
#> 1 0.7142857
#> 2 8.3333333
#> 3 8.0000000
#> 
#> $Vest.beta
#>          V1       V2        V3
#> 1 0.2857143 0.000000 0.0000000
#> 2 0.0000000 1.333333 0.0000000
#> 3 0.0000000 0.000000 0.1071429
#> 
#> $est.mean
#>      y_nots
#> 1 0.7142857
#> 2 0.7142857
#> 3 8.3333333
#> 4 8.3333333
#> 5 8.3333333
#> 6 8.0000000
#> 
#> $Vest.mean
#>          V1        V2       V3       V4       V5        V6
#> 1 1.2857143 0.2857143 0.000000 0.000000 0.000000 0.0000000
#> 2 0.2857143 1.2857143 0.000000 0.000000 0.000000 0.0000000
#> 3 0.0000000 0.0000000 5.333333 1.333333 1.333333 0.0000000
#> 4 0.0000000 0.0000000 1.333333 5.333333 1.333333 0.0000000
#> 5 0.0000000 0.0000000 1.333333 1.333333 5.333333 0.0000000
#> 6 0.0000000 0.0000000 0.000000 0.000000 0.000000 0.3571429
#> 
#> $est.tot
#> [1] 68.92857
#> 
#> $Vest.tot
#> [1] 27.5
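
The total and its variance can be traced back to the per-stratum quantities. The sketch below is a base-R check under assumptions inferred from the output above, not from the package source: each unsampled unit is predicted by its stratum estimate, the observed values enter the total unchanged, and Vest.tot is the sum of all entries of Vest.mean. All helper names are illustrative.

```r
ys    <- c(2, -1, 1.5, 6, 10, 8, 8)
h     <- c(3, 2, 2)
N     <- c(5, 5, 3)
m     <- c(0, 9, 8)
v     <- c(3, 8, 1)
sigma <- c(1, 2, 0.5)

# Sample mean of each stratum (observations are ordered by stratum)
stratum <- rep(seq_along(h), times = h)
ybar    <- as.numeric(tapply(ys, stratum, mean))

# Per-stratum estimate and its variance, assuming v - sigma^2 is the
# prior variance of the stratum mean
prior_var <- v - sigma^2
samp_var  <- sigma^2 / h
beta_hat  <- (prior_var * ybar + samp_var * m) / (prior_var + samp_var)
V_beta    <- prior_var * samp_var / (prior_var + samp_var)

# Total: observed values plus predictions for the N - h unsampled units
n_out   <- N - h
est_tot <- sum(ys) + sum(n_out * beta_hat)

# Each unsampled unit has variance V_beta + sigma^2, and two units in the
# same stratum share covariance V_beta
Vest_tot <- sum(n_out^2 * V_beta + n_out * sigma^2)

c(est_tot = est_tot, Vest_tot = Vest_tot)
```

The values of beta_hat, est_tot and Vest_tot can be compared with $est.beta, $est.tot and $Vest.tot in the output above.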
  3. Example from the help page, but supplying sample means instead of sample observations
y1 <- mean(c(2,-1,1.5))
y2 <- mean(c(6,10))
y3 <- mean(c(8,8))
ys <- c(y1, y2, y3)
h <- c(3,2,2)
N <- c(5,5,3)
m <- c(0,9,8)
v <- c(3,8,1)
sigma <- c(1,2,0.5)

Estimator <- BLE_SSRS(ys, h, N, m, v, sigma)
#> sample means informed instead of sample observations, parameter 'sigma' will be necessary
Estimator
#> $est.beta
#>        Beta
#> 1 0.7142857
#> 2 8.3333333
#> 3 8.0000000
#> 
#> $Vest.beta
#>          V1       V2        V3
#> 1 0.2857143 0.000000 0.0000000
#> 2 0.0000000 1.333333 0.0000000
#> 3 0.0000000 0.000000 0.1071429
#> 
#> $est.mean
#>      y_nots
#> 1 0.7142857
#> 2 0.7142857
#> 3 8.3333333
#> 4 8.3333333
#> 5 8.3333333
#> 6 8.0000000
#> 
#> $Vest.mean
#>          V1        V2       V3       V4       V5        V6
#> 1 1.2857143 0.2857143 0.000000 0.000000 0.000000 0.0000000
#> 2 0.2857143 1.2857143 0.000000 0.000000 0.000000 0.0000000
#> 3 0.0000000 0.0000000 5.333333 1.333333 1.333333 0.0000000
#> 4 0.0000000 0.0000000 1.333333 5.333333 1.333333 0.0000000
#> 5 0.0000000 0.0000000 1.333333 1.333333 5.333333 0.0000000
#> 6 0.0000000 0.0000000 0.000000 0.000000 0.000000 0.3571429
#> 
#> $est.tot
#> [1] 68.92857
#> 
#> $Vest.tot
#> [1] 27.5