This vignette addresses the usage of the functions involved in statistical inference and power analysis for the direct and spillover effects in two-stage randomized experiments motivated by the JD data set.

In 2007, the ministry in charge of employment in France launched a public employment integration service contract for young graduates seeking employment. A randomized experiment of this job placement assistance program was conducted and the methods in this package can be used to analyze the data. The following examples focus on two specific outcomes: fixed-term contract of six months or more (LTFC) and permanent contract (PC).

The data set is a subset of the original JD data set and includes the following variables:

`anonale`

: local employment agency

`tempsc_av`

: full-time work (at time of assignment)

`assigned`

: 1 if the individual is assigned to treatment,
0 otherwise

`pct0`

: share of the local population treated

`cdi`

: binary variable for whether the individual works on
a permanent contract, 8 months after the assignment

`cdd6m`

: binary variable for whether the individual works
on a CDD (fixed-term contract) of more than 6 months, 8 months after the
assignment

`emploidur`

: binary variable for whether the individual
works on a permanent or fixed-term contract of more than 6 months, 8
months after the assignment

`tempsc`

: binary variable for whether the individual works
full time, 8 months after the assignment

`salaire`

: individual's salary in euros.

The relevant functions for this analysis are the following:

`ZSRE`

: returns a list of `Z`, the vector of the desired binary treatment assignment variable.

`YSRE`

: returns a list of `Y`, the vector of the outcomes for a desired variable of interest.

`CalAPO`

: returns a list of point estimates and variances for the average potential outcomes, unit level direct effect, marginal direct effect, and unit level spillover effect.

`Test2SRE`

: returns the rejection region for the desired test. This function takes in the data and the effect type (i.e. direct effect, marginal direct effect, or spillover effect) and outputs the rejection region at the desired significance level.

`calpara`

: returns a list of the estimated within-cluster variance, between-cluster variance, intra-class correlation coefficient, and average of the assignment vector, which are needed by `Calsamplesize`.

`Calsamplesize`

: returns a list of the necessary total number of clusters in order to achieve a given power level at a given significance level for the three types of effects.

First, load the RCT2 package and the relevant data set.

```
library(RCT2)
data(jd)
```

To calculate point estimates and variances for an effect of interest,
run the `CalAPO` command. It is first necessary to create the vector of
assignment mechanisms, `A`, which depends on the study design. In this
experiment, there are three treatment assignment mechanisms, with
treated probabilities of 25%, 50%, and 75% respectively.
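To make the two-stage design concrete, the following is a minimal sketch in base R on simulated toy data, not the JD data. The cluster count, cluster size, the integer coding of `A` as 1 to 3, and all object names are assumptions made for illustration only. Clusters are first randomized to one of the three mechanisms, and then a fixed share of individuals is treated within each cluster.

```r
set.seed(1)  # arbitrary seed so the toy example is reproducible

n_clusters   <- 9                      # hypothetical number of clusters
cluster_size <- 20                     # hypothetical (equal) cluster size
treated_prob <- c(0.25, 0.50, 0.75)    # treated shares of the three mechanisms

# Stage 1: randomize clusters to mechanisms 1, 2, 3 in equal numbers.
A_cluster <- sample(rep(1:3, length.out = n_clusters))

# Stage 2: within each cluster, treat the share implied by its mechanism.
toy <- do.call(rbind, lapply(seq_len(n_clusters), function(j) {
  n_treated <- round(treated_prob[A_cluster[j]] * cluster_size)
  Z <- sample(rep(c(1, 0), times = c(n_treated, cluster_size - n_treated)))
  data.frame(Z = Z, A = A_cluster[j], id = j)
}))

# Realized treated share per cluster, next to its assigned mechanism.
print(aggregate(Z ~ id + A, data = toy, FUN = mean))
```

Each cluster's realized treated share equals the probability attached to its mechanism, which is the structure the `Z`, `A`, and `id` columns encode in the analysis.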

Then, run the `CalAPO` command, which takes in the vector of treatment
assignments, the assignment mechanism vector, and the vector of outcomes
for the variable of interest, which is `Y.LTFC` in this case. In the
code below, these are passed as the columns `Z`, `A`, and `Y` of a data
frame, together with the cluster identifier `id`. We see that the
estimated average potential outcome for long-term fixed contracts is
given by `Y.hat`. As stated in the paper, we also have the results for
the estimated direct effects under the three treatment mechanisms
(`ADE.est`), the estimated marginal direct effect (`MDE.est`), and the
estimated spillover effects (`ASE.est`). We also have the estimated
covariance matrices for the average potential outcomes, the estimated
direct effects, the estimated marginal direct effect, and the estimated
spillover effects.

```
# build the analysis data frame for the LTFC outcome
data_LTFC <- data.frame(jd$assigned, jd$pct0, jd$cdd6m, jd$anonale)
colnames(data_LTFC) <- c("Z", "A", "Y", "id")
# estimate potential outcomes and effects, then print them
test <- CalAPO(data_LTFC)
print(test)
```

```
## [[1]]
## Potential Outcome Estimates
## treated group 1 estimate 0.2109006
## control group 1 estimate 0.1953872
## treated group 2 estimate 0.2071030
## control group 2 estimate 0.2027447
## treated group 3 estimate 0.2018187
## control group 3 estimate 0.2243082
##
## $Y.covariance
## [,1] [,2] [,3] [,4] [,5]
## [1,] 9.352489e-05 -1.196691e-05 0.000000e+00 0.000000e+00 0.000000e+00
## [2,] -1.196691e-05 1.034387e-04 0.000000e+00 0.000000e+00 0.000000e+00
## [3,] 0.000000e+00 0.000000e+00 1.147296e-04 2.025355e-05 0.000000e+00
## [4,] 0.000000e+00 0.000000e+00 2.025355e-05 7.940618e-05 0.000000e+00
## [5,] 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 9.680927e-05
## [6,] 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 -3.197198e-05
## [,6]
## [1,] 0.000000e+00
## [2,] 0.000000e+00
## [3,] 0.000000e+00
## [4,] 0.000000e+00
## [5,] -3.197198e-05
## [6,] 2.276049e-04
##
## [[3]]
## Average Direct Effect
## assignment group 1 0.015513434
## assignment group 2 0.004358247
## assignment group 3 -0.022489545
##
## $ADE.covariance
## [,1] [,2] [,3]
## [1,] 0.0002208974 0.0000000000 0.0000000000
## [2,] 0.0000000000 0.0001536287 0.0000000000
## [3,] 0.0000000000 0.0000000000 0.0003883582
##
## [[5]]
## Average Spillover Effect
## treatment group under assignments 1 2 0.003797605
## treatment group under assignments 2 3 0.005284307
## control group under assignments 1 2 -0.007357582
## control group under assignments 2 3 -0.021563484
##
## $ASE.covariance
## [,1] [,2] [,3] [,4]
## [1,] 2.082545e-04 -1.147296e-04 8.286640e-06 -2.025355e-05
## [2,] -1.147296e-04 2.115389e-04 -2.025355e-05 -1.171843e-05
## [3,] 8.286640e-06 -2.025355e-05 1.828448e-04 -7.940618e-05
## [4,] -2.025355e-05 -1.171843e-05 -7.940618e-05 3.070111e-04
##
## [[7]]
## Marginal Direct Effect
## 1 -0.0008726215
##
## $MDE.covariance
## [,1]
## [1,] 8.476492e-05
```
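The printed effects can be recovered from the potential outcome estimates by simple contrasts. The sketch below copies the rounded numbers from the output above into base R and applies contrast matrices to them. The object names are hypothetical, and reading the marginal direct effect as the equally weighted mean of the three direct effects is an observation about these particular numbers, not a statement about how `CalAPO` computes it internally.

```r
# Estimates copied from the printed output above (rounded as printed).
# Order: treated, control under mechanism 1; then mechanism 2; then mechanism 3.
y_hat <- c(0.2109006, 0.1953872, 0.2071030, 0.2027447, 0.2018187, 0.2243082)

# Printed covariance matrix: one symmetric 2x2 block per mechanism, zeros elsewhere.
V <- matrix(0, 6, 6)
V[1:2, 1:2] <- c(9.352489e-05, -1.196691e-05, -1.196691e-05, 1.034387e-04)
V[3:4, 3:4] <- c(1.147296e-04,  2.025355e-05,  2.025355e-05, 7.940618e-05)
V[5:6, 5:6] <- c(9.680927e-05, -3.197198e-05, -3.197198e-05, 2.276049e-04)

# Direct effects: treated minus control within each mechanism.
C_ade <- rbind(c(1, -1, 0,  0, 0,  0),
               c(0,  0, 1, -1, 0,  0),
               c(0,  0, 0,  0, 1, -1))
ade     <- drop(C_ade %*% y_hat)
ade_cov <- C_ade %*% V %*% t(C_ade)

# Spillover effects: same treatment status, adjacent mechanisms.
C_ase <- rbind(c(1, 0, -1,  0,  0,  0),   # treated, mechanisms 1 vs 2
               c(0, 0,  1,  0, -1,  0),   # treated, mechanisms 2 vs 3
               c(0, 1,  0, -1,  0,  0),   # control, mechanisms 1 vs 2
               c(0, 0,  0,  1,  0, -1))   # control, mechanisms 2 vs 3
ase     <- drop(C_ase %*% y_hat)
ase_cov <- C_ase %*% V %*% t(C_ase)

# Assumed equal weights: the mean of the three direct effects.
w       <- rep(1 / 3, 3)
mde     <- sum(w * ade)
mde_var <- drop(t(w) %*% ade_cov %*% w)

print(ade); print(ase); print(mde); print(mde_var)
```

Up to rounding of the copied inputs, these contrasts agree with the direct effect, spillover effect, and marginal direct effect entries and their covariances in the output above.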

Similarly, we can run this on the permanent contracts.

```
# build the analysis data frame for the permanent contract outcome
data_perm <- data.frame(jd$assigned, jd$pct0, jd$cdi, jd$anonale)
colnames(data_perm) <- c("Z", "A", "Y", "id")
CalAPO(data_perm)
```

We can also perform hypothesis tests on this data by using the
`Test2SRE` function. The `Test2SRE` function takes in `Z`, `A`, and `Y`
as before, and also takes an extra argument, `effect`, which specifies
the desired effect (`"ADE"` for the direct effect, `"MDE"` for the
marginal direct effect, or `"ASE"` for the spillover effect). The
function returns `TRUE` if the hypothesis should be rejected, and
`FALSE` otherwise. The default significance level is 0.05, but it may be
changed through the `alpha` argument.

`Test2SRE(data_LTFC, effect="MDE", alpha=0.05)`

`## [1] FALSE`
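One way to read this result is through a two-sided Wald-type z-test built from the marginal direct effect estimate and variance printed earlier for the LTFC outcome. The sketch below is an illustrative assumption: the source does not confirm that `Test2SRE` uses this exact procedure, and the variable names are hypothetical.

```r
# Values copied from the printed CalAPO output for the LTFC outcome.
mde_est <- -0.0008726215
mde_var <- 8.476492e-05
alpha   <- 0.05

# Assumed test (not confirmed as the internal procedure of Test2SRE):
# reject H0: MDE = 0 when |estimate / standard error| exceeds the
# standard normal critical value.
z_stat <- mde_est / sqrt(mde_var)
crit   <- qnorm(1 - alpha / 2)
reject <- abs(z_stat) > crit

print(z_stat)
print(reject)
```

The estimate is less than a tenth of a standard error from zero, so under this reading the null hypothesis of no marginal direct effect is not rejected at the 5% level.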

Lastly, we can calculate the sample size needed to achieve a given power
at a given significance level. First, we call the `calpara` function to
calculate the parameters required for the sample size calculation,
including the within-cluster and between-cluster variances and the
intra-class correlation coefficient. The effect size and the assignment
mechanism also need to be specified based on the study design. In this
case, `mu` is the effect size and `qa` is the vector of probabilities of
being assigned to each of the three assignment mechanisms.

The `calpara` calls for the two outcomes are as follows:

```
# calculate variances for permanent contract
var.perm <- calpara(data_perm)
# calculate variances for long term fixed contract
var.LTFC <- calpara(data_LTFC)
```
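For intuition about the quantities involved, the following sketch computes within-cluster and between-cluster variance components and the intra-class correlation coefficient on simulated toy data, using the textbook one-way ANOVA estimator for equal cluster sizes. This is an assumption made for illustration: the estimator inside `calpara` may differ, and all names and simulation settings here are hypothetical.

```r
set.seed(2)  # arbitrary seed so the toy example is reproducible

n_clusters <- 30   # hypothetical number of clusters
n          <- 10   # hypothetical (equal) cluster size
id <- rep(seq_len(n_clusters), each = n)

# Toy outcome: a shared cluster effect plus individual noise.
cluster_effect <- rnorm(n_clusters, sd = 0.5)
y <- cluster_effect[id] + rnorm(n_clusters * n, sd = 1)

# One-way ANOVA mean squares for a balanced design.
msb <- n * var(tapply(y, id, mean))   # between-cluster mean square
msw <- mean(tapply(y, id, var))       # within-cluster mean square

sigma_w <- msw                        # within-cluster variance
sigma_b <- max((msb - msw) / n, 0)    # between-cluster variance, truncated at 0
icc     <- sigma_b / (sigma_b + sigma_w)

print(c(within = sigma_w, between = sigma_b, icc = icc))
```

The intra-class correlation is the share of total variance that is attributable to differences between clusters, so it lies between 0 and 1.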

The elements of the output of `calpara` can be accessed as below. For
example, to retrieve the total variance of the potential outcomes for
the permanent contracts and long-term fixed contracts, the following
code can be run:

```
sigma.perm <- var.perm$sigma.tot
sigma.LTFC <- var.LTFC$sigma.tot
print(sigma.perm)
```

`## [1] 0.1951648`

Then, we specify the effect size and use the `Calsamplesize` function to
calculate the appropriate sample sizes for the permanent contract and
the LTFC. The default `alpha` (significance level) and `beta` (type II
error rate, which corresponds to a power of 0.8) are set at 0.05 and 0.2
respectively.

```
### effect size and assignment mechanism
mu <- 0.03
qa <- rep(1/3,3)
# calculate sample size for the long term fixed contract
print("Long Term Fixed Contract:")
```

`## [1] "Long Term Fixed Contract:"`

`print(Calsamplesize(data_LTFC, 0.03, qa, 0.05, 0.2))`

```
## [,1] [,2] [,3]
## Assignment Mechanism 1.0000 2.00000 3.0000
## Number of Clusters 428.4264 96.59406 511.5405
```

```
# calculate sample size for the permanent contract
print("Permanent Contract:")
```

`## [1] "Permanent Contract:"`

`print(Calsamplesize(data_perm, 0.03, qa, alpha=0.05, beta=0.2))`

```
## [,1] [,2] [,3]
## Assignment Mechanism 1.0000 2.0000 3.0000
## Number of Clusters 515.6595 116.4777 614.2199
```

From the results, we can see the necessary total number of clusters
for each assignment mechanism with size `n.avg` needed to detect a
specific alternative at a certain power and significance level.
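Because the reported numbers of clusters are fractional, a study would in practice need the next whole number of clusters. The sketch below rounds up the values copied from the two printed outputs. Rounding up is a convention added here rather than something the package output states, and the object names are hypothetical. Rows are named after the data frame passed to `Calsamplesize`.

```r
# "Number of Clusters" rows copied from the two printed outputs above.
clusters_LTFC <- c(428.4264, 96.59406, 511.5405)   # from the data_LTFC call
clusters_perm <- c(515.6595, 116.4777, 614.2199)   # from the data_perm call

# Round up, since a fractional cluster cannot be recruited.
needed <- rbind(data_LTFC = ceiling(clusters_LTFC),
                data_perm = ceiling(clusters_perm))
colnames(needed) <- paste("column", 1:3)   # same column order as the printed output

print(needed)
```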