Usage guidance

library(OptimalGoldstandardDesigns)

Introduction

This package assumes that a hierarchical testing procedure is applied in the three-arm gold-standard non-inferiority design. The first test aims to establish assay sensitivity of the trial. It is a test of superiority of the experimental treatment (T) against the placebo treatment (P). If assay sensitivity is successfully established, the experimental treatment is then tested for non-inferiority against the control treatment (C). Individual observations are assumed to be normally distributed, where higher values correspond to better treatment effects. Testing is assumed to be done via Z test statistics.
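
The hierarchical procedure described above can be sketched in a few lines of base R. This is only an illustration and not the package's implementation: the helper name hierarchical_z_test, the known common standard deviation sigma, and the reading of Delta as the non-inferiority margin are assumptions made for this sketch.

```r
# Hypothetical helper (not part of OptimalGoldstandardDesigns).
# Assumes a known common standard deviation `sigma`; higher values are better.
hierarchical_z_test <- function(mean_T, mean_P, mean_C, n_T, n_P, n_C,
                                Delta, sigma = 1, alpha = 0.025) {
  crit <- qnorm(1 - alpha)  # one-sided critical value

  # Step 1: superiority of T over P (assay sensitivity).
  z_TP <- (mean_T - mean_P) / (sigma * sqrt(1 / n_T + 1 / n_P))
  # Step 2: non-inferiority of T versus C with (assumed) margin Delta.
  z_TC <- (mean_T - mean_C + Delta) / (sigma * sqrt(1 / n_T + 1 / n_C))

  assay_sensitivity <- z_TP > crit
  # Non-inferiority is only claimed once assay sensitivity is established.
  non_inferiority <- assay_sensitivity && (z_TC > crit)

  list(z_TP = z_TP, z_TC = z_TC,
       assay_sensitivity = assay_sensitivity,
       non_inferiority = non_inferiority)
}

# Illustrative observed means and arbitrary sample sizes.
res <- hierarchical_z_test(mean_T = 0.4, mean_P = 0, mean_C = 0.4,
                           n_T = 400, n_P = 120, n_C = 400, Delta = 0.2)
res$non_inferiority

# Same data, but T is no better than P: the second test is never reached.
res_fail <- hierarchical_z_test(mean_T = 0.4, mean_P = 0.4, mean_C = 0.4,
                                n_T = 400, n_P = 120, n_C = 400, Delta = 0.2)
res_fail$non_inferiority
```

The second call shows why the order matters: the non-inferiority statistic alone would exceed the critical value, but without assay sensitivity no non-inferiority claim is made.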

We highly recommend reading our open-access article (Meis et al., 2023), which explains the theoretical background of this package.

Some examples from the paper

To showcase the capabilities of this package, we reproduce some results from the paper below.

The results will not agree exactly with those in the paper, because the calculations in the paper used much tighter error tolerances and more function evaluations.

To get closer to the results from the paper, you can supply the following options, though this will significantly increase computation time:

  mvnorm_algorithm = mvtnorm::Miwa(
    # steps = 128,
    steps = 4097,
    checkCorr = FALSE,
    maxval = 1000),
  nloptr_opts = list(algorithm = "NLOPT_LN_SBPLX",
                     # xtol_abs = 1e-3,
                     # xtol_rel = 1e-2,
                     # maxeval = 2000,
                     xtol_abs = 1e-10,
                     xtol_rel = 1e-9,
                     maxeval = 2000,
                     print_level = 0)

You may also want to set

print_progress = TRUE

when running code interactively to see the progress of the optimization.

Design from Table 2

The designs in Table 2 of the paper are optimized to minimize the expected sample size under the alternative hypothesis.

Design 1, \(\beta = 0.2\)

This is (approximately) the first line in Table 2 from the paper:

tab1_D1 <- optimize_design_onestage(
  alpha = .025,
  beta = .2,
  alternative_TP = .4,
  alternative_TC = 0,
  Delta = .2,
  print_progress = FALSE
)
tab1_D1
#> Sample sizes (stage 1): T: 413, P: 125, C: 404
#> Efficacy boundaries (stage 1): Z_TP_e: 1.95996, Z_TC_e: 1.95996
#> Maximum overall sample size: 942
#> Placebo penalty at optimum (kappa * nP): 0.0
#> Objective function value: 942.0
#> Type I error for TP testing: 2.5%
#> Type I error for TC testing: 2.5%
#> Power: 80.2%
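
As a plausibility check, the power of this single-stage design can be approximated in base R. The sketch below is not the package's computation: the helper power_onestage is hypothetical, it assumes unit variance in every arm, and it uses the rounded sample sizes printed above. It conditions on the T-arm mean, which makes the two test decisions independent, and integrates numerically.

```r
# Hypothetical helper (not part of the package): power of the single-stage
# hierarchical test, P(Z_TP > crit and Z_TC > crit).
# Assumes unit variance in every arm. theta_TP and theta_TC are the true
# mean differences T - P and T - C; Delta is the non-inferiority margin.
power_onestage <- function(n_T, n_P, n_C, theta_TP, theta_TC, Delta,
                           alpha = 0.025) {
  crit <- qnorm(1 - alpha)
  se_TP <- sqrt(1 / n_T + 1 / n_P)
  se_TC <- sqrt(1 / n_T + 1 / n_C)
  # z is the standardized T-arm mean (true T mean set to 0 without loss of
  # generality). Given z, the P-arm and C-arm means are independent.
  integrand <- function(z) {
    x <- z / sqrt(n_T)
    pnorm(x - crit * se_TP, mean = -theta_TP, sd = 1 / sqrt(n_P)) *
      pnorm(x + Delta - crit * se_TC, mean = -theta_TC, sd = 1 / sqrt(n_C)) *
      dnorm(z)
  }
  integrate(integrand, lower = -8, upper = 8)$value
}

# Rounded sample sizes and design parameters from the output above.
power_D1 <- power_onestage(n_T = 413, n_P = 125, n_C = 404,
                           theta_TP = 0.4, theta_TC = 0, Delta = 0.2)
round(100 * power_D1, 1)
```

Under these assumptions the value should land close to the power reported in the printed output; small deviations are expected because the sample sizes are rounded.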

Design 2, \(\beta = 0.2\)

This is (approximately) the second line in Table 2 from the paper:

optimize_design_twostage(
  cP1 = tab1_D1$stagec[[1]]$P, # The allocation ratios are enforced to be
  cC1 = tab1_D1$stagec[[1]]$C, # the same as in the optimal single-stage design.
  cT2 = 1,          
  cP2 = tab1_D1$stagec[[1]]$P, 
  cC2 = tab1_D1$stagec[[1]]$C, 
  
  bTP1f = -Inf,     # These two boundary conditions enforce no futility stops.
  bTC1f = -Inf,
  
  beta = 0.2,
  alternative_TP = 0.4,
  alternative_TC = 0,
  Delta = 0.2,
  print_progress = FALSE
)
#> Sample sizes (stage 1): T: 224, P: 68, C: 219
#> Sample sizes (stage 2): T: 224, P: 68, C: 219
#> Efficacy boundaries (stage 1): Z_TP_e: 2.10510, Z_TC_e: 2.27093
#> Futility boundaries (stage 1): Z_TP_f: -Inf, Z_TC_f: -Inf
#> Efficacy boundaries (stage 2): Z_TP_e: 2.27188, Z_TC_e: 2.10568
#> Inverse normal combination test weights (TP): w1: 0.70711, w2: 0.70711
#> Inverse normal combination test weights (TC): w1: 0.70711, w2: 0.70711
#> Maximum overall sample size: 1022
#> Expected sample size (H1): 801.2
#> Expected sample size (H0): 1020.3
#> Expected placebo group sample size (H1): 82.8
#> Expected placebo group sample size (H0): 134.8
#> Objective function value: 801.2
#> (local) type I error for TP testing: 2.50%
#> (local) type I error for TC testing: 2.50%
#> Probability of futility stop (H1): 0.00%
#> Probability of futility stop (H0): 0.00%
#> Minimum conditional power: 0.00%
#> Power: 80.20%
#> Futility boundaries: nonbinding
#> Futility testing method: always both futility tests

Design 3, \(\beta = 0.2\)

This is (approximately) the third line in Table 2 from the paper:

optimize_design_twostage(
  bTP1f = -Inf,     # These two boundary conditions enforce no futility stops.
  bTC1f = -Inf,
  
  beta = 0.2,
  alternative_TP = 0.4,
  alternative_TC = 0,
  Delta = 0.2,
  print_progress = FALSE
)
#> Sample sizes (stage 1): T: 230, P: 90, C: 224
#> Sample sizes (stage 2): T: 202, P: 106, C: 191
#> Efficacy boundaries (stage 1): Z_TP_e: 2.04997, Z_TC_e: 2.27978
#> Futility boundaries (stage 1): Z_TP_f: -Inf, Z_TC_f: -Inf
#> Efficacy boundaries (stage 2): Z_TP_e: 2.39960, Z_TC_e: 2.09141
#> Inverse normal combination test weights (TP): w1: 0.69161, w2: 0.72227
#> Inverse normal combination test weights (TC): w1: 0.73218, w2: 0.68111
#> Maximum overall sample size: 1043
#> Expected sample size (H1): 787.2
#> Expected sample size (H0): 1040.3
#> Expected placebo group sample size (H1): 103.1
#> Expected placebo group sample size (H0): 193.9
#> Objective function value: 787.2
#> (local) type I error for TP testing: 2.50%
#> (local) type I error for TC testing: 2.50%
#> Probability of futility stop (H1): 0.00%
#> Probability of futility stop (H0): 0.00%
#> Minimum conditional power: 0.00%
#> Power: 80.06%
#> Futility boundaries: nonbinding
#> Futility testing method: always both futility tests
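
The inverse normal combination test combines the stage-wise statistics as w1 * Z1 + w2 * Z2 with w1^2 + w2^2 = 1. The printed weights are close to the square roots of the stage-wise information fractions. That relationship is an assumption of the following sketch, not something stated in this text, and the helper names are hypothetical.

```r
# Assumption: with unit variances, the information of a two-arm Z test is
# 1 / (1 / n_a + 1 / n_b), and the weights are square roots of the
# stage-wise information fractions.
stage_info <- function(n_a, n_b) 1 / (1 / n_a + 1 / n_b)

inverse_normal_weights <- function(info1, info2) {
  w1 <- sqrt(info1 / (info1 + info2))
  c(w1 = w1, w2 = sqrt(1 - w1^2))
}

# Rounded sample sizes from the Design 3 output above.
w_TP <- inverse_normal_weights(stage_info(230, 90), stage_info(202, 106))
w_TC <- inverse_normal_weights(stage_info(230, 224), stage_info(202, 191))
round(rbind(TP = w_TP, TC = w_TC), 3)

# Combined test statistic from hypothetical stage-wise Z values.
combine_z <- function(z1, z2, w) unname(w[["w1"]] * z1 + w[["w2"]] * z2)
combine_z(z1 = 1.5, z2 = 2.0, w = w_TC)
```

Because the inputs are rounded sample sizes, agreement with the printed weights can only be expected to about two decimal places.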

Design 4, \(\beta = 0.2\)

This is (approximately) the fourth line in Table 2 from the paper:

optimize_design_twostage(
  beta = 0.2,
  alternative_TP = 0.4,
  alternative_TC = 0,
  Delta = 0.2,
  print_progress = FALSE,
  binding_futility = FALSE
)
#> Sample sizes (stage 1): T: 238, P: 84, C: 241
#> Sample sizes (stage 2): T: 201, P: 122, C: 185
#> Efficacy boundaries (stage 1): Z_TP_e: 2.03084, Z_TC_e: 2.27784
#> Futility boundaries (stage 1): Z_TP_f: -0.29297, Z_TC_f: 0.57221
#> Efficacy boundaries (stage 2): Z_TP_e: 2.47898, Z_TC_e: 2.08790
#> Inverse normal combination test weights (TP): w1: 0.66534, w2: 0.74654
#> Inverse normal combination test weights (TC): w1: 0.74431, w2: 0.66783
#> Maximum overall sample size: 1071
#> Expected sample size (H1): 775.4
#> Expected sample size (H0): 672.8
#> Expected placebo group sample size (H1): 97.9
#> Expected placebo group sample size (H0): 109.3
#> Objective function value: 775.4
#> (local) type I error for TP testing: 2.50%
#> (local) type I error for TC testing: 2.43%
#> Probability of futility stop (H1): 5.33%
#> Probability of futility stop (H0): 77.96%
#> Minimum conditional power: 19.62%
#> Power: 80.01%
#> Futility boundaries: nonbinding
#> Note: Results are presented as if futility boundaries were strictly obeyed.
#> Futility testing method: always both futility tests

Design 5, \(\beta = 0.2\)

This is (approximately) the fifth line in Table 2 from the paper:

optimize_design_twostage(
  beta = 0.2,
  alternative_TP = 0.4,
  alternative_TC = 0,
  Delta = 0.2,
  print_progress = FALSE,
  binding_futility = TRUE
)
#> Sample sizes (stage 1): T: 229, P: 90, C: 231
#> Sample sizes (stage 2): T: 217, P: 107, C: 199
#> Efficacy boundaries (stage 1): Z_TP_e: 2.04659, Z_TC_e: 2.29485
#> Futility boundaries (stage 1): Z_TP_f: 0.23336, Z_TC_f: 0.75795
#> Efficacy boundaries (stage 2): Z_TP_e: 2.40505, Z_TC_e: 2.04331
#> Inverse normal combination test weights (TP): w1: 0.68710, w2: 0.72656
#> Inverse normal combination test weights (TC): w1: 0.72466, w2: 0.68911
#> Maximum overall sample size: 1073
#> Expected sample size (H1): 768.5
#> Expected sample size (H0): 619.9
#> Expected placebo group sample size (H1): 100.2
#> Expected placebo group sample size (H0): 103.5
#> Objective function value: 768.5
#> (local) type I error for TP testing: 2.50%
#> (local) type I error for TC testing: 2.50%
#> Probability of futility stop (H1): 8.33%
#> Probability of futility stop (H0): 86.28%
#> Minimum conditional power: 34.17%
#> Power: 80.16%
#> Futility boundaries: binding
#> Futility testing method: always both futility tests

Design from Table 3

Next, we will optimize a design under a combination of the null and alternative hypotheses.

Design 5, \(\beta = 0.2\), \(\lambda = 0.9\)

This is (approximately) the third line in Table 3 from the paper:

optimize_design_twostage(
  beta = 0.2,
  alternative_TP = 0.4,
  alternative_TC = 0,
  Delta = 0.2,
  print_progress = FALSE,
  binding_futility = TRUE,
  lambda = 0.9
)
#> Sample sizes (stage 1): T: 227, P: 89, C: 231
#> Sample sizes (stage 2): T: 230, P: 98, C: 213
#> Efficacy boundaries (stage 1): Z_TP_e: 2.05198, Z_TC_e: 2.26340
#> Futility boundaries (stage 1): Z_TP_f: 0.85517, Z_TC_f: 0.77016
#> Efficacy boundaries (stage 2): Z_TP_e: 2.34293, Z_TC_e: 2.06018
#> Inverse normal combination test weights (TP): w1: 0.69370, w2: 0.72026
#> Inverse normal combination test weights (TC): w1: 0.71238, w2: 0.70180
#> Maximum overall sample size: 1088
#> Expected sample size (H1): 771.1
#> Expected sample size (H0): 587.6
#> Expected placebo group sample size (H1): 98.2
#> Expected placebo group sample size (H0): 95.6
#> Objective function value: 758.0
#> (local) type I error for TP testing: 2.50%
#> (local) type I error for TC testing: 2.50%
#> Probability of futility stop (H1): 9.18%
#> Probability of futility stop (H0): 92.17%
#> Minimum conditional power: 43.67%
#> Power: 80.15%
#> Futility boundaries: binding
#> Futility testing method: always both futility tests

Design from Table 4

Now we will optimize a design under the alternative while putting an extra penalty on placebo group sample size.

Design 5, \(\beta = 0.2\), \(\kappa = 0.5\)

This is (approximately) the fourth line in Table 4 from the paper:

optimize_design_twostage(
  beta = 0.2,
  alternative_TP = 0.4,
  alternative_TC = 0,
  Delta = 0.2,
  print_progress = FALSE,
  binding_futility = TRUE,
  kappa = 0.5
)
#> Sample sizes (stage 1): T: 239, P: 75, C: 237
#> Sample sizes (stage 2): T: 211, P: 114, C: 204
#> Efficacy boundaries (stage 1): Z_TP_e: 2.03405, Z_TC_e: 2.25340
#> Futility boundaries (stage 1): Z_TP_f: 0.01742, Z_TC_f: 0.80964
#> Efficacy boundaries (stage 2): Z_TP_e: 2.46906, Z_TC_e: 2.06256
#> Inverse normal combination test weights (TP): w1: 0.65529, w2: 0.75538
#> Inverse normal combination test weights (TC): w1: 0.73076, w2: 0.68263
#> Maximum overall sample size: 1080
#> Expected sample size (H1): 767.5
#> Expected sample size (H0): 624.9
#> Expected placebo group sample size (H1): 89.9
#> Expected placebo group sample size (H0): 90.1
#> Objective function value: 812.4
#> (local) type I error for TP testing: 2.50%
#> (local) type I error for TC testing: 2.50%
#> Probability of futility stop (H1): 8.59%
#> Probability of futility stop (H0): 85.70%
#> Minimum conditional power: 31.96%
#> Power: 80.09%
#> Futility boundaries: binding
#> Futility testing method: always both futility tests
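
The printed objective value can be related to the other quantities in the output. The sketch below assumes that, with lambda left at its default, the objective is the expected sample size under H1 plus kappa times the expected placebo group sample size under H1. This form is inferred from the numbers above and is not a statement about the package internals.

```r
# Values copied from the printed output above.
asn_h1  <- 767.5  # Expected sample size (H1)
asnp_h1 <- 89.9   # Expected placebo group sample size (H1)
kappa   <- 0.5

# Assumed form of the penalized objective: ASN + kappa * ASNP under H1.
objective <- asn_h1 + kappa * asnp_h1
objective
```

Up to rounding of the inputs, this matches the objective function value in the output.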

Design from Table 5

Next, we will optimize a design under a combination of the null and alternative hypotheses while including a penalty on the placebo group sample size.

Design 5, \(\beta = 0.2\), \(\lambda = 0.9\), \(\kappa = 1\)

This is (approximately) the seventh line in Table 5 from the paper:

optimize_design_twostage(
  beta = 0.2,
  alternative_TP = 0.4,
  alternative_TC = 0,
  Delta = 0.2,
  print_progress = FALSE,
  binding_futility = TRUE,
  lambda = .9,
  kappa = 1 
)
#> Sample sizes (stage 1): T: 235, P: 71, C: 236
#> Sample sizes (stage 2): T: 222, P: 88, C: 224
#> Efficacy boundaries (stage 1): Z_TP_e: 2.05815, Z_TC_e: 2.26759
#> Futility boundaries (stage 1): Z_TP_f: 0.75006, Z_TC_f: 0.78151
#> Efficacy boundaries (stage 2): Z_TP_e: 2.34303, Z_TC_e: 2.05618
#> Inverse normal combination test weights (TP): w1: 0.67865, w2: 0.73446
#> Inverse normal combination test weights (TC): w1: 0.71693, w2: 0.69714
#> Maximum overall sample size: 1076
#> Expected sample size (H1): 776.6
#> Expected sample size (H0): 584.6
#> Expected placebo group sample size (H1): 83.8
#> Expected placebo group sample size (H0): 77.4
#> Objective function value: 846.6
#> (local) type I error for TP testing: 2.50%
#> (local) type I error for TC testing: 2.50%
#> Probability of futility stop (H1): 9.26%
#> Probability of futility stop (H0): 91.74%
#> Minimum conditional power: 40.58%
#> Power: 80.07%
#> Futility boundaries: binding
#> Futility testing method: always both futility tests

Other options

Penalizing the maximum sample size

optimize_design_twostage(
  beta = 0.2,
  alternative_TP = 0.4,
  alternative_TC = 0,
  Delta = 0.2,
  print_progress = FALSE,
  eta = 1
)
#> Sample sizes (stage 1): T: 224, P: 84, C: 248
#> Sample sizes (stage 2): T: 190, P: 55, C: 167
#> Efficacy boundaries (stage 1): Z_TP_e: 2.25324, Z_TC_e: 2.52099
#> Futility boundaries (stage 1): Z_TP_f: -0.27777, Z_TC_f: -0.06567
#> Efficacy boundaries (stage 2): Z_TP_e: 2.09262, Z_TC_e: 2.00438
#> Inverse normal combination test weights (TP): w1: 0.76715, w2: 0.64146
#> Inverse normal combination test weights (TC): w1: 0.75346, w2: 0.65750
#> Maximum overall sample size: 968
#> Expected sample size (H1): 800.9
#> Expected sample size (H0): 711.8
#> Expected placebo group sample size (H1): 94.2
#> Expected placebo group sample size (H0): 104.3
#> Objective function value: 1768.9
#> (local) type I error for TP testing: 2.50%
#> (local) type I error for TC testing: 2.49%
#> Probability of futility stop (H1): 1.30%
#> Probability of futility stop (H0): 62.00%
#> Minimum conditional power: 4.54%
#> Power: 80.13%
#> Futility boundaries: nonbinding
#> Note: Results are presented as if futility boundaries were strictly obeyed.
#> Futility testing method: always both futility tests
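
To see how eta enters, the objective value above can be rebuilt from the printed quantities. The sketch assumes the objective is the expected sample size under H1 plus eta times the maximum overall sample size. This is an inference from the output, not a documented formula.

```r
# Stage-wise sample sizes copied from the printed output above.
stage1 <- c(T = 224, P = 84, C = 248)
stage2 <- c(T = 190, P = 55, C = 167)

# Maximum overall sample size: all arms continue to the second stage.
n_max <- sum(stage1) + sum(stage2)

asn_h1 <- 800.9  # Expected sample size (H1)
eta    <- 1

# Assumed form of the penalized objective: ASN under H1 + eta * n_max.
objective <- asn_h1 + eta * n_max
c(n_max = n_max, objective = objective)
```

Compared with the unpenalized designs above, the penalty trades a somewhat larger expected sample size under H1 for a smaller maximum sample size.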

Optimizing between-group allocation with the between-stage allocation ratio fixed at 1

optimize_design_twostage(
  cT2 = 1,           # These three boundary conditions enforce a
  cP2 = quote(cP1),  # between-stage allocation ratio of one.
  cC2 = quote(cC1),  # The quote() command is necessary for this to work.
  
  bTP1f = -Inf,      # These two boundary conditions enforce no futility stops.
  bTC1f = -Inf,
  
  beta = 0.2,
  alternative_TP = 0.4,
  alternative_TC = 0,
  Delta = 0.2,
  print_progress = FALSE
)
#> Sample sizes (stage 1): T: 217, P: 87, C: 212
#> Sample sizes (stage 2): T: 217, P: 87, C: 212
#> Efficacy boundaries (stage 1): Z_TP_e: 2.06549, Z_TC_e: 2.28000
#> Futility boundaries (stage 1): Z_TP_f: -Inf, Z_TC_f: -Inf
#> Efficacy boundaries (stage 2): Z_TP_e: 2.34934, Z_TC_e: 2.10025
#> Inverse normal combination test weights (TP): w1: 0.70711, w2: 0.70711
#> Inverse normal combination test weights (TC): w1: 0.70711, w2: 0.70711
#> Maximum overall sample size: 1032
#> Expected sample size (H1): 789.8
#> Expected sample size (H0): 1029.7
#> Expected placebo group sample size (H1): 99.0
#> Expected placebo group sample size (H0): 172.3
#> Objective function value: 789.8
#> (local) type I error for TP testing: 2.50%
#> (local) type I error for TC testing: 2.50%
#> Probability of futility stop (H1): 0.00%
#> Probability of futility stop (H0): 0.00%
#> Minimum conditional power: 0.00%
#> Power: 80.09%
#> Futility boundaries: nonbinding
#> Futility testing method: always both futility tests

Using a custom objective function

You can replace the default objective function with any quoted expression. In the following example, we optimize the design parameters to minimize the expected squared sample size under the alternative hypothesis. These expressions can use internal objects created in the objective evaluation methods; see the source code of optimize_design_twostage in the optimization_methods.R file for more information. ASN, ASNP, n and final_state_probs could be useful objects for crafting a custom objective function.

optimize_design_twostage(
  beta = 0.2,
  alternative_TP = 0.4,
  alternative_TC = 0,
  Delta = 0.2,
  print_progress = FALSE,
  objective = quote((final_state_probs[["H1"]][["TP1E_TC1E"]] + final_state_probs[["H1"]][["TP1F_TC1F"]]) *
                      (n[[1]][["T"]] + n[[1]][["P"]] + n[[1]][["C"]])^2 +
                      (final_state_probs[["H1"]][["TP1E_TC12E"]] + final_state_probs[["H1"]][["TP1E_TC12F"]]) *
                      (n[[1]][["T"]] + n[[1]][["P"]] + n[[1]][["C"]] + n[[2]][["T"]] + n[[2]][["C"]])^2 +
                      (final_state_probs[["H1"]][["TP12F_TC1"]] + final_state_probs[["H1"]][["TP12E_TC12E"]] +
                         final_state_probs[["H1"]][["TP12E_TC12F"]]) *
                      (n[[1]][["T"]] + n[[1]][["P"]] + n[[1]][["C"]] + n[[2]][["T"]] + n[[2]][["P"]] + n[[2]][["C"]])^2)
)
#> Sample sizes (stage 1): T: 265, P: 86, C: 250
#> Sample sizes (stage 2): T: 157, P: 106, C: 178
#> Efficacy boundaries (stage 1): Z_TP_e: 2.04615, Z_TC_e: 2.29343
#> Futility boundaries (stage 1): Z_TP_f: 0.05568, Z_TC_f: 0.61221
#> Efficacy boundaries (stage 2): Z_TP_e: 2.40364, Z_TC_e: 2.06634
#> Inverse normal combination test weights (TP): w1: 0.70160, w2: 0.71257
#> Inverse normal combination test weights (TC): w1: 0.77851, w2: 0.62763
#> Maximum overall sample size: 1042
#> Expected sample size (H1): 776.9
#> Expected sample size (H0): 676.6
#> Expected placebo group sample size (H1): 97.0
#> Expected placebo group sample size (H0): 103.3
#> Objective function value: 636459.1
#> (local) type I error for TP testing: 2.50%
#> (local) type I error for TC testing: 2.45%
#> Probability of futility stop (H1): 4.93%
#> Probability of futility stop (H0): 82.46%
#> Minimum conditional power: 14.48%
#> Power: 80.06%
#> Futility boundaries: nonbinding
#> Note: Results are presented as if futility boundaries were strictly obeyed.
#> Futility testing method: always both futility tests
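
The structure of this custom objective can be illustrated without the package. The three possible total sample sizes follow the grouping used in the quoted expression above: stopping after stage 1, continuing without the placebo arm, and continuing with all arms. The stopping probabilities below are hypothetical and chosen only for illustration.

```r
# Rounded stage-wise sample sizes from the printed output above.
n1 <- c(T = 265, P = 86, C = 250)
n2 <- c(T = 157, P = 106, C = 178)

# Possible total sample sizes, grouped as in the custom objective.
totals <- c(
  stop_stage1     = sum(n1),
  stage2_no_P     = sum(n1) + n2[["T"]] + n2[["C"]],
  stage2_all_arms = sum(n1) + sum(n2)
)

# Hypothetical stopping probabilities (NOT taken from the design above).
probs <- c(stop_stage1 = 0.5, stage2_no_P = 0.4, stage2_all_arms = 0.1)

expected_n    <- sum(probs * totals)
expected_n_sq <- sum(probs * totals^2)
variance_n    <- sum(probs * (totals - expected_n)^2)

# E[N^2] = Var(N) + E[N]^2, so minimizing E[N^2] also penalizes variability.
all.equal(expected_n_sq, variance_n + expected_n^2)
```

Minimizing the expected squared sample size therefore favors designs whose realized sample size is both small on average and not too variable.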

References

Meis, J, Pilz, M, Herrmann, C, Bokelmann, B, Rauch, G, Kieser, M. Optimization of the two-stage group sequential three-arm gold-standard design for non-inferiority trials. Statistics in Medicine. 2023; 42(4): 536–558. doi:10.1002/sim.9630.