Qualitative interaction trees

2022-07-01

Introduction

This package provides an R implementation of the Qualitative interaction trees method (Quint).

Quint searches for qualitative interactions in two-arm randomized controlled trials using binary regression trees. The algorithm detects subgroups of clients (patients) for which one treatment is optimal and other subgroups for which the alternative treatment is optimal. Subgroups that do not differ in treatment effect are detected as well. The binary tree is grown, and the subgroups are therefore defined, in terms of pretreatment characteristics.
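
To make the idea of a qualitative interaction concrete, the following minimal sketch uses a small made-up data set. The variable names (x, treat, y) and all values are hypothetical and not taken from the package. The code only computes subgroup-wise treatment differences in base R; it is not the Quint algorithm itself.

```r
# Hypothetical toy data: x is a binary pretreatment characteristic,
# treat is the treatment arm, y is the outcome (higher is better).
toy <- data.frame(
  x     = c(1, 1, 1, 1, 2, 2, 2, 2),
  treat = c(1, 1, 2, 2, 1, 1, 2, 2),
  y     = c(8, 6, 3, 1, 2, 4, 7, 9)
)

# Mean outcome for every combination of subgroup and treatment
means <- tapply(toy$y, list(subgroup = toy$x, treatment = toy$treat), mean)

# Treatment difference (treatment 1 minus treatment 2) within each subgroup
effect <- means[, "1"] - means[, "2"]

# Treatment difference ignoring the subgroups
overall <- mean(toy$y[toy$treat == 1]) - mean(toy$y[toy$treat == 2])

print(means)
print(effect)
print(overall)
```

In this toy example the overall treatment difference is zero, yet treatment 1 is better in one subgroup and treatment 2 is better in the other. A sign change of this kind is what is meant by a qualitative interaction.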

Further information

For further explanation of the Quint algorithm, the R package, and a real-data example, see the linked article.

Example on how to use Quint in R

In this example we use data from a three-arm randomized controlled trial. The data contain information on women with early-stage breast cancer who were assigned randomly to different treatments. These treatments are a nutrition intervention, an education intervention, and standard care.

Quint is a method for two-arm randomized controlled trials. Therefore, in this example we restrict ourselves to the first two treatments, that is, the nutrition intervention and the education intervention.

library(quint)
data(bcrp)
head(bcrp)
##     physt1 cesdt1   physt3 cesdt3 negsoct1 uncomt1 disopt1 comorbid      age
## 1 37.65374     14 52.62905      4        9      28      14        6 29.48392
## 2 53.64822     10 51.18797     14        7      36      10        2 44.66256
## 3 63.84140      8 66.45392      9        6      29      15        1 43.09925
## 4 38.72757      2 45.99656      4        5      30      17       13 46.93498
## 5 55.85700      0 60.66603      1       10      32      22        1 41.09514
## 6 53.06922      0 54.83928      0        8      24      21        3 44.44627
##   wcht1 nationality marital     trext cond
## 1     1           1       0 0.2589759    3
## 2     1           1       1 0.5557208    1
## 3     1           1       0 0.2589759    2
## 4     1           1       1 0.5557208    2
## 5     0           1       1 0.5557208    2
## 6     1           1       1 0.2589759    3
bcrp2arm <- subset(bcrp, cond < 3)  # keep only treatment conditions 1 and 2
head(bcrp2arm)
##     physt1 cesdt1   physt3 cesdt3 negsoct1 uncomt1 disopt1 comorbid      age
## 2 53.64822     10 51.18797     14        7      36      10        2 44.66256
## 3 63.84140      8 66.45392      9        6      29      15        1 43.09925
## 4 38.72757      2 45.99656      4        5      30      17       13 46.93498
## 5 55.85700      0 60.66603      1       10      32      22        1 41.09514
## 7 47.87246      4 57.97079      3        9      22      20        1 45.76044
## 8 48.59693      4 58.66519      2        9      31      16        6 36.84052
##   wcht1 nationality marital      trext cond
## 2     1           1       1  0.5557208    1
## 3     1           1       0  0.2589759    2
## 4     1           1       1  0.5557208    2
## 5     0           1       1  0.5557208    2
## 7     0           1       0 -1.7742771    2
## 8     0           1       1  0.2589759    1

As shown above, the data contain the baseline measurements (t1), the 9-month follow-up measurements (t3), and the pretreatment characteristics of the patients. For more information about the data, read the bcrp help file or print a summary.

summary(bcrp2arm)
##      physt1          cesdt1           physt3          cesdt3      
##  Min.   :24.05   Min.   : 0.000   Min.   :24.64   Min.   : 0.000  
##  1st Qu.:44.45   1st Qu.: 2.000   1st Qu.:51.20   1st Qu.: 1.000  
##  Median :49.90   Median : 5.000   Median :55.37   Median : 3.000  
##  Mean   :49.52   Mean   : 6.417   Mean   :54.47   Mean   : 4.716  
##  3rd Qu.:56.31   3rd Qu.:10.000   3rd Qu.:58.64   3rd Qu.: 7.000  
##  Max.   :67.40   Max.   :24.000   Max.   :67.43   Max.   :21.000  
##                                   NA's   :20      NA's   :20      
##     negsoct1         uncomt1         disopt1         comorbid     
##  Min.   : 5.000   Min.   :15.00   Min.   : 5.00   Min.   : 0.000  
##  1st Qu.: 6.000   1st Qu.:26.00   1st Qu.:15.00   1st Qu.: 0.000  
##  Median : 7.500   Median :29.00   Median :17.00   Median : 2.000  
##  Mean   : 7.821   Mean   :29.45   Mean   :16.74   Mean   : 2.405  
##  3rd Qu.: 9.000   3rd Qu.:33.00   3rd Qu.:19.00   3rd Qu.: 4.000  
##  Max.   :16.000   Max.   :42.00   Max.   :24.00   Max.   :13.000  
##                                                                   
##       age            wcht1         nationality        marital     
##  Min.   :29.29   Min.   :0.0000   Min.   :0.0000   Min.   :0.000  
##  1st Qu.:41.06   1st Qu.:0.0000   1st Qu.:1.0000   1st Qu.:0.000  
##  Median :44.81   Median :0.0000   Median :1.0000   Median :1.000  
##  Mean   :43.96   Mean   :0.4643   Mean   :0.9286   Mean   :0.744  
##  3rd Qu.:48.17   3rd Qu.:1.0000   3rd Qu.:1.0000   3rd Qu.:1.000  
##  Max.   :51.40   Max.   :1.0000   Max.   :1.0000   Max.   :1.000  
##                                                                   
##      trext               cond      
##  Min.   :-1.77428   Min.   :1.000  
##  1st Qu.: 0.25898   1st Qu.:1.000  
##  Median : 0.25898   Median :1.000  
##  Mean   : 0.03413   Mean   :1.494  
##  3rd Qu.: 0.55572   3rd Qu.:2.000  
##  Max.   : 2.58897   Max.   :2.000  
## 

Once we have the data, we need to set the control parameters of the quint algorithm. In this case, we use the default values for the following arguments: the minimal sample size of patients assigned to treatment 1 and treatment 2 in each leaf (10% of the sample size of each treatment), the weights, and the minimum absolute standardized mean outcome difference (dmin = 0.3).

We set three other arguments explicitly: the type of treatment outcome difference used in the partitioning criterion is the difference in treatment means (crit = "dm"), the maximum number of leaves is 5 (maxl = 5), and the bias-corrected bootstrap procedure uses 10 bootstrap samples (B = 10; the default number is 25).

control1 <- quint.control(crit = "dm", maxl = 5, B = 10)

We also need to define the formula to be used in quint. The formula contains first the outcome variable (in this example, the difference between the outcome measure at baseline and at the 9-month follow-up), then the variable indicating the treatment, and finally, after the vertical bar, the pretreatment characteristics.

formula1 <- I(cesdt1 - cesdt3) ~ cond | nationality + marital + wcht1 +
  age + trext + comorbid + disopt1 + uncomt1 + negsoct1
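
The three parts of such a formula can be inspected with base R alone. The sketch below uses a shortened, illustrative version of the formula (the object names f, outcome, rhs, treatment, and predictors are introduced here only for illustration). It shows how R parses the expression; it makes no claim about how quint processes the formula internally.

```r
# Shortened version of the formula, with only two pretreatment characteristics
f <- I(cesdt1 - cesdt3) ~ cond | age + disopt1

outcome    <- f[[2]]    # left-hand side of `~`
rhs        <- f[[3]]    # right-hand side, parsed as a call to `|`
treatment  <- rhs[[2]]  # part before the vertical bar
predictors <- rhs[[3]]  # part after the vertical bar

print(deparse(outcome))      # outcome expression
print(deparse(treatment))    # treatment variable
print(all.vars(predictors))  # pretreatment characteristics
```

Because `|` binds more tightly than `~` in R, the whole expression after the tilde is kept together as one call, which is what allows the treatment variable and the pretreatment characteristics to be separated by the bar.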

Now we are ready to run quint.

set.seed(2)
quint1 <- quint(formula1, data = bcrp2arm, control = control1)
## Treatment variable (T) equals 1 corresponds to cond = 1 
## Treatment variable (T) equals 2 corresponds to cond = 2 
## The sample size in the analysis is 148 
## split 1 
## #leaves is 2 
## Bootstrap sample  1 
## Bootstrap sample  2 
## Bootstrap sample  3 
## Bootstrap sample  4 
## Bootstrap sample  5 
## Bootstrap sample  6 
## Bootstrap sample  7 
## Bootstrap sample  8 
## Bootstrap sample  9 
## Bootstrap sample  10 
## current value of C 3.233541 
## split 2 
## #leaves is 3 
## Bootstrap sample  1 
## Bootstrap sample  2 
## Bootstrap sample  3 
## Bootstrap sample  4 
## Bootstrap sample  5 
## Bootstrap sample  6 
## Bootstrap sample  7 
## Bootstrap sample  8 
## Bootstrap sample  9 
## Bootstrap sample  10 
## current value of C 3.481599 
## split 3 
## #leaves is 4 
## Bootstrap sample  1 
## Bootstrap sample  2 
## Bootstrap sample  3 
## Bootstrap sample  4 
## Bootstrap sample  5 
## Bootstrap sample  6 
## Bootstrap sample  7 
## Bootstrap sample  8 
## Bootstrap sample  9 
## Bootstrap sample  10 
## current value of C 3.636685 
## split 4 
## #leaves is 5 
## Bootstrap sample  1 
## Bootstrap sample  2 
## Bootstrap sample  3 
## Bootstrap sample  4 
## Bootstrap sample  5 
## Bootstrap sample  6 
## Bootstrap sample  7 
## Bootstrap sample  8 
## Bootstrap sample  9 
## Bootstrap sample  10
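
The log reports a sample size of 148. The summary of bcrp2arm shown earlier lists 20 missing values for the follow-up measure cesdt3, so the outcome cesdt1 - cesdt3 cannot be computed for those patients; this presumably explains why fewer patients enter the analysis than there are rows in the data. The following base R sketch illustrates the effect with a small made-up data frame (all names and values are hypothetical).

```r
# Hypothetical toy data with two missing follow-up values
toy <- data.frame(
  pre  = c(10, 8, 2, 0, 4),
  post = c(14, NA, 4, 1, NA),
  cond = c(1, 2, 2, 1, 2)
)

# The change score is NA whenever the follow-up value is missing
toy$change <- toy$pre - toy$post

n_total    <- nrow(toy)
n_complete <- sum(complete.cases(toy[, c("change", "cond")]))

print(c(total = n_total, complete = n_complete))
```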

After growing the binary tree using quint, we need to prune the tree. This is done using the prune function.

quint1pr <- prune(quint1)
## The sample size in the analysis is 148 
## split 1 
## #leaves is 2 
## current value of C 3.233541 
## split 2 
## #leaves is 3 
## current value of C 3.481599 
## split 3 
## #leaves is 4

If the standard errors of the mean differences (and therefore the confidence intervals) appear to be too small, we can use the function quint.bootstrapCI, which estimates the confidence intervals of the mean difference in each leaf using a bootstrap-based algorithm (up to version 2.1.0 it does not work with effect sizes). Here we request 5 bootstrap samples (n_boot = 5).

quint1pr_bootCI <- quint.bootstrapCI(quint1pr, n_boot = 5)
## Treatment variable (T) equals 1 corresponds to cond = 1 
## Treatment variable (T) equals 2 corresponds to cond = 2 
## The sample size in the analysis is 148 
## split 1 
## #leaves is 2 
## Bootstrap sample  1 
## Bootstrap sample  2 
## Bootstrap sample  3 
## Bootstrap sample  4 
## Bootstrap sample  5 
## Bootstrap sample  6 
## Bootstrap sample  7 
## Bootstrap sample  8 
## Bootstrap sample  9 
## Bootstrap sample  10 
## current value of C 3.643239 
## split 2 
## #leaves is 3 
## Bootstrap sample  1 
## Bootstrap sample  2 
## Bootstrap sample  3 
## Bootstrap sample  4 
## Bootstrap sample  5 
## Bootstrap sample  6 
## Bootstrap sample  7 
## Bootstrap sample  8 
## Bootstrap sample  9 
## Bootstrap sample  10 
## current value of C 3.935028 
## split 3 
## #leaves is 4 
## Bootstrap sample  1 
## Bootstrap sample  2 
## Bootstrap sample  3 
## Bootstrap sample  4 
## Bootstrap sample  5 
## Bootstrap sample  6 
## Bootstrap sample  7 
## Bootstrap sample  8 
## Bootstrap sample  9 
## Bootstrap sample  10 
## The sample size in the analysis is 148 
## split 1 
## #leaves is 2 
## current value of C 3.643239 
## split 2 
## #leaves is 3 
## current value of C 3.935028 
## split 3 
## #leaves is 4 
## Treatment variable (T) equals 1 corresponds to cond = 1 
## Treatment variable (T) equals 2 corresponds to cond = 2 
## The sample size in the analysis is 148 
## split 1 
## #leaves is 2 
## Bootstrap sample  1 
## Bootstrap sample  2 
## Bootstrap sample  3 
## Bootstrap sample  4 
## Bootstrap sample  5 
## Bootstrap sample  6 
## Bootstrap sample  7 
## Bootstrap sample  8 
## Bootstrap sample  9 
## Bootstrap sample  10 
## current value of C 3.425667 
## split 2 
## #leaves is 3 
## Bootstrap sample  1 
## Bootstrap sample  2 
## Bootstrap sample  3 
## Bootstrap sample  4 
## Bootstrap sample  5 
## Bootstrap sample  6 
## Bootstrap sample  7 
## Bootstrap sample  8 
## Bootstrap sample  9 
## Bootstrap sample  10 
## current value of C 3.606798 
## split 3 
## #leaves is 4 
## Bootstrap sample  1 
## Bootstrap sample  2 
## Bootstrap sample  3 
## Bootstrap sample  4 
## Bootstrap sample  5 
## Bootstrap sample  6 
## Bootstrap sample  7 
## Bootstrap sample  8 
## Bootstrap sample  9 
## Bootstrap sample  10 
## The sample size in the analysis is 148 
## split 1 
## #leaves is 2 
## current value of C 3.425667 
## split 2 
## #leaves is 3 
## current value of C 3.606798 
## split 3 
## #leaves is 4 
## Treatment variable (T) equals 1 corresponds to cond = 1 
## Treatment variable (T) equals 2 corresponds to cond = 2 
## The sample size in the analysis is 148 
## split 1 
## #leaves is 2 
## Bootstrap sample  1 
## Bootstrap sample  2 
## Bootstrap sample  3 
## Bootstrap sample  4 
## Bootstrap sample  5 
## Bootstrap sample  6 
## Bootstrap sample  7 
## Bootstrap sample  8 
## Bootstrap sample  9 
## Bootstrap sample  10 
## current value of C 3.591496 
## split 2 
## #leaves is 3 
## Bootstrap sample  1 
## Bootstrap sample  2 
## Bootstrap sample  3 
## Bootstrap sample  4 
## Bootstrap sample  5 
## Bootstrap sample  6 
## Bootstrap sample  7 
## Bootstrap sample  8 
## Bootstrap sample  9 
## Bootstrap sample  10 
## current value of C 3.993374 
## split 3 
## #leaves is 4 
## splitting process stopped after number of leaves equals 3 because new value of C was not higher than current value of C 
## The sample size in the analysis is 148 
## split 1 
## #leaves is 2 
## current value of C 3.591496 
## split 2 
## #leaves is 3 
## Treatment variable (T) equals 1 corresponds to cond = 1 
## Treatment variable (T) equals 2 corresponds to cond = 2 
## The sample size in the analysis is 148 
## split 1 
## #leaves is 2 
## Bootstrap sample  1 
## Bootstrap sample  2 
## Bootstrap sample  3 
## Bootstrap sample  4 
## Bootstrap sample  5 
## Bootstrap sample  6 
## Bootstrap sample  7 
## Bootstrap sample  8 
## Bootstrap sample  9 
## Bootstrap sample  10 
## current value of C 3.511183 
## split 2 
## #leaves is 3 
## Bootstrap sample  1 
## Bootstrap sample  2 
## Bootstrap sample  3 
## Bootstrap sample  4 
## Bootstrap sample  5 
## Bootstrap sample  6 
## Bootstrap sample  7 
## Bootstrap sample  8 
## Bootstrap sample  9 
## Bootstrap sample  10 
## current value of C 3.80894 
## split 3 
## #leaves is 4 
## Bootstrap sample  1 
## Bootstrap sample  2 
## Bootstrap sample  3 
## Bootstrap sample  4 
## Bootstrap sample  5 
## Bootstrap sample  6 
## Bootstrap sample  7 
## Bootstrap sample  8 
## Bootstrap sample  9 
## Bootstrap sample  10 
## The sample size in the analysis is 148 
## split 1 
## #leaves is 2 
## current value of C 3.511183 
## split 2 
## #leaves is 3 
## current value of C 3.80894 
## split 3 
## #leaves is 4 
## Treatment variable (T) equals 1 corresponds to cond = 1 
## Treatment variable (T) equals 2 corresponds to cond = 2 
## The sample size in the analysis is 148 
## split 1 
## #leaves is 2 
## Bootstrap sample  1 
## Bootstrap sample  2 
## Bootstrap sample  3 
## Bootstrap sample  4 
## Bootstrap sample  5 
## Bootstrap sample  6 
## Bootstrap sample  7 
## Bootstrap sample  8 
## Bootstrap sample  9 
## Bootstrap sample  10 
## current value of C 3.4231 
## split 2 
## #leaves is 3 
## Bootstrap sample  1 
## Bootstrap sample  2 
## Bootstrap sample  3 
## Bootstrap sample  4 
## Bootstrap sample  5 
## Bootstrap sample  6 
## Bootstrap sample  7 
## Bootstrap sample  8 
## Bootstrap sample  9 
## Bootstrap sample  10 
## current value of C 3.819557 
## split 3 
## #leaves is 4 
## Bootstrap sample  1 
## Bootstrap sample  2 
## Bootstrap sample  3 
## Bootstrap sample  4 
## Bootstrap sample  5 
## Bootstrap sample  6 
## Bootstrap sample  7 
## Bootstrap sample  8 
## Bootstrap sample  9 
## Bootstrap sample  10 
## The sample size in the analysis is 148 
## split 1 
## #leaves is 2 
## current value of C 3.4231 
## split 2 
## #leaves is 3 
## current value of C 3.819557 
## split 3 
## #leaves is 4
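
The log above suggests that every bootstrap replicate repeats the tree-growing and pruning steps. As a much simpler illustration of the general idea behind a bootstrap confidence interval for a mean difference, the following base R sketch applies a plain percentile bootstrap to made-up outcome values in a single hypothetical leaf. It is not the algorithm used by quint.bootstrapCI; the object names and all values are assumptions made for illustration.

```r
set.seed(1)

# Hypothetical outcomes in one leaf
y1 <- c(5, 7, 3, 9, 6, 8, 4, 10)  # patients assigned to treatment 1
y2 <- c(1, 2, 0, 3, 2, 4, 1, 3)   # patients assigned to treatment 2

n_boot <- 200

# Resample each arm with replacement and recompute the mean difference
boot_diff <- replicate(
  n_boot,
  mean(sample(y1, replace = TRUE)) - mean(sample(y2, replace = TRUE))
)

# Percentile interval from the bootstrap distribution
ci <- quantile(boot_diff, probs = c(0.025, 0.975))

print(mean(y1) - mean(y2))  # observed mean difference
print(ci)
```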

To obtain the key results of the method, we use the summary function.

summary(quint1pr)  #without bootstrap-based confidence intervals 
## Partitioning criterion: Difference in treatment means criterion 
##  
## Fit information: 
##                Criterion 
##                 - - - - 
##  split #leaves apparent biascorrected   se
##      1       2     3.23          2.72 0.16
##      2       3     3.48          2.80 0.16
##      3       4     3.64          2.93 0.11
## 
## Split information: 
##         parentnode childnodes splittingvar splitpoint
## Split 1          1        2,3      disopt1       18.5
## Split 2          2        4,5     negsoct1        5.5
## Split 3          5      10,11      uncomt1       34.5
## 
## Leaf information: 
##        #(T=1) meanY|T=1 SD|T=1 #(T=2) meanY|T=2 SD|T=2  diff   se class
## Leaf 1     11      1.00   3.10      7      3.71   4.75 -2.71 1.84     2
## Leaf 2     35      3.23   5.76     32      1.03   5.49  2.20 1.38     3
## Leaf 3     13      5.08   6.76      7     -6.00   4.04 11.08 2.81     1
## Leaf 4     19     -0.32   4.41     24      1.25   2.69 -1.57 1.09     2
summary(quint1pr_bootCI$tree) #with bootstrap-based confidence intervals 
## Partitioning criterion: Difference in treatment means criterion 
##  
## Fit information: 
##                Criterion 
##                 - - - - 
##  split #leaves apparent biascorrected   se
##      1       2     3.23          2.72 0.16
##      2       3     3.48          2.80 0.16
##      3       4     3.64          2.93 0.11
## 
## Split information: 
##         parentnode childnodes splittingvar splitpoint
## Split 1          1        2,3      disopt1       18.5
## Split 2          2        4,5     negsoct1        5.5
## Split 3          5      10,11      uncomt1       34.5
## 
## Leaf information: 
##        #(T=1) meanY|T=1 SD|T=1 #(T=2) meanY|T=2 SD|T=2  diff   se class
## Leaf 1     11      1.00   3.10      7      3.71   4.75 -2.71 1.67     2
## Leaf 2     35      3.23   5.76     32      1.03   5.49  2.20 2.37     3
## Leaf 3     13      5.08   6.76      7     -6.00   4.04 11.08 3.09     1
## Leaf 4     19     -0.32   4.41     24      1.25   2.69 -1.57 1.21     2
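
The diff and se columns of the leaf table from summary(quint1pr) can be checked by hand from the other columns. The sketch below assumes that se is the pooled-variance standard error of the difference in means. This is an assumption rather than something taken from the package documentation, but it reproduces the printed values to two decimals.

```r
# Values copied from the leaf table of summary(quint1pr)
n1 <- c(11, 35, 13, 19)            # #(T=1)
m1 <- c(1.00, 3.23, 5.08, -0.32)   # meanY|T=1
s1 <- c(3.10, 5.76, 6.76, 4.41)    # SD|T=1
n2 <- c(7, 32, 7, 24)              # #(T=2)
m2 <- c(3.71, 1.03, -6.00, 1.25)   # meanY|T=2
s2 <- c(4.75, 5.49, 4.04, 2.69)    # SD|T=2

# Difference in treatment means per leaf
mean_diff <- m1 - m2

# Pooled standard deviation and standard error of the difference (assumed formula)
sp <- sqrt(((n1 - 1) * s1^2 + (n2 - 1) * s2^2) / (n1 + n2 - 2))
se <- sp * sqrt(1 / n1 + 1 / n2)

print(round(mean_diff, 2))
print(round(se, 2))
```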

Another way to obtain information about the solution is to plot the pruned tree.

plot(quint1pr) #without bootstrap-based confidence intervals 

plot(quint1pr_bootCI$tree) #with bootstrap-based confidence intervals 

With these results we can conclude that the first treatment (nutrition) works better than the second treatment (education) for patients with dispositional optimism at baseline (disopt1) less than or equal to 18.5, negative social interaction at baseline (negsoct1) greater than 5.5, and unmitigated communion at baseline (uncomt1) greater than 34.5. Treatment 2 (education) works better for patients with disopt1 greater than 18.5, and for patients with disopt1 less than or equal to 18.5 and negsoct1 less than or equal to 5.5. The remaining patients could be assigned to either of the two treatments, as the treatment effect does not differ for them according to quint. The standard errors differ between the two methods: the bootstrap-based standard errors are larger in Leaves 2, 3, and 4 (2.37, 3.09, and 1.21 versus 1.38, 2.81, and 1.09) and slightly smaller in Leaf 1 (1.67 versus 1.84), so the bootstrap-based confidence intervals are wider for most leaves.
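
The subgroup rules stated above can be written as a small helper function. The sketch below is only an illustration: the function name recommend is hypothetical, the split points are copied from the split information in the summary output, and the labels assume that cond = 1 is the nutrition intervention and cond = 2 the education intervention, as in the text.

```r
# Map baseline values to the treatment suggested by the pruned tree
recommend <- function(disopt1, negsoct1, uncomt1) {
  ifelse(
    disopt1 > 18.5, "education",        # Leaf 4
    ifelse(
      negsoct1 <= 5.5, "education",     # Leaf 1
      ifelse(uncomt1 > 34.5,
             "nutrition",               # Leaf 3
             "either")                  # Leaf 2
    )
  )
}

# Four hypothetical patients, one per leaf
patients <- data.frame(
  disopt1  = c(20, 15, 15, 15),
  negsoct1 = c(7, 5, 7, 7),
  uncomt1  = c(30, 30, 40, 30)
)

advice <- recommend(patients$disopt1, patients$negsoct1, patients$uncomt1)
print(advice)
```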

The results might be slightly different for R versions below 3.6.0.