Modifying nll functions

library(anovir)

Introduction

The formals for the negative log-likelihood (nll) models in this package contain a list of arguments which provide the information necessary to calculate the nll returned by the function; these arguments include those taking values for the variables to be estimated by maximum likelihood.

Within the nll_function the values of the variables to estimate are assigned to parameters used in calculating the nll. By default, each parameter is defined by a function taking as input the value from a single argument.

Hence in the default form of a nll_function, the number of parameters to calculate equals the number of variables to estimate.

The number of parameters to be calculated by each nll_function cannot be modified. However, the functions defining parameters can be modified to make them depend on values of more than one variable. The advantage of this is to increase the flexibility of nll_functions and the patterns of mortality they can describe.

Modifying nll_basic

The following example illustrates how to modify the default version of nll_basic, such that, instead of estimating the location parameter for mortality due to infection, a2, as a constant it is made a function of the dose of spores to which infected hosts were exposed;

\[ a2 \rightarrow c1 + c2 \cdot \log\left(dose\right) \]

The default form of nll_basic estimates the location parameter for mortality due to infection as a constant (a2); the modified version of nll_basic estimates it as a linear function the dose of spores infected hosts were exposed to c1 + c2 * log(dose). The data are from a study by Lorenz & Koella [1,2].

The formals for nll_basic are,

utils::str(nll_basic)
#> function (a1 = a1, b1 = b1, a2 = a2, b2 = b2, data = data, time = time, 
#>     censor = censor, infected_treatment = infected_treatment, d1 = "Weibull", 
#>     d2 = "Weibull")

The first four arguments are for, a1, b1, a2, b2; the values given to these arguments are the variables to be estimated by maximum likelihood for the default form of nll_basic.

The values of these variables will be assigned to functions describing the location and scale parameters for background mortality and mortality due to infection, respectively.

By default, these parameter functions are functions taking as input the value from a single argument. This can be seen at the top of the body of nll_basic,

head(body(nll_basic), 5)
#> {
#>     pfa1 <- a1
#>     pfb1 <- b1
#>     pfa2 <- a2
#>     pfb2 <- b2
#> }

where the values of a1, b1, a2, b2 are assigned to pfa1, pfb1, pfa2, pfb2, respectively; the prefix 'pf' stands for 'parameter function'.

The values held by these parameter functions are taken as input for the survival functions in the log-likelihood expression of nll_basic.

Default: 4 variables to estimate

    head(data_lorenz, 2)
#>   Infectious.dose Food Sex Spore.Count    t censored d g
#> 1           10000  100   F      425000 13.0        0 1 1
#> 2           10000   50   F       22000  3.5        0 1 1

    # step #1
       m01_prep_function <- function(a1 = a1, b1 = b1, a2 = a2, b2 = b2){
         nll_basic(a1 = a1, b1 = b1, a2 = a2, b2 = b2,
           data = data_lorenz, time = t, censor = censored,
           infected_treatment = g, d1 = 'Gumbel', d2 = 'Weibull')
       }
    # step #2
       m01 <- mle2(m01_prep_function,
                start = list(a1 = 23, b1 = 5, a2 = 3, b2 = 0.2)
                )
       coef(m01)
#>         a1         b1         a2         b2 
#> 23.2158992  4.6749238  3.0198321  0.2107132

Modified: 5 variables to estimate

The following steps make 'pfa2' a function of log(Infectious.dose), with c1 and c2 as variables to estimate;

  # copy/rename 'nll_function' (not obligatory, but recommended)
    nll_basic2 <- nll_basic
  # find/check location of code to be replaced. NB double '[['
    body(nll_basic2)[[4]]
#> pfa2 <- a2
  # replace default code with new code for function
    body(nll_basic2)[[4]] <- substitute(pfa2 <- c1 + c2 * log(data$Infectious.dose))
  # check code
    head(body(nll_basic2), 5)
#> {
#>     pfa1 <- a1
#>     pfb1 <- b1
#>     pfa2 <- c1 + c2 * log(data$Infectious.dose)
#>     pfb2 <- b2
#> }
  # replace argument 'a2' with those for 'c1', 'c2'. NB use of 'alist'
    formals(nll_basic2) <- alist(a1 = a1, b1 = b1, c1 = c1, c2 = c2, b2 = b2,
                            data = data, time = time, censor = censor,
                            infected_treatment = infected_treatment, d1 = "", d2 = "")
  # new analysis: step #1
    m02_prep_function <- function(a1 = a1, b1 = b1, c1 = c1, c2 = c2, b2 = b2){
      nll_basic2(a1 = a1, b1 = b1, c1 = c1, c2 = c2, b2 = b2,
           data = data_lorenz, time = t, censor = censored,
           infected_treatment = g, d1 = 'Gumbel', d2 = 'Weibull')
         }
  # step #2
    m02 <- mle2(m02_prep_function,
                start = list(a1 = 23, b1 = 5, c1 = 4, c2 = -0.1, b2 = 0.2)
                )
    coef(m02)
#>          a1          b1          c1          c2          b2 
#> 23.19404801  4.71806515  3.82690445 -0.07969278  0.18542088
    
  # compare results
    AICc(m01, m02, nobs = 256)
#>      AICc df
#> 1 1536.58  4
#> 2 1516.43  5
  # according to AICc m02 is better than m01  
    

The names of the data frame columns to use can also be modified when changing a functions' formals, thus reducing the code needed when preparing the function in 'step #1'

  # copy/rename nll_function
    nll_basic3 <- nll_basic
    body(nll_basic3)[[4]] <- substitute(pfa2 <- c1 + c2 * log(data$Infectious.dose))

  # replace argument 'a2' with those for 'c1', 'c2', and assign column names  
    formals(nll_basic3) <- alist(a1 = a1, b1 = b1, c1 = c1, c2 = c2, b2 = b2,
                            data = data_lorenz, time = t, censor = censored,
                            infected_treatment = g, d1 = "Gumbel", d2 = "Weibull")
  # new analysis: step #1
    m03_prep_function <- function(a1 = a1, b1 = b1, c1 = c1, c2 = c2, b2 = b2){
      nll_basic3(a1 = a1, b1 = b1, c1 = c1, c2 = c2, b2 = b2)
      }
  # step #2
    m03 <- mle2(m03_prep_function,
                start = list(a1 = 23, b1 = 5, c1 = 4, c2 = -0.1, b2 = 0.2)
                )
    coef(m03)
#>          a1          b1          c1          c2          b2 
#> 23.19404801  4.71806515  3.82690445 -0.07969278  0.18542088
    identical(coef(m02), coef(m03))
#> [1] TRUE

back to top

References

1. Lorenz LM, Koella JC. 2011 The microsporidian parasite vavraia culicis as a potential late life-acting control agent of malaria. Evolutionary Applications 4, 783–790. (doi:10.1111/j.1752-4571.2011.00199.x)

2. Lorenz LM, Koella JC. 2011 Data from: The microsporidian parasite vavraia culicis as a potential late life-acting control agent of malaria. Dryad Digital Repository (doi:10.5061/dryad.2s231)