1 Location of vignette source and code.

Because of the length of time needed to run the vignettes, only static vignettes have been included with this package.

The original of the vignettes and the code can be obtained from the GitHub site at https://github.com/cschwarz-stat-sfu-ca/BTSPAS

2 Introduction

This case represents a generalization of the diagonal case considered in a separate vignette. Now, rather than assuming that all recaptures from a release take place in a single recovery stratum, recoveries could take place over multiple recovery strata.

Again, consider an experiment to estimate the number of outgoing smolts on a small river. The run of smolts extends over several weeks. As smolts migrate, they are captured and marked with individually numbered tags and released at the first capture location using, for example, a fishwheel. The migration continues, and a second fishwheel takes a second sample several kilometers down stream. At the second fishwheel, the captures consist of a mixture of marked (from the first fishwheel) and unmarked fish.

The efficiency of the fishwheels varies over time in response to stream flow, run size passing the wheel and other uncontrollable events. So it is unlikely that the capture probabilities are equal over time at either location, i.e. are heterogeneous over time.

We suppose that we can temporally stratify the data into, for example, weeks, where the capture-probabilities are (mostly) homogeneous at each wheel in each week.

But now, we allow tagged animals to be captured in several recovery strata. For example, suppose that in each julian week $$j$$, $$n1[j]$$ fish are marked and released above the rotary screw trap. Of these, $$m2[j,j]$$ are recaptured in julian week $$j$$; $$m2[j,j+1]$$ are recaptured in julian week $$j+1$$; $$m2[j,j+2]$$ are recaptured in julian week $$j+2$$ and so on.

At the same time, $$u2[j]$$ unmarked fish are captured at the screw trap.

This implies that the data can be structured as a non-diagonal array similar to:

Recovery Stratum
tagged    rs1      rs2     rs3 ...rs4                 rsk  rs(k+1)
Marking   ms1    n1[1]  m2[1,1] m2[1,2] m2[1,3] m2[1,4]      0  ...   0      0
Stratum   ms2    n1[2]   0      m2[2,2] m2[2,3] m2[2,4] .... 0  ...   0      0
ms3    n1[3]   0       0      m2[3,3] m2[3,4] ...  0  ...   0      0
...
msk    n1[k]   0       0      0  ...  0            0    m2[k,k] m2[k,k+1]
Newly
Untagged               u2[1]   u2[2]   u2[3]  ...                 u2[k]   u2[k,k+1]
captured  

Here the tagging and recapture events have been stratified in to $$k$$ temporal strata. Marked fish from one stratum tend to spread out and are recaptured over multiple strata. Several additional recovery strata are needed at the end of the experiment to fully capture the final release stratum.

Because the lower diagonal of the recovery matrix is zero, the data can be entered in a shorthand fashion by showing the recoveries in the same stratum as release, the next stratum, etc, up to a maximum number of recovery strata per release.

2.1 Fixing values of $$p$$ or using covariates.

Refer to the vignette on the Diagonal Case for information about fixing values of $$p$$ or modelling $$p$$ using covariates such a stream flow or smoothing $$p$$ using a temporal spline.

3 Example of basic non-diagonal BTSPAS fit.

Here is an example of some raw data that is read in:

demo.data.csv <- textConnection(
"Date        ,     n1  , X0  , X1  , X2  , X3  , X4
1987-04-26  ,      8  ,  0  ,  0  ,  0  ,  0  ,  2
1987-04-27  ,      5  ,  0  ,  0  ,  0  ,  0  ,  0
1987-04-28  ,      6  ,  0  ,  0  ,  0  ,  0  ,  0
1987-04-29  ,     17  ,  0  ,  0  ,  2  ,  1  ,  1
1987-04-30  ,     66  ,  0  ,  1  ,  0  ,  2  ,  3
1987-05-01  ,    193  ,  0  ,  1  ,  7  ,  7  ,  2
1987-05-02  ,     90  ,  0  ,  2  ,  0  ,  0  ,  0
1987-05-03  ,    260  ,  0  ,  0  , 14  ,  6  ,  1
1987-05-04  ,    368  ,  0  ,  9  , 46  ,  4  ,  2
1987-05-05  ,    506  ,  0  , 38  , 33  , 11  ,  0
1987-05-06  ,    317  ,  1  , 27  , 26  ,  3  ,  1
1987-05-07  ,     43  ,  0  ,  4  ,  3  ,  0  ,  2
1987-05-08  ,    259  ,  1  , 42  ,  5  ,  2  ,  0
1987-05-09  ,    259  ,  1  , 32  , 27  ,  1  ,  0
1987-05-10  ,    249  ,  1  , 85  ,  3  ,  1  ,  0
1987-05-11  ,    250  ,  3  , 21  , 19  ,  2  ,  0
1987-05-12  ,    298  , 42  , 16  , 11  ,  9  ,  1
1987-05-13  ,    250  ,  1  ,  7 ,  25  ,  6  ,  4
1987-05-14  ,    193  ,  0  ,  9 ,  18  ,  8  ,  0
1987-05-15  ,    207  ,  0  , 17  , 21  ,  2  ,  0
1987-05-16  ,    175  ,  0  , 18  , 10  ,  1  ,  0
1987-05-17  ,    141  ,  0  , 12  , 14  ,  7  ,  1
1987-05-18  ,    155  ,  0  ,  1  , 19  , 13  ,  6
1987-05-19  ,    123  ,  0  ,  5  , 22  ,  5  ,  0
1987-05-20  ,    128  ,  0  ,  6  , 17  ,  2  ,  1
1987-05-21  ,     72  ,  0  , 11  ,  9  ,  2  ,  0
1987-05-22  ,     57  ,  0  ,  6  ,  8  ,  0  ,  1
1987-05-23  ,     49  ,  0  ,  4  ,  2  ,  1  ,  0
1987-05-24  ,     57,   14  ,  2  ,  1  ,  0  ,  0
1987-05-25  ,     18  ,  0  ,  3  ,  0  ,  0  ,  0
1987-05-26  ,     20  ,  0  ,  3  ,  4  ,  0  ,  0
1987-05-27  ,     16  ,  0  ,  3  ,  0  ,  0  ,  0
1987-05-28  ,     15  ,  0  ,  0  ,  2  ,  0  ,  0
1987-05-29  ,     10  ,  0  ,  1  ,  0  ,  1  ,  0
1987-05-30  ,     13  ,  0  ,  0  ,  2  ,  0  ,  0
1987-05-31  ,      8  ,  0  ,  3  ,  1  ,  0  ,  0
1987-06-01  ,      2  ,  0  ,  1  ,  0  ,  0  ,  0
1987-06-02  ,     23  ,  0  ,  6  ,  0  ,  0  ,  0
1987-06-03  ,     20  ,  0  ,  2  ,  0  ,  0  ,  0
1987-06-04  ,     10  ,  0  ,  4  ,  1  ,  0  ,  0
1987-06-05  ,     10  ,  3  ,  1  ,  0  ,  0  ,  0
1987-06-06  ,      5  ,  0  ,  2  ,  0  ,  0  ,  1
1987-06-07  ,      2  ,  0  ,  0  ,  0  ,  0  ,  0
1987-06-08  ,      2  ,  0  ,  1  ,  0  ,  0  ,  0  ")

print(demo.data)
#>          Date  n1 X0 X1 X2 X3 X4
#> 1  1987-04-26   8  0  0  0  0  2
#> 2  1987-04-27   5  0  0  0  0  0
#> 3  1987-04-28   6  0  0  0  0  0
#> 4  1987-04-29  17  0  0  2  1  1
#> 5  1987-04-30  66  0  1  0  2  3
#> 6  1987-05-01 193  0  1  7  7  2
#> 7  1987-05-02  90  0  2  0  0  0
#> 8  1987-05-03 260  0  0 14  6  1
#> 9  1987-05-04 368  0  9 46  4  2
#> 10 1987-05-05 506  0 38 33 11  0
#> 11 1987-05-06 317  1 27 26  3  1
#> 12 1987-05-07  43  0  4  3  0  2
#> 13 1987-05-08 259  1 42  5  2  0
#> 14 1987-05-09 259  1 32 27  1  0
#> 15 1987-05-10 249  1 85  3  1  0
#> 16 1987-05-11 250  3 21 19  2  0
#> 17 1987-05-12 298 42 16 11  9  1
#> 18 1987-05-13 250  1  7 25  6  4
#> 19 1987-05-14 193  0  9 18  8  0
#> 20 1987-05-15 207  0 17 21  2  0
#> 21 1987-05-16 175  0 18 10  1  0
#> 22 1987-05-17 141  0 12 14  7  1
#> 23 1987-05-18 155  0  1 19 13  6
#> 24 1987-05-19 123  0  5 22  5  0
#> 25 1987-05-20 128  0  6 17  2  1
#> 26 1987-05-21  72  0 11  9  2  0
#> 27 1987-05-22  57  0  6  8  0  1
#> 28 1987-05-23  49  0  4  2  1  0
#> 29 1987-05-24  57 14  2  1  0  0
#> 30 1987-05-25  18  0  3  0  0  0
#> 31 1987-05-26  20  0  3  4  0  0
#> 32 1987-05-27  16  0  3  0  0  0
#> 33 1987-05-28  15  0  0  2  0  0
#> 34 1987-05-29  10  0  1  0  1  0
#> 35 1987-05-30  13  0  0  2  0  0
#> 36 1987-05-31   8  0  3  1  0  0
#> 37 1987-06-01   2  0  1  0  0  0
#> 38 1987-06-02  23  0  6  0  0  0
#> 39 1987-06-03  20  0  2  0  0  0
#> 40 1987-06-04  10  0  4  1  0  0
#> 41 1987-06-05  10  3  1  0  0  0
#> 42 1987-06-06   5  0  2  0  0  1
#> 43 1987-06-07   2  0  0  0  0  0
#> 44 1987-06-08   2  0  1  0  0  0
demo.data$Date <- as.Date(demo.data$Date, "%Y-%m-%d")
demo.data$jday <- as.numeric(format(demo.data$Date, "%j"))

Here the strata are days (rather than weeks). There are 44 release strata, and tagged fish are recovered in the stratum of release plus another 6 strata.

In the first release stratum, a total of 8 fish were tagged and released. No recoveries occurred until 4 days later.

Because the recoveries take place in more strata than releases, the $$u2$$ vector is read in separately. Note that is must be sufficiently long to account for the number of releases plus potential movement, for a length of 48:

demo.data.u2 <- c(0  ,    2  ,    1  ,  2 ,   39 , 226  , 75 , 129 , 120 , 380 ,
921, 1005  , 1181 ,1087 , 1108 ,1685  ,671 ,1766 , 636 , 483 ,
170,  269  ,  212 , 260 ,  154 , 145  , 99 ,  58 ,  74 ,  40 ,
50,   59  ,   40 ,   9 ,   14 ,  13  , 22 ,  24 ,  33 ,  19 ,
12,    7  ,    4 ,   0 ,    0 ,  59  ,  0 ,   0 )

We also separate out the recoveries $$m2$$ into a matrix

demo.data.m2 <- as.matrix(demo.data[,c("X0","X1","X2","X3","X4")])

3.2 Preliminary screening of the data

A pooled-Petersen estimator would add all of the marked, recaptured and unmarked fish to give an estimate of 73,385.

Let us look at the data in more detail:

• In many of the release strata, there are small number of fish tagged and released.
• Most the recoveries occur soon after release, but there is longer right tail.
• It is difficult to estimate capture efficiency directly because of the smearing of releases over multiple recovery strata.

Let us look at the pattern of unmarked fish captured at the second trap:

There appears to be a gradual rise and fall in the number of unmarked fish with no sudden jumps. Something funny appears to have happened around julian day 160 – I suspect that some pooling of data has occurred here.

BTSPAS provides two fitting routine:

• Non-diagonal but the movement of fish after tagging follows a log-normal distribution following release as explained in Dempson and Schwarz (2009) and extended by Bonner and Schwarz (2011).
• Non-diagonal but a non-parametric movement distribution is assumes as explained in Bonner and Schwarz (2011).

4 Fitting the BTSPAS non-diagonal model with a log-normal movement distribution.

Bonner and Schwarz (2011) developed a model with the following features.

• Log-normal distribution of the distribution of times between release and availability at the second trap.
• A spline is used to smooth the total number of unmarked fish presenting themselves at the second trap over the strata
• A hierarchical model for the capture-probabilities is assumed where individual stratum capture probabilities are assumed to vary around a common mean.

The model also allows the user to use covariates to explain some of the variation in the capture probabilities in much the same way as the diagonal case.

The $$BTSPAS$$ package also has additional features and options:

• if $$u2$$ is missing for any stratum, the program will use the spline to interpolate the number of unmarked fish in the population for the missing stratum.
• if $$n1$$ and the entire corresponding row of $$m2$$ are 0, the program will use the hierarchical model to interpolate the capture probabilities for the missing strata because there is no information about recapture probabilities when $$n1=0$$.
• the program allows you specify break points in the underlying spline to account for external events.
• sometimes bad thing happen. The vector $$bad.m2$$ indicates which julian weeks something went wrong. In the above example, the number of recoveries in julian week 41 is far below expectations and leads to impossible Petersen estimate for julian week 41. Similarly, the vector $$bad.u2$$ indicates which julian weeks, the number of unmarked fish is suspect. In both cases, the suspect values of $$n1$$ and $$m2$$ are set to 0 and the suspect values of $$u2$$ are set to missing. Alternatively, the user can set the suspect $$n1$$ and $$m2$$ values to 0,
and the suspect $$u2$$ values to missing in the data input directly. I arbitrarily chose the third julian week to demonstrate this feature.

The $$BTSPAS$$ function also allows you specify

• The prefix is used to identify the output files for this run.
• The title is used to title the output.

We already read in the data above. Here we set the rest of the parameters. Don’t forget to set the working directory as appropriate

library("BTSPAS")

# After which weeks is the spline allowed to jump?
demo.jump.after <- c()  # julian weeks after which jump occurs

# Which julian weeks have "bad" recapture values. These will be set to 0 or missing prior to the model fitting.

# The prefix for the output files:
demo.prefix <- "demo-1987-Conne River-TSP NDE"

# Title for the analysis
demo.title <- "Conne River 1987 Atlantic Salmon Smolts - Log-normal"

cat("*** Starting ",demo.title, "\n\n")
#> *** Starting  Conne River 1987 Atlantic Salmon Smolts - Log-normal

# Make the call to fit the model and generate the output files
demo.fit <- TimeStratPetersenNonDiagError_fit(
title=      demo.title,
prefix=     demo.prefix,
time=       min(demo.data$jday):(min(demo.data$jday)+length(demo.data.u2)-1),
n1=         demo.data$n1, m2= demo.data.m2, u2= demo.data.u2, jump.after= demo.jump.after, bad.n1= demo.bad.n1, bad.m2= demo.bad.m2, bad.u2= demo.bad.u2, debug=TRUE, # save time by reducing number of MCMC iterations save.output.to.files=FALSE) #> #> #> *** Start of call to JAGS #> Working directory: /Users/cschwarz/Dropbox/SPAS-Bayesian/BTSPAS/vignettes #> Initial seed for JAGS set to: 961488 #> Random number seed for chain 979208 #> Random number seed for chain 519731 #> Random number seed for chain 256037 #> Compiling model graph #> Resolving undeclared variables #> Allocating nodes #> Graph information: #> Observed stochastic nodes: 92 #> Unobserved stochastic nodes: 208 #> Total graph size: 11429 #> #> Initializing model #> #> | | | 0% | |++ | 4% | |++++ | 8% | |++++++ | 12% | |++++++++ | 16% | |++++++++++ | 20% | |++++++++++++ | 24% | |++++++++++++++ | 28% | |++++++++++++++++ | 32% | |++++++++++++++++++ | 36% | |++++++++++++++++++++ | 40% | |++++++++++++++++++++++ | 44% | |++++++++++++++++++++++++ | 48% | |++++++++++++++++++++++++++ | 52% | |++++++++++++++++++++++++++++ | 56% | |++++++++++++++++++++++++++++++ | 60% | |++++++++++++++++++++++++++++++++ | 64% | |++++++++++++++++++++++++++++++++++ | 68% | |++++++++++++++++++++++++++++++++++++ | 72% | |++++++++++++++++++++++++++++++++++++++ | 76% | |++++++++++++++++++++++++++++++++++++++++ | 80% | |++++++++++++++++++++++++++++++++++++++++++ | 84% | |++++++++++++++++++++++++++++++++++++++++++++ | 88% | |++++++++++++++++++++++++++++++++++++++++++++++ | 92% | |++++++++++++++++++++++++++++++++++++++++++++++++ | 96% | |++++++++++++++++++++++++++++++++++++++++++++++++++| 100% #> | | | 0% | |** | 4% | |**** | 8% | |****** | 12% | |******** | 16% | |********** | 20% | |************ | 24% | |************** | 28% | |**************** | 32% | |****************** | 36% | |******************** | 40% | |********************** | 44% | |************************ | 48% | |************************** | 52% | |**************************** | 56% | |****************************** | 60% | |******************************** | 64% | |********************************** | 68% | |************************************ | 72% | |************************************** | 76% | |**************************************** | 80% | |****************************************** | 84% | |******************************************** | 88% | |********************************************** | 92% | |************************************************ | 96% | |**************************************************| 100% #> #> #> *** Finished JAGS *** The final parameter (save.output.to.files) can be set to automatically to save plots and reports in files with the appropriate prefix in the working directory. 4.1 The output from the fit The final object has many components names(demo.fit) #> [1] "n.chains" "n.iter" "n.burnin" #> [4] "n.thin" "n.keep" "n.sims" #> [7] "sims.array" "sims.list" "sims.matrix" #> [10] "summary" "mean" "sd" #> [13] "median" "root.short" "long.short" #> [16] "dimension.short" "indexes.short" "last.values" #> [19] "program" "model.file" "isDIC" #> [22] "DICbyR" "pD" "DIC" #> [25] "model" "parameters.to.save" "plots" #> [28] "runTime" "report" "data" The plots sub-object contains many plots: names(demo.fit$plots)
#> [1] "plot.init"         "fit.plot"          "logitP.plot"
#> [4] "acf.Utot.plot"     "post.UNtot.plot"   "musdLogTTt"
#> [7] "gof.plot"          "trace.logitP.plot" "trace.logU.plot"

In particular, it contains plots of the initial spline fit (init.plot), the final fitted spline (fit.plot), the estimated capture probabilities (on the logit scale) (logitP.plot), plots of the distribution of the posterior sample for the total unmarked and marked fish (post.UNtot.plot) and model diagnostic plots (goodness of fit (gof.plot), trace (trace…plot), and autocorrelation plots (act.Utot.plot).

These plots are all created using the $$ggplot2$$ packages, so the user can modify the plot (e.g. change titles etc).

The $$BTSPAS$$ program also creates a report, which includes information about the data used in the fitting, the pooled- and stratified-Petersen estimates, a test for pooling, and summaries of the posterior. Only the first few lines are shown below:

head(demo.fit$report) #> [1] "Time Stratified Petersen with Non-Diagonal recaptures, error in smoothed U, and log-normal distribution for travel time - Sun Oct 24 15:46:59 2021" #> [2] "Version: 2021-11-02" #> [3] "" #> [4] " Conne River 1987 Atlantic Salmon Smolts - Log-normal Results " #> [5] "" #> [6] "*** Raw data *** (padded to match length of u2) " Here is the fitted spline curve to the number of unmarked fish available in each recovery sample demo.fit$plots$fit.plot The distribution of the posterior sample for the total number unmarked and total abundance is available: demo.fit$plots$post.UNtot.plot A plot of the $$logit(P)$$ is demo.fit$plots$logitP.plot In cases where there is little information, $$BTSPAS$$ has shared information based on the distribution of catchability in the other strata. A summary of the posterior for each parameter is also available. In particular, here are the summary statistics on the posterior sample for the total number unmarked and total abundance: demo.fit$summary[ row.names(demo.fit$summary) %in% c("Ntot","Utot"),] #> mean sd 2.5% 25% 50% 75% 97.5% Rhat n.eff #> Ntot 75719.74 2982.231 70300.37 73627.25 75617.5 77619.5 81873.02 1.002045 1500 #> Utot 70749.74 2982.231 65330.37 68657.25 70647.5 72649.5 76903.02 1.002028 1500 This also includes the Rubin-Brooks-Gelman statistic ($$Rhat$$) on mixing of the chains and the effective sample size of the posterior (after accounting for autocorrelation). The estimated total abundance from $$BTSPAS$$ is 75,720 (SD 2,982 ) fish. The model has a “base” log-normal distribution for the travel times between the release and recovery strata. The summary of the posterior distribution of its parameters for the log(travel time) are: round(demo.fit$summary[ row.names(demo.fit$summary) %in% c("baseMu","baseSd"),],3) #> mean sd 2.5% 25% 50% 75% 97.5% Rhat n.eff #> baseMu 0.684 0.052 0.584 0.651 0.683 0.718 0.787 1.001 1500 #> baseSd 0.281 0.020 0.242 0.267 0.280 0.294 0.321 1.000 1500 Each release stratum is allowed to have a travel time distribution that differ from this base travel time distribution, by allowing the individual release stratum parameters to be sampled from a distribution around the above vales. Posterior samples represent the mean and standard deviation of the log(travel time) between the release and recovery strata. Here are the results for the first 5 strata: round(demo.fit$summary[ grepl("muLogTT", row.names(demo.fit$summary)),][1:5,],3) #> mean sd 2.5% 25% 50% 75% 97.5% Rhat n.eff #> muLogTT[1] 1.201 0.205 0.774 1.072 1.216 1.340 1.565 1.004 880 #> muLogTT[2] 0.682 0.303 0.087 0.484 0.676 0.887 1.277 1.003 620 #> muLogTT[3] 0.618 0.307 0.060 0.411 0.596 0.808 1.269 1.000 1500 #> muLogTT[4] 1.116 0.151 0.787 1.030 1.131 1.220 1.379 1.003 1500 #> muLogTT[5] 1.213 0.106 1.001 1.145 1.211 1.283 1.418 1.001 1400 round(demo.fit$summary[ grepl("sdLogTT", row.names(demo.fit$summary)),][1:5,],3) #> mean sd 2.5% 25% 50% 75% 97.5% Rhat n.eff #> sdLogTT[1] 0.300 0.100 0.143 0.231 0.288 0.352 0.532 1.000 1500 #> sdLogTT[2] 0.298 0.099 0.148 0.228 0.283 0.346 0.545 1.000 1500 #> sdLogTT[3] 0.290 0.092 0.151 0.223 0.278 0.346 0.509 1.001 1500 #> sdLogTT[4] 0.292 0.080 0.169 0.236 0.279 0.339 0.478 1.001 1500 #> sdLogTT[5] 0.292 0.059 0.202 0.250 0.284 0.323 0.442 1.000 1500 It is also possible to see the probability of moving from release stratum $$i$$ to recovery stratum $$j$$ by looking at the $$Theta[i,j]$$ values. Here are the transition probabilities for the first release stratum: round(demo.fit$summary[ grepl("Theta[1,", row.names(demo.fit$summary),fixed=TRUE),][1:10,],3) #> mean sd 2.5% 25% 50% 75% 97.5% Rhat n.eff #> Theta[1,1] 0.005 0.021 0.000 0.000 0.000 0.001 0.056 1.004 1500 #> Theta[1,2] 0.084 0.109 0.000 0.004 0.034 0.127 0.371 1.002 1500 #> Theta[1,3] 0.268 0.143 0.009 0.154 0.281 0.378 0.518 1.001 1500 #> Theta[1,4] 0.330 0.117 0.118 0.252 0.325 0.402 0.575 1.004 1500 #> Theta[1,5] 0.196 0.111 0.032 0.110 0.179 0.265 0.435 1.003 910 #> Theta[1,6] 0.075 0.064 0.004 0.027 0.058 0.103 0.236 1.001 1500 #> Theta[1,7] 0.026 0.031 0.000 0.005 0.016 0.035 0.114 1.004 1500 #> Theta[1,8] 0.010 0.015 0.000 0.001 0.004 0.012 0.054 1.008 1500 #> Theta[1,9] 0.004 0.008 0.000 0.000 0.001 0.004 0.027 1.011 1500 #> Theta[1,10] 0.002 0.004 0.000 0.000 0.000 0.001 0.013 1.004 1500 The probabilities should sum to 1 for each release group. Samples from the posterior are also included in the sims.matrix, sims.array and sims.list elements of the results object. It is always important to do model assessment before accepting the results from the model fit. Please contact me for details on how to interpret the goodness of fit, trace, and autocorrelation plots. 5 Fitting the BTSPAS non-diagonal model with a non-parametric movement distribution. Bonner and Schwarz (2011) developed a model with the following features. • Non-parametric distribution of the distribution of times between release and availability at the second trap. • A spline is used to smooth the total number of unmarked fish presenting themselves at the second trap over the strata • A hierarchical model for the capture-probabilities is assumed where individual stratum capture probabilities are assumed to vary around a common mean. The model also allows the user to use covariates to explain some of the variation in the capture probabilities in much the same way as the diagonal case. The $$BTSPAS$$ package also has additional features and options: • if $$u2$$ is missing for any stratum, the program will use the spline to interpolate for the number of unmarked fish in the population missing stratum. • if $$n1$$ and/or the entire corresponding row of $$m2$$ are 0, the program will use the hierarchical model to interpolate the capture probabilities for the missing strata. This is often useful when releases did not take place in a stratum (e.g. trap not running) or something went wrong in that stratum of release. • the program allows you specify break points in the underlying spline to account for external events. • sometimes bad thing happen. The vector $$bad.m2$$ indicates which julian weeks something went wrong. In the above example, the number of recoveries in julian week 41 is far below expectations and leads to impossible Petersen estimate for julian week 41. Similarly, the vector $$bad.u2$$ indicates which julian weeks, the number of unmarked fish is suspect. In both cases, the suspect values of $$n1$$ and $$m2$$ are set to 0 and the suspect values in $$u2$$ are set to missing. Alternatively, the user can set the $$n1$$ and $$m2$$ to zero and $$u2$$ values to missing in the data input directly. I arbitrarily chose the third julian week to demonstrate this feature. The $$BTSPAS$$ function also allows you specify • The prefix is used to identify the output files for this run. • The title is used to title the output. • Various parameters to control the Bayesian MCMC phase of model fitting. Please contact us for help in setting these if problem arise. We already read in the data above. Here we set the rest of the parameters. Don’t forget to set the working directory as appropriate library("BTSPAS") # After which weeks is the spline allowed to jump? demo2.jump.after <- c() # julian weeks after which jump occurs # Which julian weeks have "bad" recapture values. These will be set to 0 or missing prior to the model fit. demo2.bad.m2 <- c() # list julian weeks with bad m2 values. This is used in the Trinity Example demo2.bad.u2 <- c() # list julian weeks with bad u2 values. [This was arbitrary to demostrate the feature.] demo2.bad.n1 <- c() # list julian weeks with bad n1 values. [This was arbitrary to demonstrate the feature.] # The prefix for the output files: demo2.prefix <- "demo2-1987-Conne River-TSP NDE NP" # Title for the analysis demo2.title <- "Conne River 1987 Atlantic Salmon Smolts - Non-parametric" cat("*** Starting ",demo2.title, "\n\n") #> *** Starting Conne River 1987 Atlantic Salmon Smolts - Non-parametric # Make the call to fit the model and generate the output files demo2.fit <- TimeStratPetersenNonDiagErrorNP_fit( # notice change in function name title= demo2.title, prefix= demo2.prefix, time= min(demo.data$jday):(min(demo.data$jday)+length(demo.data.u2)-1), n1= demo.data$n1,
m2=         demo.data.m2,
u2=         demo.data.u2,
jump.after= demo2.jump.after,
debug=TRUE,             # save time by reducing number of MCMC iterations
save.output.to.files=FALSE)
#>
#>
#> *** Start of call to JAGS
#> Working directory:  /Users/cschwarz/Dropbox/SPAS-Bayesian/BTSPAS/vignettes
#> Initial seed for JAGS set to: 498821
#> Random number seed for chain  547249
#> Random number seed for chain  750762
#> Random number seed for chain  589584
#> Compiling model graph
#>    Resolving undeclared variables
#>    Allocating nodes
#> Graph information:
#>    Observed stochastic nodes: 92
#>    Unobserved stochastic nodes: 297
#>    Total graph size: 3151
#>
#> Initializing model
#>
#>
|
|                                                  |   0%
|
|++                                                |   4%
|
|++++                                              |   8%
|
|++++++                                            |  12%
|
|++++++++                                          |  16%
|
|++++++++++                                        |  20%
|
|++++++++++++                                      |  24%
|
|++++++++++++++                                    |  28%
|
|++++++++++++++++                                  |  32%
|
|++++++++++++++++++                                |  36%
|
|++++++++++++++++++++                              |  40%
|
|++++++++++++++++++++++                            |  44%
|
|++++++++++++++++++++++++                          |  48%
|
|++++++++++++++++++++++++++                        |  52%
|
|++++++++++++++++++++++++++++                      |  56%
|
|++++++++++++++++++++++++++++++                    |  60%
|
|++++++++++++++++++++++++++++++++                  |  64%
|
|++++++++++++++++++++++++++++++++++                |  68%
|
|++++++++++++++++++++++++++++++++++++              |  72%
|
|++++++++++++++++++++++++++++++++++++++            |  76%
|
|++++++++++++++++++++++++++++++++++++++++          |  80%
|
|++++++++++++++++++++++++++++++++++++++++++        |  84%
|
|++++++++++++++++++++++++++++++++++++++++++++      |  88%
|
|++++++++++++++++++++++++++++++++++++++++++++++    |  92%
|
|++++++++++++++++++++++++++++++++++++++++++++++++  |  96%
|
|++++++++++++++++++++++++++++++++++++++++++++++++++| 100%
#>
|
|                                                  |   0%
|
|**                                                |   4%
|
|****                                              |   8%
|
|******                                            |  12%
|
|********                                          |  16%
|
|**********                                        |  20%
|
|************                                      |  24%
|
|**************                                    |  28%
|
|****************                                  |  32%
|
|******************                                |  36%
|
|********************                              |  40%
|
|**********************                            |  44%
|
|************************                          |  48%
|
|**************************                        |  52%
|
|****************************                      |  56%
|
|******************************                    |  60%
|
|********************************                  |  64%
|
|**********************************                |  68%
|
|************************************              |  72%
|
|**************************************            |  76%
|
|****************************************          |  80%
|
|******************************************        |  84%
|
|********************************************      |  88%
|
|**********************************************    |  92%
|
|************************************************  |  96%
|
|**************************************************| 100%
#>
#>
#> *** Finished JAGS ***

The final parameter (save.output.to.files) can be set to automatically to save plots and reports in files with the appropriate prefix in the working directory.

5.1 The output from the fit

Here is the fitted spline curve to the number of unmarked fish available in each recovery sample

demo2.fit$plots$fit.plot

The distribution of the posterior sample for the total number unmarked and total abundance is available:

demo2.fit$plots$post.UNtot.plot

A plot of the $$logit(P)$$ is

demo2.fit$plots$logitP.plot

In cases where there is little information, $$BTSPAS$$ has shared information based on the distribution of catchability in the other strata.

A summary of the posterior for each parameter is also available. In particular, here are the summary statistics on the posterior sample for the total number unmarked and total abundance:

demo2.fit$summary[ row.names(demo2.fit$summary) %in% c("Ntot","Utot"),]
#>         mean       sd     2.5%      25%     50%      75%    97.5%     Rhat n.eff
#> Ntot 75206.6 3378.243 69084.17 72874.25 75089.5 77285.25 82345.89 1.002336   860
#> Utot 70231.6 3378.243 64109.17 67899.25 70114.5 72310.25 77370.89 1.002343   850

This also includes the Rubin-Brooks-Gelman statistic ($$Rhat$$) on mixing of the chains and the effective sample size of the posterior (after accounting for autocorrelation).

The estimated total abundance from $$BTSPAS$$ is 75,207 (SD 3,378 ) fish.

The estimated distribution function is allowed by vary by release stratum around a common “mean” distribution.

probs <- demo2.fit$summary[grepl("movep", row.names(demo2.fit$summary)),  ]
round(probs,3)
#>           mean    sd  2.5%   25%   50%   75% 97.5%  Rhat n.eff
#> movep[1] 0.020 0.008 0.009 0.015 0.019 0.024 0.038 1.002  1500
#> movep[2] 0.450 0.069 0.315 0.404 0.448 0.495 0.583 1.000  1500
#> movep[3] 0.398 0.061 0.270 0.359 0.398 0.439 0.519 1.000  1500
#> movep[4] 0.104 0.030 0.056 0.083 0.101 0.122 0.169 1.001  1500
#> movep[5] 0.028 0.012 0.011 0.019 0.026 0.035 0.057 1.000  1500

So we expect that about 2% of fish will migrate to the second trap in the day of release; about 45% of fish will migrate to the second trap in the second day after release etc.

The movement for each release stratum varies around this base distribution. It is also possible to see the probability of moving from release stratum $$i$$ to recovery stratum $$j$$ by looking at the $$Theta[i,j]$$ values. Here are the transition probabilities for the first release stratum:

round(demo2.fit$summary[ grepl("Theta[1,", row.names(demo2.fit$summary),fixed=TRUE),],3)
#>             mean    sd  2.5%   25%   50%   75% 97.5%  Rhat n.eff
#> Theta[1,1] 0.036 0.060 0.001 0.006 0.016 0.042 0.198 1.001  1500
#> Theta[1,2] 0.230 0.173 0.020 0.095 0.187 0.328 0.666 1.000  1500
#> Theta[1,3] 0.291 0.187 0.029 0.139 0.258 0.418 0.712 1.001  1500
#> Theta[1,4] 0.183 0.139 0.015 0.076 0.151 0.254 0.540 1.000  1500
#> Theta[1,5] 0.259 0.160 0.037 0.133 0.231 0.368 0.629 1.005  1500

The probabilities should also sum to 1 for each release group.

It is always important to do model assessment before accepting the results from the model fit. Please contact me for details on how to interpret the goodness of fit, trace, and autocorrelation plots.

6 Prior information on the movement probabilities.

It is possible to impose prior information on the movement probabilities in both cases. This would be useful in cases with VERY sparse data!

In the non-parametric case, specify a vector that gives the relative weight of belief of movement. These are similar to a Dirchelet-type prior where the values representing belief in the distribution of travel times. For example, $$prior.muTT=c(1,4,3,2)$$ represents a system where the maximum travel time is 3 strata after release with $$1/10=.1$$ of the animals moving in the stratum of release $$4/10=.4$$ of the animals taking 1 stratum to move etc So if $$prior.muTT=c(10,40,30,20)$$, this represent the same movement pattern but a strong degree of belief because all of the numbers are larger. AN intuitive explanation is that the $$sum(prior.muTT)$$ represents the number of animals observed to make this travel time distribution.

Here we will fit a fairly strong prior on the movement probabilities:

demo3.prior.muTT=c(10,50,30,5,5)

where the probability of movement in the stratum of release and subsequent strata is

round(demo3.prior.muTT/sum(demo3.prior.muTT),2)
#> [1] 0.10 0.50 0.30 0.05 0.05

We already read in the data above. Here we set the rest of the parameters. Don’t forget to set the working directory as appropriate

library("BTSPAS")

# After which weeks is the spline allowed to jump?
demo3.jump.after <- c()  # julian weeks after which jump occurs

# Which julian weeks have "bad" recapture values. These will be set to 0 or missing as needed.
demo3.bad.m2     <- c()   # list julian weeks with bad m2 values. This is used in the Trinity Example
demo3.bad.u2     <- c()   # list julian weeks with bad u2 values. [This was arbitrary to demonstrate the feature.]
demo3.bad.n1     <- c()   # list julian weeks with bad n1 values. [This was arbitrary to demonstrate the feature.]

# The prefix for the output files:
demo3.prefix <- "demo3-1987-Conne River-TSP NDE NP- prior"

# Title for the analysis
demo3.title <- "Conne River 1987 Atlantic Salmon Smolts - Non-parametric - Strong Prior"

cat("*** Starting ",demo3.title, "\n\n")
#> *** Starting  Conne River 1987 Atlantic Salmon Smolts - Non-parametric - Strong Prior

# Make the call to fit the model and generate the output files
demo3.fit <- TimeStratPetersenNonDiagErrorNP_fit(  # notice change in function name
title=      demo3.title,
prefix=     demo3.prefix,
time=       min(demo.data$jday):(min(demo.data$jday)+length(demo.data.u2)-1),
n1=         demo.data$n1, m2= demo.data.m2, u2= demo.data.u2, prior.muTT= demo3.prior.muTT, # prior on moements jump.after= demo3.jump.after, bad.n1= demo3.bad.n1, bad.m2= demo3.bad.m2, bad.u2= demo3.bad.u2, debug=TRUE, # save time by reducing number of MCMC iterations save.output.to.files=FALSE) #> #> #> *** Start of call to JAGS #> Working directory: /Users/cschwarz/Dropbox/SPAS-Bayesian/BTSPAS/vignettes #> Initial seed for JAGS set to: 370989 #> Random number seed for chain 150406 #> Random number seed for chain 649586 #> Random number seed for chain 363192 #> Compiling model graph #> Resolving undeclared variables #> Allocating nodes #> Graph information: #> Observed stochastic nodes: 92 #> Unobserved stochastic nodes: 297 #> Total graph size: 3151 #> #> Initializing model #> #> | | | 0% | |++ | 4% | |++++ | 8% | |++++++ | 12% | |++++++++ | 16% | |++++++++++ | 20% | |++++++++++++ | 24% | |++++++++++++++ | 28% | |++++++++++++++++ | 32% | |++++++++++++++++++ | 36% | |++++++++++++++++++++ | 40% | |++++++++++++++++++++++ | 44% | |++++++++++++++++++++++++ | 48% | |++++++++++++++++++++++++++ | 52% | |++++++++++++++++++++++++++++ | 56% | |++++++++++++++++++++++++++++++ | 60% | |++++++++++++++++++++++++++++++++ | 64% | |++++++++++++++++++++++++++++++++++ | 68% | |++++++++++++++++++++++++++++++++++++ | 72% | |++++++++++++++++++++++++++++++++++++++ | 76% | |++++++++++++++++++++++++++++++++++++++++ | 80% | |++++++++++++++++++++++++++++++++++++++++++ | 84% | |++++++++++++++++++++++++++++++++++++++++++++ | 88% | |++++++++++++++++++++++++++++++++++++++++++++++ | 92% | |++++++++++++++++++++++++++++++++++++++++++++++++ | 96% | |++++++++++++++++++++++++++++++++++++++++++++++++++| 100% #> | | | 0% | |** | 4% | |**** | 8% | |****** | 12% | |******** | 16% | |********** | 20% | |************ | 24% | |************** | 28% | |**************** | 32% | |****************** | 36% | |******************** | 40% | |********************** | 44% | |************************ | 48% | |************************** | 52% | |**************************** | 56% | |****************************** | 60% | |******************************** | 64% | |********************************** | 68% | |************************************ | 72% | |************************************** | 76% | |**************************************** | 80% | |****************************************** | 84% | |******************************************** | 88% | |********************************************** | 92% | |************************************************ | 96% | |**************************************************| 100% #> #> #> *** Finished JAGS *** The final parameter (save.output.to.files) can be set to automatically to save plots and reports in files with the appropriate prefix in the working directory. 6.1 The output from the fit Here is the fitted spline curve to the number of unmarked fish available in each recovery sample demo3.fit$plots$fit.plot The distribution of the posterior sample for the total number unmarked and total abundance is available: demo3.fit$plots$post.UNtot.plot A plot of the $$logit(P)$$ is demo3.fit$plots$logitP.plot In cases where there is little information, $$BTSPAS$$ has shared information based on the distribution of catchability in the other strata. A summary of the posterior for each parameter is also available. In particular, here are the summary statistics on the posterior sample for the total number unmarked and total abundance: demo3.fit$summary[ row.names(demo3.fit$summary) %in% c("Ntot","Utot"),] #> mean sd 2.5% 25% 50% 75% 97.5% Rhat n.eff #> Ntot 75620.95 3320.019 69371.25 73341.5 75421.5 77868.25 82213.2 1.000847 1500 #> Utot 70645.95 3320.019 64396.25 68366.5 70446.5 72893.25 77238.2 1.000844 1500 This also includes the Rubin-Brooks-Gelman statistic ($$Rhat$$) on mixing of the chains and the effective sample size of the posterior (after accounting for autocorrelation). The estimated total abundance from $$BTSPAS$$ is 75,621 (SD 3,320 ) fish. The estimated distribution function is allowed by vary by release stratum around a common “mean” distribution. probs <- demo3.fit$summary[grepl("movep", row.names(demo3.fit$summary)), ] round(probs,3) #> mean sd 2.5% 25% 50% 75% 97.5% Rhat n.eff #> movep[1] 0.043 0.010 0.027 0.036 0.042 0.049 0.065 1.003 710 #> movep[2] 0.495 0.040 0.418 0.467 0.493 0.521 0.576 1.001 1400 #> movep[3] 0.351 0.036 0.277 0.327 0.350 0.374 0.424 1.002 1100 #> movep[4] 0.082 0.018 0.051 0.070 0.080 0.094 0.123 1.000 1500 #> movep[5] 0.029 0.009 0.015 0.023 0.028 0.035 0.051 1.003 1500 So we expect that about 4% of fish will migrate to the second trap in the day of release; about 49% of fish will migrate to the second trap in the second day after release etc. The movement for each release stratum varies around this base distribution. It is also possible to see the probability of moving from release stratum $$i$$ to recovery stratum $$j$$ by looking at the $$Theta[i,j]$$ values. Here are the transition probabilities for the first release stratum: round(demo3.fit$summary[ grepl("Theta[1,", row.names(demo3.fit\$summary),fixed=TRUE),],3)
#>             mean    sd  2.5%   25%   50%   75% 97.5%  Rhat n.eff
#> Theta[1,1] 0.061 0.080 0.002 0.013 0.032 0.076 0.288 1.001  1500
#> Theta[1,2] 0.255 0.179 0.025 0.111 0.216 0.364 0.677 1.000  1500
#> Theta[1,3] 0.279 0.177 0.034 0.140 0.243 0.391 0.689 1.001  1500
#> Theta[1,4] 0.157 0.115 0.013 0.066 0.131 0.223 0.440 1.003  1300
#> Theta[1,5] 0.248 0.154 0.033 0.127 0.220 0.339 0.600 1.001  1400

The probabilities should also sum to 1 for each release group.

It is always important to do model assessment before accepting the results from the model fit. Please contact me for details on how to interpret the goodness of fit, trace, and autocorrelation plots.

7 References

Bonner, S. J., & Schwarz, C. J. (2011). Smoothing population size estimates for Time-Stratified Mark–Recapture experiments Using Bayesian P-Splines. Biometrics, 67, 1498–1507. https://doi.org/10.1111/j.1541-0420.2011.01599.x

Schwarz, C. J., & Dempson, J. B. (1994). Mark-recapture estimation of a salmon smolt population. Biometrics, 50, 98–108.