`runner`: an R package for running operations

The package contains standard running (also known as rolling) functions with additional options such as varying window size, lagging, handling of missing values, and windows that depend on dates. `runner` also provides rolling streak and rolling which functions, which go beyond the range of functions already implemented in other R packages. The package can be used to manipulate and aggregate time series or longitudinal data.

Install the package from GitHub or from CRAN.

The `runner` package provides functions applied on running windows. The most universal function is `runner::runner`, which lets the user apply any R function `f` in a running window. The example below calculates a 4-month correlation lagged by 1 month.

```
library(runner)
x <- data.frame(
  date = seq.Date(Sys.Date(), Sys.Date() + 365, length.out = 20),
  a = rnorm(20),
  b = rnorm(20)
)
runner(
  x,
  lag = "1 months",
  k = "4 months",
  idx = x$date,
  f = function(x) {
    cor(x$a, x$b)
  }
)
```

There are different kinds of running windows, and all of them are implemented in `runner`.

The following diagram illustrates what running windows are, in this case running windows of length `k = 4`. For each of the 15 elements of a vector, the window contains the current element and the 3 preceding ones.

`k` denotes the number of elements in a window. If `k` is a single value, the window size is constant for all elements of `x`. For a varying window size, specify `k` as an integer vector of `length(k) == length(x)`, where each element of `k` defines the window length for the corresponding element of `x`. If `k` is empty, the window is cumulative (like `base::cumsum`). The example below illustrates a window of `k = 4` for the 10th element of vector `x`.
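The three ways of specifying `k` described above can be sketched as follows. This is a minimal illustration, assuming the package is installed; the variable names (`const_sum`, `k_vec`, `vary_sum`, `cum_sum`) are made up for this example.

```r
library(runner)

x <- 1:15

# Constant window: each window holds the current element and the 3 before it.
# The window for the 10th element is 7, 8, 9, 10.
const_sum <- runner(x, k = 4, f = sum)
const_sum[10]

# Varying window: one window length per element of x (here alternating 2 and 3).
k_vec <- rep(c(2L, 3L), length.out = length(x))
vary_sum <- runner(x, k = k_vec, f = sum)
vary_sum[1:5]

# No k supplied: the window is cumulative, comparable to base::cumsum().
cum_sum <- runner(x, f = sum)
cum_sum
```

Windows at the start of the vector are shorter than `k` because fewer elements are available there.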

`lag` denotes how many observations the windows are lagged by. If `lag` is a single value, then it is constant for all elements of `x`. For a varying lag, specify `lag` as an integer vector of `length(lag) == length(x)`, where each element of `lag` defines the lag of the corresponding window. The default is `lag = 0`. The example below illustrates a window of `k = 4` lagged by `lag = 2` for the 10th element of vector `x`. The lag can also be negative, which shifts the window forward instead of backward.
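To make the effect of `lag` visible, the sketch below prints the contents of the window for the 10th element instead of aggregating it. The helper `show_window` is a hypothetical name introduced only for this illustration.

```r
library(runner)

x <- 1:15

# Hypothetical helper: collapse the elements of a window into one string.
show_window <- function(w) paste(w, collapse = ",")

# Window of 4 elements lagged by 2: for the 10th element it ends at element 8.
lagged <- runner(x, k = 4, lag = 2, f = show_window)
lagged[10]

# Negative lag shifts the window forward: for the 10th element it ends at element 12.
forward <- runner(x, k = 4, lag = -2, f = show_window)
forward[10]
```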

Sometimes data points in a dataset are not equally spaced (missing weekends, holidays, or other gaps), so the window size should vary to keep the expected time frame. If the `idx` argument is specified, then running functions are applied on windows that depend on dates. `idx` should have the same length as `x` and be of class `Date` or `integer`. `idx` can be combined with a varying window size, in which case `k` denotes the number of periods in the window, which can differ for each data point. The example below illustrates a window of size `k = 5` lagged by `lag = 1`. The range of each window is given in parentheses.

```
idx <- Sys.Date() + c(4, 6, 7, 13, 17, 18, 18, 21, 27, 31, 37, 42, 44, 47, 48)
runner(
  x = 1:15,
  k = "5 days",
  lag = "1 days",
  idx = idx
)
```
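The same idea can be checked with a plain numeric index instead of dates, which makes the window contents easy to verify by hand. This is a sketch under that assumption; `show_window` and `windows` are illustrative names, and `x` is set equal to `idx` so that each window prints the index values it covers.

```r
library(runner)

idx <- c(4, 6, 7, 13, 17, 18, 18, 21, 27, 31, 37, 42, 44, 47, 48)

# Hypothetical helper: collapse the elements of a window into one string.
show_window <- function(w) paste(w, collapse = ",")

# k = 5 and lag = 1: the window for index value i covers the range [i - 5, i - 1].
windows <- runner(x = idx, k = 5, lag = 1, idx = idx, f = show_window)

windows[6]   # index 18, range [13, 17]
windows[8]   # index 21, range [16, 20]
windows[15]  # index 48, range [43, 47]
```

The number of elements per window differs because it depends on how many index values fall inside each range, not on a fixed element count.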

By default, `runner` returns a vector of the same length as `x`, unless a vector of any length is passed to the `at` argument. Each element of `at` is an index at which `runner` evaluates the function. The example below illustrates the output of `runner` for `at = c(18, 27, 48, 31)`, which gives windows in the ranges enclosed in square brackets. The range for `at = 27` is `[22, 26]`, which contains none of the available indices.

```
idx <- c(4, 6, 7, 13, 17, 18, 18, 21, 27, 31, 37, 42, 44, 47, 48)
runner(
  x = idx,
  k = 5,
  lag = 1,
  idx = idx,
  at = c(18, 27, 48, 31)
)
```

`NA` padding

Using `runner`, one can also specify `na_pad = TRUE`, which returns `NA` for any window that is partially out of range, meaning that there are not enough observations to fill the window. By default `na_pad = FALSE`, which means that incomplete windows are calculated anyway. `na_pad` applies both to regular cumulative windows and to windows depending on date. In the example below, two windows exceed the range given by `idx`, so these windows return `NA` with `na_pad = TRUE`. If one sets `na_pad = FALSE`, the first window will be empty (no element within `[-1, 3]`) and the last window will return the elements whose `idx` falls within its range.

```
idx <- c(4, 6, 7, 13, 17, 18, 18, 21, 27, 31, 37, 42, 44, 47, 48)
runner(
  x = idx,
  k = 5,
  lag = 1,
  idx = idx,
  at = c(4, 18, 48, 51),
  na_pad = TRUE
)
```
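The difference between the two settings can be seen by running the same call twice. The sketch below uses `f = mean` and only two `at` points, one whose window lies fully inside the index range and one whose window exceeds it; the names `padded` and `unpadded` are illustrative.

```r
library(runner)

idx <- c(4, 6, 7, 13, 17, 18, 18, 21, 27, 31, 37, 42, 44, 47, 48)

# at = 18 gives the range [13, 17], fully inside the range of idx.
# at = 51 gives the range [46, 50], which exceeds the largest idx (48).
at <- c(18, 51)

padded <- runner(
  x = idx, k = 5, lag = 1, idx = idx, at = at,
  f = mean, na_pad = TRUE
)

unpadded <- runner(
  x = idx, k = 5, lag = 1, idx = idx, at = at,
  f = mean, na_pad = FALSE
)

padded    # second window is NA because it is partially out of range
unpadded  # second window is computed from the available elements 47 and 48
```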

`data.frame`

One can also pass a `data.frame` to the `x` argument and apply functions that involve multiple columns. The example below calculates the beta parameter of an `lm` model on 1, 2, …, n observations respectively. The plot shows how the `lm` parameter adapts as the number of observations increases.

```
date <- Sys.Date() + cumsum(sample(1:3, 40, replace = TRUE)) # unequally spaced time series
x <- cumsum(rnorm(40))
y <- 30 * x + rnorm(40)
df <- data.frame(date, y, x)
slope <- runner(
  df,
  k = 10,
  idx = "date",
  f = function(x) {
    # x is the data.frame window; [2] extracts the slope coefficient
    coefficients(lm(y ~ x, data = x))[2]
  }
)
plot(slope)
abline(h = 30, col = "blue")
```

The `runner` function can also compute windows in parallel mode. The function doesn’t initialize the parallel cluster automatically; one has to create it beforehand and pass it to `runner` through the `cl` argument.

```
library(parallel)
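# makeForkCluster() relies on forking and is not available on Windows;
# df is the data.frame created in the previous example
numCores <- detectCores()
cl <- makeForkCluster(numCores)
runner(
  x = df,
  k = 10,
  idx = "date",
  f = function(x) sum(x$x),
  cl = cl
)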
stopCluster(cl)
```

With `runner` one can use any R function, but some of them are optimized for speed. These functions are:

- aggregating functions - `length_run`, `min_run`, `max_run`, `minmax_run`, `sum_run`, `mean_run`, `streak_run`
- utility functions - `fill_run`, `lag_run`, `which_run`
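A few of the functions listed above are sketched below on small inputs. The inputs and variable names are made up for illustration, and only default arguments plus `k` and `lag` are used.

```r
library(runner)

x <- c(1, 3, 2, 5, 4)

# Aggregating functions over a window of 2 elements.
s <- sum_run(x, k = 2L)
m <- mean_run(x, k = 2L)
mx <- max_run(x, k = 2L)

# The generic runner() with f = sum gives the same values as sum_run().
s_generic <- runner(x, k = 2, f = sum)

# Length of the current streak of consecutive equal values.
streak <- streak_run(c("a", "a", "b", "b", "b", "a"))

# Utility functions: shift a vector, and carry the last non-NA value forward.
lagged <- lag_run(1:5, lag = 1L)
filled <- fill_run(c(NA, 1, NA, NA, 2))
```

With default arguments, `fill_run` leaves a leading `NA` in place because there is no earlier value to carry forward.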