This vignette goes over the process of creating a quadtree and covers all of the various parameter settings that can be used to modify the way a quadtree is created. Note that all the examples use the following example raster:

Quadtrees are tree data structures where each node is allowed to have
either zero or four children. This data structure can be used to
represent two-dimensional space by interpreting each node as a
rectangular cell. The root of the tree represents a single cell that
encompasses the entire area. If a node has children, they represent the
four cells created when the node is divided into quadrants. This creates
a hierarchical data structure where any given point in space is
contained within many different cells at different levels. In order to
represent the value of some variable across space, each node has a value
that represents the value of that variable in the region covered by the
node. This means that any point may be associated with multiple values -
this can be useful in some situations. However, it may also be desirable
for each point to be associated with only a single value. The terminal
nodes can be treated as the only value for any given point, and the
values at the higher levels can essentially be ignored. This creates a
space-exhausting surface like a raster where every point in space has a
single value. Unlike a raster, however, the hierarchical nature of a
quadtree easily facilitates variable-sized cells. When a quadtree is
used in this fashion it is referred to as a *region
quadtree*.

This feature of quadtrees make them advantageous in certain situations. In some applications it may be critical for the data structure to have a fine resolution in certain areas, while other areas may be suitably represented by large cells. For example, quadtrees have been used for image compression - by allowing largely homogeneous areas of an image to be represented by large cells, the amount of data needed to represent the image can be greatly reduced while still preserving fine resolution in heterogeneous areas.

The following figure shows a quadtree data structure (left), it’s spatial representation (middle), and a 3D plot showing how the data structure and the spatial representation correspond (right).

In the `quadtree`

package, a quadtree is created from a
raster or a matrix by successively dividing the raster/matrix into
smaller and smaller cells, with the decision on whether to divide a
quadrant determined by a function that checks the cell values within
each quadrant and returns `TRUE`

if it should be split, and
`FALSE`

otherwise. Initially, all of the cells in the raster
are considered. If the cell values meet the condition determined by the
splitting function, the raster is divided into four quadrants -
otherwise, the raster is not divided further and the value of this
larger cell is calculated by applying a ‘combine function’ that
aggregates the cell values into a single value (for example, mean or
median). If a quadrant is split, the process is repeated for each of
those ‘child’ quadrants, and then for their children, and so on and so
forth, until either the split function returns `FALSE`

or the
smallest possible cell size (as determined by the input raster) has been
reached.

If a given quadrant has dimensions that are not divisible by two (for
example, 5 x 5), then the process stops. Because of this, only rasters
that have dimensions that are a power of two can be divided down to
their smallest cell size. In addition, only square rasters can be
divided down to their smallest cell size. To create quadtrees from
rasters that have dimensions that are not a power of two and are not
square, two options are provided. The choice of method is determined by
the `adj_type`

parameter.

`"expand"`

methodIn the `"expand"`

method, `NA`

cells are added
to the raster in order to create an expanded raster whose dimensions are
a power of two. The smallest number that is a power of two but greater
than the larger dimension is used as the dimensions of the expanded
raster. In the following example, the raster has dimensions of 178 x
161. To make it suitable for quadtree creation, `NA`

rows and
columns are added in order to create a raster with dimensions 256 x 256
(as 256 is the smallest power of two that is also greater than 178), and
then the quadtree is created from that raster.

`"resample"`

methodIn the `"resample"`

method, the raster is resampled in
order to create a square raster with dimensions that are a power of two.
If the data does not have the same number of rows and columns,
resampling the raster to have an equal number of rows and column will
result in rectangular but non-square cells. This resampled raster is
then used to create the quadtree.

```
qt <- quadtree(rast, .15, adj_type = "resample", resample_n_side = 128,
resample_pad_nas = FALSE)
plot(qt, border_lwd = .3, main = "resample (without NA padding)")
```

If square cells are desired, an additional step is added to make the
raster square by setting `resample_pad_nas`

to be
`TRUE`

(the default). This is done in a way similar to the
method described above. The smaller dimension is padded with
`NA`

cells in order to equal the larger dimension. As stated
previously, our sample raster has dimensions 178 x 161, so
`NA`

columns are added in order to create a raster with
dimensions 178 x 178. Then, the raster is resampled to a user-specified
dimension (determined by the `resample_n_side`

parameter).
For example, the user could set `resample_n_side`

to be 128,
which will resample the 178 x 178 raster to 128 x 128. This raster can
then be used to create a quadtree.

If `adj_type`

is `"none"`

, the provided
matrix/raster is used ‘as is’, with no dimension adjustment. In this
case, doing so results in a single cell quadtree, since 161 (the number
of columns) is not divisible by two.

The method used to determine whether or not to split a cell as well as the method used to aggregate cell values can be defined by the user. Simple methods are already provided, but custom functions can be defined.

Three methods are provided for splitting a quadrant. The
`"range"`

method calculates the difference between the
minimum and maximum values within the quadrant, `"sd"`

calculates the standard deviation of the values, and `"cv"`

calculates the coefficient of variation. Regardless of which method is
used, the resulting value is compared to the
`split_threshold`

parameter, and if it exceeds the threshold,
the quadrant is split.

```
qt_range <- quadtree(rast, .1, split_method = "range")
qt_sd <- quadtree(rast, .1, split_method = "sd")
qt_cv <- quadtree(rast, .1, split_method = "cv")
par(mfrow = c(1, 3), mar = c(0,0,3,0))
plot(qt_range, crop = TRUE, na_col = NULL, zlim = c(0, 1), border_lwd = .3,
axes = FALSE, legend = FALSE, main = "split_method = 'range'")
plot(qt_sd, crop = TRUE, na_col = NULL, zlim = c(0,1), border_lwd = .3,
axes = FALSE, legend = FALSE, main = "split_method = 'sd'")
plot(qt_cv, crop = TRUE, na_col = NULL, zlim = c(0,1), border_lwd = .3,
axes = FALSE, legend = FALSE, main = "split_method = 'cv'")
```

Four methods to aggregate cell values are provided -
`"mean"`

, `"median"`

, `"min"`

, and
`"max"`

- the names are self-explanatory. Note that in the
following example the structures of the quadtrees are identical - the
combine function has no influence on the decision to split a cell.

```
qt_mean <- quadtree(rast, .1, "sd", combine_method = "mean")
qt_median <- quadtree(rast, .1, "sd", combine_method = "median")
qt_min <- quadtree(rast, .1, "sd", combine_method = "min")
qt_max <- quadtree(rast, .1, "sd", combine_method = "max")
par(mfrow = c(2, 2), mar = c(.5, .5, .5, .5))
plot(qt_mean, crop = TRUE, na_col = NULL, axes = FALSE, legend = TRUE,
border_lwd = .3, zlim = c(0,1), main = "mean")
plot(qt_median, crop = TRUE, na_col = NULL, axes = FALSE, legend = TRUE,
border_lwd = .3, zlim = c(0,1), main = "median")
plot(qt_min, crop = TRUE, na_col = NULL, axes = FALSE, legend = TRUE,
border_lwd = .3, zlim = c(0,1), main = "min")
plot(qt_max, crop = TRUE, na_col = NULL, axes = FALSE, legend = TRUE,
border_lwd = .3, zlim = c(0,1), main = "max")
```

Custom functions can be written to apply more complex rules to
splitting and combining. These functions must take two parameters:
`vals`

and `args`

. `vals`

is a numeric
vector of the values of the cells within the current quadrant.
`args`

is a named list that contains the arguments needed by
the custom function. Any parameters needed for the function should be
accessed through `args`

. Note that even if no extra
parameters are needed, the custom function still needs to take an
`args`

parameter - in that case it just won’t be used by the
function.

`split_fun`

must return a boolean, where `TRUE`

indicates that the quadrant should be split. An important note to make
is that any custom function must be able to handle `NA`

values. The function must always return either `TRUE`

or
`FALSE`

- if `NA`

is ever returned an error will
occur.

For example, a simple splitting function that splits a quadrant when any of the values are below a given limit could be defined as follows:

We can then use this with the `quadtree()`

function to
apply the splitting method. Note that because the function makes use of
an element of `args`

named `threshold`

, the
`split_args`

parameter needs to contain an element called
`threshold`

:

As with `split_fun`

, `combine_fun`

must take
two arguments named `vals`

and `args`

. Custom
combine functions must return a single numeric value, and unlike
`split_fun`

, `combine_fun`

is allowed to return
`NA`

values. For example, the following function can be used
to create a binary quadtree which only takes on the values 0 and 1,
based on some cutoff threshold.

```
cmb_fun <- function(vals, args) {
if (any(is.na(vals))) {
return(NA)
}
if (mean(vals) < args$threshold) {
return(0)
} else {
return(1)
}
}
qt <- quadtree(rast, .1, combine_method = "custom", combine_fun = cmb_fun,
combine_args = list(threshold = .5))
plot(qt, crop = TRUE, na_col = NULL, border_lwd = .3,
main = "custom combine function")
```

The `quadtree()`

function also allows users to create a
quadtree using another quadtree as a “template” (via the
`template_quadtree`

parameter). The structure of the new
quadtree will be identical to that of the template quadtree, but the
values of the cells will be derived from the raster used to create the
new quadtree. This allows for a raster to be converted into a quadtree,
with the structure of the quadtree being determined by a different
raster. The rasters used to make the template quadtree and the new
quadtree should have the exact same extent and dimensions - in addition
the exact same expansion method (i.e. the method specified by
`adj_type`

) should be used to create both quadtrees.

In the following example, a raster representing the presence/absence of roads is used to create a quadtree, and then that quadtree is used as a template to create a quadtree from a different raster.

```
habitat_roads <- terra::rast(system.file("extdata", "habitat_roads.tif", package="quadtree"))
template <- habitat_roads
# use a custom function so that a quadrant is split if it contains any 1's
split_if_one <- function(vals, args) {
if(any(vals == 1, na.rm = TRUE)) return(TRUE)
return(FALSE)
}
qt_template <- quadtree(template, split_method = "custom",
split_fun = split_if_one)
# now use the template to create a quadtree from 'rast'
qt <- quadtree(rast, template_quadtree = qt_template)
par(mfrow = c(1, 3), mar = c(0,0,3,0))
plot(template, axes = FALSE, box = FALSE, legend = FALSE,
main = "template raster")
plot(qt_template, crop = TRUE, na_col = NULL, border_lwd = .3 ,axes = FALSE,
legend = FALSE, main = "template quadtree")
plot(qt, crop = TRUE, na_col = NULL, border_lwd = .3, axes = FALSE,
legend = FALSE, main = "quadtree created using template")
```

`max_cell_length`

and `min_cell_length`

The `max_cell_length`

and `min_cell_length`

parameters let the user specify the range of allowable cell sizes. If
`max_cell_length`

is not `NULL`

, then the maximum
cell size in the resulting quadtree will be
`max_cell_length`

. This essentially forces any quadrants
larger than `max_cell_length`

to split. The one exception is
that a quadrant that contains entirely `NA`

values will not
be split. Similarly, the `min_cell_length`

parameter can be
used to define a minimum side length for all cells, such that a quadrant
cannot be split if its children would be smaller than
`min_cell_length`

.

```
qt_max_cell <- quadtree(rast, .15, max_cell_length = 1000)
qt_min_cell <- quadtree(rast, .15, min_cell_length = 1000)
par(mfrow = c(1, 2))
plot(qt_max_cell, crop = TRUE, na_col = NULL, border_lwd = .3, legend = FALSE,
main = "max_cell_length = 1000")
plot(qt_min_cell, crop = TRUE, na_col = NULL, border_lwd = .3, legend = FALSE,
main = "min_cell_length = 1000")
```

`split_if_any_na`

and `split_if_all_na`

The `split_if_any_na`

and `split_if_all_na`

parameters control how `NA`

values are handled. If
`split_if_any_na`

is `TRUE`

(the default), a
quadrant will be split if any of the values are `NA`

. This
ensures that rasters with irregular shapes maintain their shape in the
resulting quadtree representation. If `FALSE`

, quadrants with
`NA`

s are not automatically split - note that this can
produce unexpected results if the raster is irregularly shaped (see the
example below). `split_if_all_na`

controls what happens when
a quadrant consists entirely of `NA`

values. If
`FALSE`

(the default), these quadrants are not split. If
`TRUE`

, these quadrants are automatically split, which
results in quadrants with all `NA`

values being split to the
smallest possible cell size.