The package *siland* aims to estimate the effects of landscape
and local variables on a measurement of interest. The method enable to
estimate effect intensities but also the scale of the effect for
landscape variables, without any a priori. The method is based on
estimation of Spatial Influence Function (SIF).The SIF function
describes how the influence of a pixel/cell of a landscape variable is
spatially distributed. We assume that the influence is maximal at the
pixel location and decreases with the distance. The decrease is
parametrised by one parameter which corresponds to the mean distance for
the SIF.

It is worth noting that the method simultaneously estimate intensities of effect and scales of effect using a numerical likelihood maximisation procedure. This means that the mean distances of the SIFs are not given by the user but estimated for each landscape variable.

*siland* package is designed to be a user-friendly tool: data
format are commonly used (data.frame and shape files from GIS data),
model expression is simple (classical formula as lm expression), various
outputs are very easily computed: summary of estimates and tests,
graphical maps of effects and graphical checking of likelihood
maximisation. Based on linear model, it includes numerous extensions of
linear model: interactions, random effects, GLM (binomial and poisson).
It is also possible to build multiannual model. We detailed hereafter
some examples of simple use of the *siland* package. (More
details about the method can be found in https://www.biorxiv.org/content/10.1101/692566v1).

*siland* package requires two data objects:

**data.frame of located observations**.*Data locations*have to be indicated in*columns “X” and “Y”*. The variable of interest as possible local variables are also included here.**sf object**containing description of landscape variables. It can be created using the`st_read`

function of the package`sf`

. For instance if your GIS data (in shp format) are in the*FILE*file, named*NAME*(i.e.FILE contains NAME.shp, NAME.dbf,…), you can easily import your data in landdata using the command:

`=st_read(dsn = "FILE",layer="NAME") landdata`

Here, we use the objects dataSiland and landSiland, included in
*siland* package. You can load data using the `data`

command.

```
library(siland)
## Le chargement a nécessité le package : sf
## Linking to GEOS 3.9.3, GDAL 3.5.2, PROJ 8.2.1; sf_use_s2() is TRUE
data(dataSiland)
data(landSiland)
```

*dataSiland* is a data.frame with columns: “obs”, variable of
interest and “x1” variable, a continuous local variable. The “Id” column
indicates the identification number of the plots where observations were
made. *landSiland* is an object of class sf where landscape is
characterised by two features named L1 and L2. Landscape is described by
polygons. For each polygon, L1 is equal to 1 if the polygon is of type
L1, 0 otherwise (so is it for L2).

```
str(dataSiland)
## 'data.frame': 100 obs. of 5 variables:
## $ X : num 8508 8508 6578 6578 1904 ...
## $ Y : num 7177 7177 3829 3829 1555 ...
## $ x1 : num 6.23 6.23 -3.91 -3.91 7.55 ...
## $ Id : num 1 1 2 2 3 3 4 4 5 5 ...
## $ obs: num 18.34 18.17 1.95 1.09 6.81 ...
str(landSiland)
## Classes 'sf' and 'data.frame': 4884 obs. of 3 variables:
## $ L1 : num 0 0 0 0 0 0 0 0 0 0 ...
## $ L2 : num 0 1 0 0 0 0 0 0 1 0 ...
## $ geometry:sfc_MULTIPOLYGON of length 4884; first list element: List of 1
## ..$ :List of 1
## .. ..$ : num [1:8, 1:2] 5993 5999 5978 5977 5975 ...
## ..- attr(*, "class")= chr [1:3] "XY" "MULTIPOLYGON" "sfg"
## - attr(*, "sf_column")= chr "geometry"
## - attr(*, "agr")= Factor w/ 3 levels "constant","aggregate",..: NA NA
## ..- attr(*, "names")= chr [1:2] "L1" "L2"
```

You can look at spatial configuration of the loaded data using the following commands :

```
library(ggplot2)
library(sf)
=st_geometry(landSiland[landSiland$L1==1,])#extract an sf object with only polygons of type L1
L1pol=st_geometry(landSiland[landSiland$L2==1,])#extract an sf object with only polygons of type L2
L2polggplot(landSiland)+
geom_sf(colour="grey",fill="white")+
geom_sf(data=L1pol,fill="red")+
geom_sf(data=L2pol,fill="blue")+
geom_point(data=dataSiland, aes(X,Y),col="green")
```

`siland()`

Estimations for landscape and local variables are based on
discretization with cells (or pixels) of the landscape. This
discretization is called a raster. Each unit (cell/pixel) of landscape
variable has an influence which is maximal at its location and decreases
as the distance increases. The decrease of this influence is described
by a spatial influence function (SIF). For modelling a given landscape
variable influence, we consider every cell where the landscape variable
is distributed, and computed its influence by summing spatial influence
of all cells. The spatial influence of a cell is determined by its
intensity (negative or positive) and its SIF, which is a density
function described by its mean distance (in *siland* package).
The *siland* package allows to estimate the intensity of effect
and the scale of effects i.e. here the mean distance of SIF. Estimating
effects of the local variable x1 and landscape variables L1 and L2 on
the variable of interest obs can be done with the command
`siland()`

(the argument `wd`

is presented
below):

```
=siland(obs~x1+L1+L2,land=landSiland,data=dataSiland,wd=30)
res## Local variables: x1
## Landscape variables: L1 L2
## Model: obs ~ x1 + L1 + L2
## Model0: obs ~ x1
```

```
res## Model: obs ~ x1 + L1 + L2
##
## Landscape variables: L1 L2
##
## Coefficients:
## (Intercept) x1 L1 L2 SIF.L1 SIF.L2
## 1.249 0.085 -11.933 29.400 113.489 190.179
##
## standard error: 1.123119
## AIC: 316.93 AIC (no landscape): 619.37
## (No landscape effect) p-value: <1e-16
```

Here the parameters related to the landscape variable L1 are
intensity `L1`

(-11.644) and the mean distance of its SIF,
`SIF.L1`

(111.189). The vector `res$paramSIF`

gives for each landscape variable the mean distance of estimated SIFs.
The data.frame `res$landcontri`

gives for each landscape
variable, the cumulative spatial influence received by each observation
(i.e. the sum of spatial influence received by each observation from all
cells of the landscape).
AIC`is AIC the of the model resB1 (Model: obs ~ x1 + L1 + L2) and`

AIC(no
landscape)`is the AIC with only local variable (Model0: obs ~ x1).`

(No landscape effect) p-value` is related to the test of landscape
global effect significativity, i.e. H0= ‘Model0: obs ~ x1’ vs H1=‘Model:
obs ~ x1 + L1 + L2’ (Likelihood ratio test).

Parameter estimates and tests can be obtained using the command
`summary`

:

```
summary(res)
## SIF parameters:
## SIF.L1 SIF.L2
## 113.4894 190.1787
##
## -- Tests are given conditionally to the best SIF parameters --
##
## Call:
## obs ~ x1 + L1 + L2
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -2.97160 -0.83136 0.02424 0.66240 2.68791
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.24886 0.17868 6.989 3.65e-10 ***
## x1 0.08501 0.01861 4.568 1.46e-05 ***
## L1 -11.93331 0.97165 -12.281 < 2e-16 ***
## L2 29.40033 0.68737 42.772 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for gaussian family taken to be 1.261396)
##
## Null deviance: 2965.92 on 99 degrees of freedom
## Residual deviance: 121.09 on 96 degrees of freedom
## AIC: 312.93
##
## Number of Fisher Scoring iterations: 2
```

Note that p.values are given conditionally to estimated parameters for SIFs.

The command `plotsiland.sif`

gives a representation of the
estimated SIF for landscape variables. The shape of the SIF could be
“exponential”, “gaussian” or “uniform”. The shape can be defined with
the argument sif in the function `siland()`

and is
exponential by default. The function `plotsiland.sif()`

allows to displays the estimated SIFs.

`plotsiland.sif(res)`

The command `plotsiland.land`

gives map of effects of the
landscape. By default, it gives the global contribution of all landscape
variables:

`plotsiland.land(x=res,land=landSiland,data=dataSiland)`

One can obtain a map for a specific landscape variable by specifying
its number with the argument `var`

:

```
plotsiland.land(x=res,land=landSiland,data=dataSiland,var=2)
## Distance computing...
## Contribution computing...
```

As with all numerical maximization procedures, optimization problems
may arise. The function `siland.lik()`

allows to point out
possible problems of optimization.

```
=siland.lik(res,land= landSiland,data=dataSiland,varnames=c("L1","L2"))
likres## Likelihood computing for L1
## Likelihood computing for L2
```

` likres`

On this graphic, the -Log-likelihood is represented in function of
buffer radii. The estimation is made by maximazing the likelihood
i.e. by minimizing the -Log-likelihood. The orange line indicates the
minimal value obtained during the estimation. The black line represents
the -loglikelihood for L1 when the buffer radius for L1 is set to given
values in seqd argument (`seqd=seq(2,200,length=10)`

) and the
other parameters are fixed to their estimation. The black dotted line
indicates the value of L1 buffer size estimated during the global
estimation procedure. The red continuous and dotted red lines simirlarly
indicates the -loglikelihood and the value of buffer size estimated for
L2. When minization correctly occurs, the minimal values of the profiled
-loglikelihoods are equal and equal the minimal value of the global
-Log-likelihood. This means that the minimums of continuous black and
red lines are on the orange line (and that dotted black and red lines
intersect the continuous black and red lines at their minimum,
respectively). If it is not the case, the minimizing procedure has
failed and it is necessary to proceed with a new estimation with
different initialisation values. This is possible using the argument
`init`

in function `siland`

. For instance, for
starting estimation from buffer sizes of 20 and 150 for L1 and L2
landscape variables, respectively, use the command
`siland(obs~x1+L1+L2,land=landSiland,data=dataSiland,init=c(20,150))`

.

The argument

`wd`

indicates the mesh size used to construct spatial unit (cells/pixels) of landscape variable. In fact, landscape variable described as polygon are discretized on grid during the procedure estimation (like raster). The choice of wd is a tradeoff between computing precision and computing time (and memory size). The smallest`wd`

is, the better are the precision but the longer the computing time is (and the larger the required memory size is). It is worth to note that estimated parameters can be very sensitive to this mesh size.**To obtain a reliable estimation, we recommand to ensure, after the estimation procedure, that**If not, it is recommended to proceed with a new estimation with a smaller`wd`

size is at least three times smaller than the smallest estimated SIF.`wd`

size.The argument

`maxD`

corresponds to a radius. For each observation, only pixels with a distance less than maxD are used for estimation. For a given observation, pixels with a distance greater than maxD have in fact a very low intensity and can be ignored.**To obtain a reliable estimation, we recommand to ensure, after the estimation procedure, that**`maxD`

size is at least three times greater than the greatest estimated SIF.The argument

`border`

indicates whether an observation receives the spatial influence of cells belonging to the same plot (i.e. the plot where the observation was made). If`border=T`

, its receives influence only from the border of the plot (so no influence from the cells of the plot). By default,`border = F`

(all cells are considered).`border`

is a vector that defines if`border`

is True or False for each landscape variable. If`border=T`

(resp.`border=F`

) then the argument`border`

is set to True (resp. False) for all landscape variables.The argument

`sif`

defines the family function of SIF, i.e. the form of decrease for the landscape influence. The family can be`exponential`

,`gaussian`

, or`uniform`

. If influence is uniform, it implies that there is no decrease of influence and influence is uniform around each pixel of the raster. Note that all landscape variables have the same form.

Random effect can be included using the syntax `(1| )`

.
Note that only local effect are concerned.

```
=siland(obs~x1+L1+L2+(1|Id),land=landSiland,data=dataSiland,wd=30)
res2## Local variables: x1 Id
## Landscape variables: L1 L2
## Model: obs ~ x1 + L1 + L2 + (1 | Id)
## Model0: obs ~ x1 + (1 | Id)
```

```
summary(res2)
## SIF parameters:
## SIF.L1 SIF.L2
## 113.4894 190.1787
##
## -- Tests are given conditionally to the best SIF parameters --
## Linear mixed model fit by maximum likelihood ['lmerMod']
##
## AIC BIC logLik deviance df.resid
## 314.7 330.3 -151.4 302.7 94
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -2.62434 -0.73048 0.01841 0.59069 2.34981
##
## Random effects:
## Groups Name Variance Std.Dev.
## Id (Intercept) 0.080 0.2828
## Residual 1.131 1.0635
## Number of obs: 100, groups: Id, 50
##
## Fixed effects:
## Estimate Std. Error t value
## (Intercept) 1.24886 0.18076 6.909
## x1 0.08501 0.01883 4.515
## L1 -11.93331 0.98297 -12.140
## L2 29.40033 0.69537 42.280
##
## Correlation of Fixed Effects:
## (Intr) x1 L1
## x1 0.201
## L1 -0.373 -0.107
## L2 -0.702 -0.252 0.055
```

To consider distributions of the variable of interest that differ
from Gaussian, use the option `family`

available in the
`siland()`

function. Family can be “gaussian”, “poisson” or
“binomial” and the associated link function are identity, log and logit
respectively.

Interaction between local and landscape variables using the syntax
`:`

or `*`

(as well as localxlocal interaction).
Interaction term modify the intensity of a landscape variable according
to the local variable value. But the scale of effect of the landscape
variable remains constant.

```
#Model with main and interaction effect
=siland(obs~x1*L1+L2,land=landSiland,data=dataSiland,wd=30)
res3## Local variables: x1
## Landscape variables: L1 L2
## Model: obs ~ x1 * L1 + L2
## Model0: obs ~ x1
```

```
res3## Model: obs ~ x1 * L1 + L2
##
## Landscape variables: L1 L2
##
## Coefficients:
## (Intercept) x1 L1 L2 x1:L1 SIF.L1
## 1.243 0.081 -11.999 29.377 0.100 110.550
## SIF.L2
## 190.241
##
## standard error: 1.128036
## AIC: 318.75 AIC (no landscape): 619.37
## (No landscape effect) p-value: <1e-16
```

```
.1=siland(obs~x1*L1+L2,land=landSiland,data=dataSiland,wd=30)
res3## Local variables: x1
## Landscape variables: L1 L2
## Model: obs ~ x1 * L1 + L2
## Model0: obs ~ x1
```

```
.1
res3## Model: obs ~ x1 * L1 + L2
##
## Landscape variables: L1 L2
##
## Coefficients:
## (Intercept) x1 L1 L2 x1:L1 SIF.L1
## 1.243 0.081 -11.999 29.377 0.100 110.550
## SIF.L2
## 190.241
##
## standard error: 1.128036
## AIC: 318.75 AIC (no landscape): 619.37
## (No landscape effect) p-value: <1e-16
```

It is possible to deal with data observed during several years,
i.e. data associated with several landscapes with the function
`siland`

. The two data objects required are then : * a
**list of ** data.frames of located observations, one for
each year. An important point is that **data.frames column names
have to be exactly the same and ordered in the same way.** * a
**list of ** sf object containing a landscape description
of each year.

Let us suppose we have two years of observations associated with two landscapes. Since the goal is only to show how to deal with such datasets, we take the same datastets for observations and for landscapes for the two years.

```
=landSiland
landSilandY1=landSiland
landSilandY2#landSilandY is a list with the landscape for each year
=list(year1=landSilandY1,year2=landSilandY2)
landSilandY=dataSiland
dataSilandY1=dataSiland
dataSilandY2$year=factor("2018")
dataSilandY1$year=factor("2019")
dataSilandY2head(dataSilandY1)
## X Y x1 Id obs year
## 1 8507.824 7176.949 6.226 1 18.336417 2018
## 2 8507.824 7176.949 6.226 1 18.167951 2018
## 3 6577.630 3828.792 -3.906 2 1.950956 2018
## 4 6577.630 3828.792 -3.906 2 1.088552 2018
## 5 1904.231 1554.608 7.549 3 6.807977 2018
## 6 1904.231 1554.608 7.549 3 4.441390 2018
head(dataSilandY2)
## X Y x1 Id obs year
## 1 8507.824 7176.949 6.226 1 18.336417 2019
## 2 8507.824 7176.949 6.226 1 18.167951 2019
## 3 6577.630 3828.792 -3.906 2 1.950956 2019
## 4 6577.630 3828.792 -3.906 2 1.088552 2019
## 5 1904.231 1554.608 7.549 3 6.807977 2019
## 6 1904.231 1554.608 7.549 3 4.441390 2019
=list(year1=dataSilandY1,year2=dataSilandY2)
dataSilandY#dataSilandY is a list with the observed data for each year
=siland(obs~year+x1+L1+L2, land = landSilandY,data=dataSilandY,wd=30)
resY## Local variables: year x1
## Landscape variables: L1 L2
## Model: obs ~ year + x1 + L1 + L2
## Model0: obs ~ year + x1
resY## Model: obs ~ year + x1 + L1 + L2
##
## Landscape variables: L1 L2
##
## Coefficients:
## (Intercept) year2019 x1 L1 L2 SIF.L1
## 1.249 0.000 0.085 -11.933 29.400 113.489
## SIF.L2
## 190.179
##
## standard error: 1.114446
## AIC: 621.85 AIC (no landscape): 1234.73
## (No landscape effect) p-value: <1e-16
```

```
resY## Model: obs ~ year + x1 + L1 + L2
##
## Landscape variables: L1 L2
##
## Coefficients:
## (Intercept) year2019 x1 L1 L2 SIF.L1
## 1.249 0.000 0.085 -11.933 29.400 113.489
## SIF.L2
## 190.179
##
## standard error: 1.114446
## AIC: 621.85 AIC (no landscape): 1234.73
## (No landscape effect) p-value: <1e-16
summary(resY)
## SIF parameters:
## SIF.L1 SIF.L2
## 113.4894 190.1787
##
## -- Tests are given conditionally to the best SIF parameters --
##
## Call:
## obs ~ year + x1 + L1 + L2
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -2.97160 -0.83136 0.02424 0.66240 2.68791
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.249e+00 1.481e-01 8.434 7.4e-15 ***
## year2019 -2.679e-15 1.576e-01 0.000 1
## x1 8.501e-02 1.306e-02 6.510 6.2e-10 ***
## L1 -1.193e+01 6.818e-01 -17.504 < 2e-16 ***
## L2 2.940e+01 4.823e-01 60.960 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for gaussian family taken to be 1.241989)
##
## Null deviance: 5931.84 on 199 degrees of freedom
## Residual deviance: 242.19 on 195 degrees of freedom
## AIC: 617.85
##
## Number of Fisher Scoring iterations: 2
```

It is possible to deal with multisite data, using the same procedure considering different sites instead of different years.

As for an object ot type GLM, functions AIC(), residuals() and fitted() are available. In fact, conditionnaly to the estimated buffers or SIFs, the fitted model is a GLM or a LMM (Linear Mixed Model) or a GLMM (Generalized Linear Mixed Model). So after an estimation, it is possible to analyse more precisely the estimated model with the object result stored in the output. Object result is an an object from glm() or lmer() or glmer() functions.

```
summary(res$result)
##
## Call:
## glm(formula = model, family = family, data = newdata)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -2.97160 -0.83136 0.02424 0.66240 2.68791
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.24886 0.17868 6.989 3.65e-10 ***
## x1 0.08501 0.01861 4.568 1.46e-05 ***
## L1 -11.93331 0.97165 -12.281 < 2e-16 ***
## L2 29.40033 0.68737 42.772 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for gaussian family taken to be 1.261396)
##
## Null deviance: 2965.92 on 99 degrees of freedom
## Residual deviance: 121.09 on 96 degrees of freedom
## AIC: 312.93
##
## Number of Fisher Scoring iterations: 2
BIC(res$result)
## [1] 325.9532
fitted(res$result)[1:10]
## 1 2 3 4 5 6 7 8
## 17.515084 17.515084 1.931324 1.931324 6.116991 6.116991 -3.618455 -3.618455
## 9 10
## 2.579123 2.579123
residuals(res$result)[1:10]
## 1 2 3 4 5 6
## 0.82133235 0.65286713 0.01963192 -0.84277194 0.69098527 -1.67560150
## 7 8 9 10
## 1.01085931 -1.20498043 0.12104466 1.27277387
class(res$result)
## [1] "glm" "lm"
class(res$result)
## [1] "glm" "lm"
```