The name gratis stands for GeneRAting TIme Series with diverse and controllable characteristics. The package provides an efficient and general approach, based on Gaussian mixture autoregressive (MAR) models, for generating a wide range of non-Gaussian and nonlinear time series.

The generated data can be used as diverse and controllable benchmarking data in the time series domain. It can also serve as an algorithm evaluation tool for tasks such as time series forecasting and classification, with minimal human effort and computational resources.

By simulating time series from mixture autoregressive models, gratis can cover a wide range of time series and can be used to investigate diversity in a time series feature space.

Furthermore, by tuning the parameters of the mixture autoregressive model, gratis can also efficiently generate new time series with controllable features.
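To make the idea of a MAR model concrete, the following base R sketch simulates a two-component Gaussian MAR(1) series by hand. It does not call gratis: the function name `simulate_mar` and all parameter values are hypothetical and chosen only for illustration. At each time step one AR component is picked at random according to the mixing weights, and the next value is drawn from that component's conditional Gaussian distribution.

```r
# Illustrative sketch only: a hand-rolled Gaussian MAR(1) simulator.
# `simulate_mar` and the parameter values below are hypothetical, not part of gratis.
simulate_mar <- function(n, phi0, phi1, sigma, weights, burnin = 100) {
  stopifnot(length(phi0) == length(weights), abs(sum(weights) - 1) < 1e-8)
  total <- n + burnin
  y <- numeric(total)
  for (i in 2:total) {
    # Pick the mixture component that is active at this time step
    k <- sample(seq_along(weights), size = 1, prob = weights)
    # Draw from that component's conditional Gaussian distribution
    y[i] <- phi0[k] + phi1[k] * y[i - 1] + rnorm(1, sd = sigma[k])
  }
  # Drop the burn-in period and return a ts object
  ts(y[(burnin + 1):total])
}

set.seed(1)
y <- simulate_mar(
  n = 120,
  phi0 = c(0, 1), phi1 = c(0.8, -0.3),
  sigma = c(1, 2), weights = c(0.7, 0.3)
)
length(y)
```

Because the active component changes at random, the marginal distribution of the series is a mixture of Gaussians, which is what allows this model class to produce non-Gaussian and nonlinear behaviour.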

```
# load package
library(gratis)
```

We use the function **generate_ts()** to generate diverse time series.

Our generation process uses **distributions** instead of fixed parameter values in the underlying models, which allows it to generate diverse time series instances. The diversity of the generated time series should therefore not depend on any particular parameter setting.
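The following base R sketch illustrates what drawing parameters from distributions can look like. The helper `draw_mar_params` and the specific distributions (uniform draws for weights, AR coefficients and standard deviations) are assumptions made for this example, not the distributions gratis uses internally.

```r
# Illustrative sketch only: the distributions below are assumptions,
# not the ones gratis uses internally.
draw_mar_params <- function(n_comp) {
  w <- runif(n_comp)
  list(
    weights = w / sum(w),               # mixing weights, normalised to sum to 1
    phi1    = runif(n_comp, -0.9, 0.9), # AR(1) coefficient per component
    sigma   = runif(n_comp, 0.5, 2)     # innovation standard deviation per component
  )
}

set.seed(42)
# Three independent draws give three different parameter sets,
# so each simulated series would come from a different model.
params <- lapply(1:3, function(i) draw_mar_params(n_comp = 2))
str(params[[1]])
```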

**Definitions**

Here are the definitions of parameter settings in function generate_ts():

parameter settings | Definition |
---|---|
n.ts | number of time series to be generated |
freq | seasonal period of the time series to be generated |
nComp | number of mixing components when simulating time series using MAR models |
n | length of the generated time series |

**Example**

Suppose we want to use a MAR model to generate **3** time series from random parameter spaces. Each time series has a seasonal period of **12**, **2** mixing components and length **120**.

By setting the parameter **output_format**, **generate_ts** can return its output in tsibble format. If the parameter is not set, the output keeps the default list format.

**1. Generate diverse time series with ‘tsibble’ output format**

`generate_ts(n.ts = 3, freq = 12, nComp = 2, n = 120, output_format = "tsibble")`

```
#> $N1
#> # A tsibble: 120 x 2 [1M]
#> index value
#> <mth> <dbl>
#> 1 0001 Jan 2.00
#> 2 0001 Feb 6.59
#> 3 0001 Mar 3.71
#> 4 0001 Apr 5.78
#> 5 0001 May 3.68
#> 6 0001 Jun 8.48
#> 7 0001 Jul 6.02
#> 8 0001 Aug 10.7
#> 9 0001 Sep 10.6
#> 10 0001 Oct 12.1
#> # ... with 110 more rows
#>
#> $N2
#> # A tsibble: 120 x 2 [1M]
#> index value
#> <mth> <dbl>
#> 1 0001 Jan -2.19
#> 2 0001 Feb -4.79
#> 3 0001 Mar -5.54
#> 4 0001 Apr -5.74
#> 5 0001 May -2.01
#> 6 0001 Jun 2.51
#> 7 0001 Jul -0.510
#> 8 0001 Aug 0.411
#> 9 0001 Sep -0.282
#> 10 0001 Oct -4.61
#> # ... with 110 more rows
#>
#> $N3
#> # A tsibble: 120 x 2 [1M]
#> index value
#> <mth> <dbl>
#> 1 0001 Jan 58.9
#> 2 0001 Feb 63.8
#> 3 0001 Mar 66.7
#> 4 0001 Apr 71.6
#> 5 0001 May 76.0
#> 6 0001 Jun 77.6
#> 7 0001 Jul 83.3
#> 8 0001 Aug 81.9
#> 9 0001 Sep 90.5
#> 10 0001 Oct 94.0
#> # ... with 110 more rows
```

*Output*

We can see that **3** different time series have been simulated: **N1**, **N2** and **N3**. In this example we use time series **N1** for further analysis.

As requested, the MAR model used to simulate each series has **2** mixing components, whose parameters are stored in **pars1** and **pars2**.

Each component has its own mixing weight, stored in **weights**.

**2. Generate diverse time series with ‘list’ output format**

```
x <- generate_ts(n.ts = 3, freq = 12, nComp = 2, n = 120, output_format = "list")
# N1 time series
x$N1$pars
```

```
#> $pars1
#> [1] 0.2199556 1.0722465 0.3962213 -0.3677716
#>
#> $pars2
#> [1] -0.5187467 0.4471215 1.0660018 -0.8475915
#>
#> $weights
#> [1] 0.7335474 0.2664526
```
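The mixing weights can be read directly from this parameter list. The sketch below rebuilds the printed structure by hand, with values copied from the output above, so that it runs without gratis; in practice the list would come from `x$N1$pars`.

```r
# Mock of the printed structure above; values are copied from the example output.
pars <- list(
  pars1   = c(0.2199556, 1.0722465, 0.3962213, -0.3677716),
  pars2   = c(-0.5187467, 0.4471215, 1.0660018, -0.8475915),
  weights = c(0.7335474, 0.2664526)
)

# The mixing weights form a probability distribution over the components
sum(pars$weights)

# Index of the component that is selected most often during simulation
which.max(pars$weights)
```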

*Plot time series*

```
# plot N1 time series
autoplot(x$N1$x)
```

Time series can exhibit multiple seasonal patterns of different lengths, especially when the series are observed at a high frequency, such as daily or hourly data.
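As a simple illustration of multiple seasonality, the base R sketch below builds a daily series with a weekly (period 7) and a yearly (period 365) pattern. It does not use gratis or a MAR model; the amplitudes and noise level are arbitrary values chosen for this example.

```r
# Illustrative sketch only: a daily series with weekly and yearly
# seasonal patterns plus Gaussian noise. All constants are arbitrary.
set.seed(123)
n <- 800
time_index <- seq_len(n)
weekly <- 2 * sin(2 * pi * time_index / 7)
yearly <- 5 * sin(2 * pi * time_index / 365)
y <- weekly + yearly + rnorm(n, sd = 0.5)

# Index 1 of the ACF is lag 0, so index 8 is lag 7 and index 4 is lag 3.
# The weekly pattern makes the lag-7 autocorrelation higher than the lag-3 one.
acf_vals <- acf(y, lag.max = 14, plot = FALSE)$acf
c(lag3 = acf_vals[4], lag7 = acf_vals[8])
```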

We use the function **generate_msts()** to generate multiple seasonal time series.

**Definitions**

Here are the definitions of parameter settings in function generate_msts():

parameter settings | Definition |
---|---|
seasonal.periods | a vector of seasonal periods of the time series to be generated |
nComp | number of mixing components when simulating time series using MAR models |
n | length of the generated time series |

**Example**

Suppose we want to use a MAR model to generate a time series with **2** mixing components and length **800** from random parameter spaces. In particular, this time series has two seasonal periods, **7** and **365**.

By setting the parameter **output_format**, **generate_msts** can return its output in tsibble format. If the parameter is not set, the output keeps the default list format.

**1. Generate multiple seasonal time series with ‘tsibble’ output format**

`generate_msts(seasonal.periods = c(7, 365), n = 800, nComp = 2,output_format="tsibble")`

```
#> # A tsibble: 800 x 2 [9.99999999999699e-07]
#> index value
#> <dbl> <dbl>
#> 1 1 -1.32
#> 2 1.00 -0.891
#> 3 1.01 -0.800
#> 4 1.01 -1.21
#> 5 1.01 -0.894
#> 6 1.01 -1.48
#> 7 1.02 -1.18
#> 8 1.02 -1.24
#> 9 1.02 -1.07
#> 10 1.02 -1.14
#> # ... with 790 more rows
```

**2. Generate multiple seasonal time series with ‘list’ output format**

`x <- generate_msts(seasonal.periods = c(7, 365), n = 800, nComp = 2, output_format = "list")`

*Plot time series*

`autoplot(x)`

An analysis with a particular focus may only be interested in a certain area of the feature space, or in a subset of features.

Our function **generate_ts_with_target()** can efficiently generate time series with target features.

The underlying principle is to use a genetic algorithm to tune the MAR parameters until the distance between the target feature vector and the feature vector of a sample of time series simulated from the MAR model is approximately 0.
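The base R sketch below illustrates this tuning loop in a heavily simplified form. The simplifications are assumptions made for this example and do not reflect how gratis is implemented: a single AR(1) model stands in for the MAR model, the lag-1 autocorrelation is the only feature, and a hand-written select-and-mutate loop replaces a full genetic algorithm.

```r
# Illustrative sketch only: a tiny evolutionary search in base R.
# Assumptions: AR(1) instead of MAR, lag-1 autocorrelation as the only
# feature, and a simple select-and-mutate loop instead of a GA library.
set.seed(2024)
target <- 0.6
innov <- rnorm(500)  # fixed innovations, so the feature depends only on phi

# Feature of the series simulated with AR coefficient phi
lag1_acf <- function(phi) {
  y <- stats::filter(innov, filter = phi, method = "recursive")
  acf(y, lag.max = 1, plot = FALSE)$acf[2]
}

# Fitness is the negative distance to the target, so 0 is a perfect match
fitness <- function(phi) -abs(lag1_acf(phi) - target)

population <- runif(20, -0.9, 0.9)
for (generation in 1:30) {
  scores <- vapply(population, fitness, numeric(1))
  # Selection: keep the best half of the population
  parents <- population[order(scores, decreasing = TRUE)][1:10]
  # Mutation: perturb each parent slightly, staying inside the stationary region
  children <- pmin(pmax(parents + rnorm(10, sd = 0.05), -0.95), 0.95)
  population <- c(parents, children)
}

best <- population[which.max(vapply(population, fitness, numeric(1)))]
c(phi = best, feature = lag1_acf(best), target = target)
```

Keeping the parents in each generation means the best fitness never gets worse, so the search moves steadily toward a coefficient whose simulated feature matches the target.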

**Definitions**

Here are the definitions of parameter settings in function generate_ts_with_target():

parameter settings | Definition |
---|---|
n | number of time series to be generated |
ts.length | length of the time series to be generated |
freq | frequency of the time series to be generated |
seasonal | 0 for non-seasonal data, 1 for single-seasonal data, and 2 for multiple seasonal data |
features | a vector of function names |
selected.features | selected features to be controlled |
target | target feature values |
parallel | an optional argument specifying whether the genetic algorithm should be run sequentially or in parallel |

**Example**

Suppose we want to use a MAR model to generate **1 non-seasonal** time series with frequency **1** and length **60**. In particular, we select two features, **entropy** and **trend**, with target values **0.6** and **0.9**, respectively.

By setting the parameter **output_format**, **generate_ts_with_target** can return its output in tsibble format. If the parameter is not set, the output keeps the default list format.

**1. Generate time series with target features with ‘tsibble’ output format**

```
generate_ts_with_target(
n = 1, ts.length = 60, freq = 1, seasonal = 0,
features = c('entropy', 'stl_features'),
selected.features = c('entropy', 'trend'),
target = c(0.6, 0.9),
parallel=FALSE,
output_format = "tsibble"
)
```

`#> GA | iter = 1 | Mean = -23.54402483 | Best = -0.04530665`

```
#> # A tsibble: 60 x 2 [1]
#> index value
#> <dbl> <dbl>
#> 1 1 -0.875
#> 2 2 -0.0970
#> 3 3 -0.485
#> 4 4 -2.50
#> 5 5 -3.00
#> 6 6 -1.88
#> 7 7 -0.995
#> 8 8 0.303
#> 9 9 -0.350
#> 10 10 -0.518
#> # ... with 50 more rows
```

**2. Generate time series with target features with ‘list’ output format**

```
x <- generate_ts_with_target(
  n = 1, ts.length = 60, freq = 1, seasonal = 0,
  features = c('entropy', 'stl_features'),
  selected.features = c('entropy', 'trend'),
  target = c(0.6, 0.9),
  parallel = FALSE,
  output_format = "list"
)
```

`#> GA | iter = 1 | Mean = -16.92329368 | Best = -0.08841775`

*Plot time series*

`autoplot(x)`