---
title: "Rolling forecasts"
description: >
  Evaluate Echo State Network forecasts across multiple rolling windows
  using the M4 monthly data and tsibble functionality.
author: "Alexander Häußer"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Rolling forecasts}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r config, include = FALSE}
# Use the C locale for dates and times so that month names print in English
Sys.setlocale("LC_TIME", "C")

knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.height = 5,
  fig.width = 7
)
```

## Introduction

This vignette demonstrates how to evaluate Echo State Network forecasts across
multiple rolling windows using the tidy interface of `echos`. Rolling forecast
evaluation provides a more robust assessment than a single train-test split
because the model is estimated and evaluated at several forecast origins.

The example uses two monthly time series from `m4_monthly_subset`. Fixed
training windows are created with `slide_tsibble()` from the `tsibble` package.
For every series and split, an ESN is fitted and used to generate an
18-month-ahead forecast.

## Load packages

```{r packages, message = FALSE, warning = FALSE}
library(echos)
library(tidyverse)
library(tsibble)
library(fable)
```

## Prepare the data

The dataset is filtered to the two time series `"M21655"` and `"M2717"`.

```{r data}
selected_series <- c("M21655", "M2717")

data_frame <- m4_monthly_subset %>%
  filter(series %in% selected_series)

data_frame
```

```{r data-plot, fig.alt = "Monthly observations for two M4 time series"}
p <- ggplot()

p <- p + geom_line(
  data = data_frame,
  aes(
    x = index,
    y = value),
  linewidth = 0.5
)

p <- p + facet_wrap(
  vars(series),
  ncol = 1,
  scales = "free_y"
)

p <- p + labs(
  x = "Time",
  y = "Value"
)

p
```

## Define the rolling forecast setup

Each ESN is trained on a fixed window of 180 monthly observations. The forecast
horizon is 18 months, corresponding to the horizon used for monthly series in
the M4 Forecasting Competition. The forecast origin advances by one month
between splits.

Five splits are used in this example. To reduce the vignette runtime, set
`n_splits` to a smaller value such as 3.

```{r setup}
n_train <- 180
n_ahead <- 18
n_step <- 1
n_splits <- 5

n_required <- n_train + n_ahead + (n_splits - 1) * n_step
```
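
As a rough sketch of this setup, the row positions of each training and test window within one series can be tabulated with base R. `split_layout()` and `layout_frame` are hypothetical names introduced for this illustration and are not part of `echos` or `tsibble`; the sketch assumes that positions are counted from the first retained observation.

```{r window-layout}
# Hypothetical helper (not part of echos or tsibble): row positions of the
# training and test windows for each split, counted within one series
split_layout <- function(n_train, n_ahead, n_step, n_splits) {
  split <- seq_len(n_splits)
  train_start <- 1 + (split - 1) * n_step
  train_end <- train_start + n_train - 1

  data.frame(
    split = split,
    train_start = train_start,
    train_end = train_end,
    test_start = train_end + 1,
    test_end = train_end + n_ahead
  )
}

# Same values as in the setup above, passed as literals so the sketch
# runs on its own
layout_frame <- split_layout(
  n_train = 180,
  n_ahead = 18,
  n_step = 1,
  n_splits = 5
)

layout_frame

# The last test window ends at observation 180 + 18 + (5 - 1) * 1
max(layout_frame$test_end)
```

The last test window ends exactly at the number of required observations, so no retained observation is left unused.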

Only the most recent `n_required` observations of each series are retained,
here 180 + 18 + (5 - 1) * 1 = 202. This ensures that both series contribute
the same number of observations and splits.

```{r analysis-data}
analysis_frame <- data_frame %>%
  group_by_key() %>%
  slice_tail(n = n_required) %>%
  ungroup()

analysis_frame
```

## Create rolling training windows

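`slide_tsibble()` creates rolling windows of fixed size, measured in
observations. The new variable `split` is added to the tsibble key and
identifies the individual training windows. Because a window is created for
every possible position (202 - 180 + 1 = 23 per series), only the first
`n_splits` windows are kept. Each of these is followed by a full 18-month
evaluation period.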

```{r rolling-windows}
train_frame <- analysis_frame %>%
  slide_tsibble(
    .size = n_train,
    .step = n_step,
    .id = "split"
  ) %>%
  filter(split <= n_splits)

train_frame
```
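
The windowing behaviour is easier to see on a very small example. The following sketch applies `slide_tsibble()` to a toy monthly series of six observations. The names `toy_frame` and `toy_windows` and the values are made up for illustration.

```{r slide-toy}
library(tsibble)

# Toy monthly series with six observations (values are arbitrary)
toy_frame <- tsibble(
  month = yearmonth("2020 Jan") + 0:5,
  value = c(10, 12, 11, 13, 15, 14),
  index = month
)

# Windows of four observations, advancing by one: 6 - 4 + 1 = 3 windows
toy_windows <- slide_tsibble(
  toy_frame,
  .size = 4,
  .step = 1,
  .id = "split"
)

# Number of observations in each window
table(toy_windows$split)
```

Every window holds exactly `.size` observations, and consecutive windows overlap in all but one observation when `.step = 1`.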

The following summary shows the start and end of every training window.

```{r split-summary}
split_frame <- train_frame %>%
  as_tibble() %>%
  summarise(
    train_start = min(index),
    train_end = max(index),
    n = n(),
    .by = c(series, split)
  )

split_frame
```

## Train the ESN models

The combination of `series` and `split` forms the key of `train_frame`.
Consequently, `model()` estimates one ESN for each series and rolling window.
With two series and five splits, ten ESN models are trained.

Because the ESN reservoir is initialized randomly, a seed is set to make the
results reproducible.

```{r models}
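# Set a seed before fitting so that the randomly initialized reservoirs
# are reproducible
set.seed(42)

model_frame <- train_frame %>%
  model(
    "ESN" = ESN(value)
  )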

model_frame
```
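
The effect of a seed can be illustrated with base R alone. In the sketch below, a small random matrix stands in for reservoir weights. `draw_weights()` is a hypothetical helper, and the way `echos` initializes its reservoir internally is not shown here and may differ.

```{r seed-demo}
# Hypothetical stand-in for a randomly initialized weight matrix
# (not the echos internals)
draw_weights <- function(seed, n = 3) {
  set.seed(seed)
  matrix(runif(n * n, min = -0.5, max = 0.5), nrow = n)
}

# Identical seeds reproduce the same weights, different seeds do not
same_seed <- identical(draw_weights(42), draw_weights(42))
other_seed <- identical(draw_weights(42), draw_weights(43))

same_seed
other_seed
```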

## Generate rolling forecasts

An 18-month-ahead forecast is generated for each fitted ESN.

```{r forecasts}
fable_frame <- model_frame %>%
  forecast(h = n_ahead)

fable_frame
```

## Evaluate forecast accuracy

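The forecasts are evaluated against the corresponding observations in
`analysis_frame`. When `by` is not supplied, `accuracy()` groups the measures
by `.model` and the key of `data`, which is `series` here, so each row pools
the forecast errors of all splits for one series. To obtain one row per series
and split instead, pass `by = c(".model", "series", "split")`.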

```{r accuracy}
accuracy_frame <- fable_frame %>%
  accuracy(data = analysis_frame)

accuracy_frame
```
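
The grouping of the accuracy measures can be set explicitly through the `by` argument of `accuracy()`. The following stand-alone sketch illustrates this under simplifying assumptions: a toy series and `NAIVE()` from `fable` stand in for the M4 data and the ESN, and all `acc_toy_*` names are hypothetical.

```{r accuracy-by-split}
library(dplyr)
library(tsibble)
library(fable)

# Toy monthly series with 30 observations (values are arbitrary)
acc_toy_frame <- tsibble(
  month = yearmonth("2020 Jan") + 0:29,
  value = as.numeric(1:30) + rep(c(0, 2, -1), times = 10),
  index = month
)

# Three training windows of 24 observations each
acc_toy_train <- acc_toy_frame %>%
  slide_tsibble(.size = 24, .step = 1, .id = "split") %>%
  filter(split <= 3)

# NAIVE() is used here only as a fast stand-in for the ESN
acc_toy_fable <- acc_toy_train %>%
  model("NAIVE" = NAIVE(value)) %>%
  forecast(h = 4)

# One row per model and split instead of one pooled row
acc_toy_split <- acc_toy_fable %>%
  accuracy(
    data = acc_toy_frame,
    measures = list(RMSE = RMSE, MAE = MAE),
    by = c(".model", "split")
  )

acc_toy_split
```

Adding the series identifier to `by` gives the same per-split breakdown when the data contain several series.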

## Visualize the rolling forecasts

For each series and split, the point forecasts are compared with the observed
values over the 18-month evaluation period.

```{r forecast-data}
forecast_frame <- fable_frame %>%
  as_tibble() %>%
  transmute(
    series,
    split,
    index,
    FORECAST = .mean
  )

plot_frame <- forecast_frame %>%
  left_join(
    analysis_frame %>%
      as_tibble() %>%
      select(series, index, ACTUAL = value),
    by = c("series", "index")
  ) %>%
  pivot_longer(
    cols = c(ACTUAL, FORECAST),
    names_to = "type",
    values_to = "value"
  )

plot_frame
```

```{r forecast-plot, fig.width = 10, fig.height = 6, fig.alt = "Rolling forecasts and actual values for two M4 monthly time series"}
p <- ggplot()

p <- p + geom_line(
  data = plot_frame,
  aes(
    x = index,
    y = value,
    color = type),
  linewidth = 0.6
)

p <- p + facet_grid(
  rows = vars(series),
  cols = vars(split),
  scales = "free_y"
)

p <- p + scale_color_manual(
  values = c(
    "ACTUAL" = "grey35",
    "FORECAST" = "steelblue"
  )
)

p <- p + labs(
  title = "Rolling forecasts for M4 monthly time series",
  subtitle = "Fixed 180-month training windows and an 18-month forecast horizon",
  x = "Time",
  y = "Value",
  color = NULL
)

p <- p + theme(legend.position = "top")

p
```

## Summary

This example combines the tidy interface of `echos` with rolling-window
functionality from `tsibble`. The workflow consists of four main steps:

1. Create multiple fixed training windows with `slide_tsibble()`.
2. Estimate one ESN for each series and split with `model()`.
3. Generate forecasts with `forecast()`.
4. Evaluate the forecasts across splits with `accuracy()`.

The same workflow can be extended to more series, additional forecast origins,
or alternative models supported by the `fable` framework.