18.1 Conditional expectation
As discussed in Sections 5.3 and 6.3, the conditional h steps ahead expectation is in general available only for the pure additive models. In the cases of pure multiplicative ones, the point forecasts would correspond to conditional expectations only for \(h \leq m\) for seasonal models and \(h=1\) for models with the trend. The one exception is the ETS(M,N,N) model, where the point forecast corresponds to the conditional expectation for any horizon as long as the expectation of the error term \(1+\epsilon_t\) is equal to one. When it comes to the mixed models (Section 7.2), the situation would depend on the specific model, but in general, the same logic as for the pure multiplicative ones applies. In this case, the conditional expectations need to be obtained via other means. This is when the simulations come into play.
18.1.1 Simulating demand trajectories
The general idea of this approach is to use the estimated parameters, last obtained state vector (level, trend, seasonal, ARIMA components etc.) and the estimate of the scale of distribution to generate the possible paths of the data. The simulation itself is done in several steps after obtaining parameters, states and scale:
- Generate \(h\) random variables for the error term, \(\epsilon_t\) or \(1+\epsilon_t\) – depending on the type of error, assumed distribution in the model (the latter was discussed in Sections 5.5 and 6.5) and estimated parameters of distribution (typically, scale);
- Insert the error terms in the state space model (7.1) from Section 7.1, both in the transition and observation parts, do that iteratively from \(j=1\) to \(j=h\): \[\begin{equation*} \begin{aligned} {y}_{t+j} = &w(\mathbf{v}_{t+j-\mathbf{l}}) + r(\mathbf{v}_{t+j-\mathbf{l}}) \epsilon_{t+j} \\ \mathbf{v}_{t+j} = &f(\mathbf{v}_{t+j-\mathbf{l}}) + g(\mathbf{v}_{t+j-\mathbf{l}}) \epsilon_{t+j} \end{aligned}, \end{equation*}\]
- Record actual values for \(j=\{1, \dots, h\}\);
- Repeat (1) - (3) \(n\) times;
- Take expectations for each horizon from 1 to \(h\).
Graphically, the result of this approach is shown in Figure 18.1, where each separate path is shown in grey colour and the expectation is in black. See the R code for this in Subsection 18.1.3.
Remark. In the case of multiplicative trend or multiplicative seasonality, it makes sense to take trimmed mean instead of the basic arithmetic one on the step (5). This is because the models with these components might exhibit explosive behaviour, and thus the expectation might become unrealistic. I suggest using 1% trimming, although this does not have any scientific merit and is only based on my personal expertise.
The simulation-based approach is universal, no matter what model is used, and can be applied to any ETS, ARIMA, regression model or combination (including dynamic ETSX, intermittent demand and multiple frequency models). Furthermore, instead of taking expectations on step 3, one can take geometric means, medians or any desired quantile. This is discussed later in Subsection 18.3.2.
The main issue with this approach is that the conditional expectation and any other statistics calculated based on this will differ with every new simulation run. If \(n\) is small, these values will be less stable (vary more with the new runs). But they will reach some asymptotic values with the increase of \(n\), staying random nonetheless. However, this is a good thing because this randomness reflects the uncertain nature of these statistics in the sample: the true values are never known, and the estimates will inevitably change with the sample size change. Another limitation is the computational time and memory usage: the more iterations we want to produce, the more calculations will need to be done, and more memory will be used. Luckily, time complexity in this situation should be linear: \(O(h \times n)\).
18.1.2 Explanatory variables
If the model contains explanatory variables, then the h steps ahead conditional expectations should use them in the calculation. The main challenge in this situation is that the future values might not be known in some cases. This has been discussed in Section 10.2. Practically speaking, if the user provides the holdout sample values of explanatory variables, the forecast.adam()
method will use them in forecasting. If they are not provided, the function will produce forecasts for each of the explanatory variables via the adam()
function and use the conditional h steps ahead expectations in forecasting.
18.1.3 Demonstration in R
In order to demonstrate how the approach works, we consider an artificial case of ETS(M,M,N) model with \(l_t=1000\), \(b_t=0.95\), \(\alpha=0.1\), \(\beta=0.01\), and Gamma distribution for error term with scale \(s=0.05\). We generate 1000 scenarios from this model for the horizon of \(h=10\) using sim.es()
function from smooth
package:
<- 1000
nsim <- 10
h <- 0.1
s <- c(1000,0.95)
initial <- c(0.1,0.01)
persistence <- sim.es("MMN", obs=h, nsim=nsim, persistence=persistence,
y initial=initial, randomizer="rgamma",
shape=1/s, scale=s)
After running the code above, we will obtain an object y
that will contain several variables, including y$data
with all the 1000 possible future trajectories. We can plot them to get an impression of what we are dealing with (see Figure 18.1):
plot(y$data[,1], ylab="Sales", ylim=range(y$data),
col=rgb(0.8,0.8,0.8,0.4), xlab="Horizon")
# Plot all the generated lines
for(i in 2:nsim){
lines(y$data[,i], col=rgb(0.8,0.8,0.8,0.4))
}# Add conditional mean and quantiles
lines(apply(y$data,1,mean))
lines(apply(y$data,1,quantile,0.025),
col="grey", lwd=2, lty=2)
lines(apply(y$data,1,quantile,0.975),
col="grey", lwd=2, lty=2)
18.1.3 Demonstration in R
In order to demonstrate the simulation approach, we consider an artificial case of ETS(M,M,N) model with \(l_t=1000\), \(b_t=0.95\), \(\alpha=0.1\), \(\beta=0.01\), and Gamma distribution for error term with scale \(s=0.05\). We generate 1000 scenarios from this model for the horizon of \(h=10\) using sim.es()
function from smooth
package:
nsim <- 1000
h <- 10
s <- 0.1
initial <- c(1000,0.95)
persistence <- c(0.1,0.01)
y <- sim.es("MMN", obs=h, nsim=nsim, persistence=persistence,
initial=initial, randomizer="rgamma",
shape=1/s, scale=s)
After running the code above, we will obtain an object y
that will contain several variables, including y$data
with all the 1000 possible future trajectories. We can plot them to get an impression of what we are dealing with (see Figure 18.2):
plot(y$data[,1], ylab="Sales", ylim=range(y$data),
col=rgb(0.8,0.8,0.8,0.4), xlab="Horizon")
# Plot all the generated lines
for(i in 2:nsim){
lines(y$data[,i], col=rgb(0.8,0.8,0.8,0.4))
}
# Add conditional mean and quantiles
lines(apply(y$data,1,mean))
lines(apply(y$data,1,quantile,0.025),
col="grey", lwd=2, lty=2)
lines(apply(y$data,1,quantile,0.975),
col="grey", lwd=2, lty=2)
Based on the plot in Figure 18.2, we can see what the conditional h steps ahead expectation (black line) and what the 95% prediction interval will be for the data based on the ETS(M,M,N) model with the parameters mentioned above.
Similar paths are produced by forecast()
function for adam
class in smooth
. If you need to extract them for further analysis, they are returned in the object scenarios
if the parameters interval="simulated"
and scenarios=TRUE
are set.