## 18.4 Other aspects of forecast uncertainty

There are other elements related to forecasting and taking uncertainty into account that we have not discussed in the previous sections. Here we discuss several special cases where forecasting approaches might differ from the conventional ones.

### 18.4.1 Constructing intervals for intermittent demand model

When constructing prediction intervals for the intermittent state space model (see Section 13), there is an important aspect that should be taken into account. Given that the model consists of two parts, demand sizes and demand occurrence, the prediction interval should reflect the uncertainty from both of them. We should first predict the probability of occurrence of demand for the h steps ahead and then decide what the width of the interval should be based on this probability. For example, if we estimate that the demand will occur with probability \(\hat{p}_{t+h|t} = 0.8\), then we expect to observe zeroes in 20% of the cases. This should reduce the confidence level for the demand sizes. Formally speaking, this comes to the following equation:
\[\begin{equation} F_y(y_{t+h} \leq q) = \hat{p}_{t+h|t} F_z(z_{t+h} \leq q) +(1 -\hat{p}_{t+h|t}), \tag{18.8} \end{equation}\]
where \(F_y(\cdot)\) is the cumulative distribution function of demand, \(F_z(\cdot)\) is the cumulative distribution function of the demand sizes, \(\hat{p}_{t+h|t}\) is the h steps ahead expected probability of occurrence and \(q\) is the quantile of the distribution. In formula (18.8), we know the expected probability and the confidence level \(F_y(y_{t+h} \leq q)\). The unknown element is \(1-\alpha = F_z(z_{t+h} \leq q)\). After regrouping the elements we get:
\[\begin{equation} F_z(z_{t+h} \leq q) = \frac{F_y(y_{t+h} \leq q) -(1 -\hat{p}_{t+h|t})}{\hat{p}_{t+h|t}}. \tag{18.9} \end{equation}\]
For example, if the confidence level is 0.95 and the expected probability of occurrence is 0.8, then \(F_z(z_{t+h} \leq q) = \frac{0.95 -0.2}{0.8} = 0.9375\). Assuming a distribution for the demand sizes, we can use formula (18.3) to construct the parametric prediction interval. Alternatively, we can use any other approach discussed in Section 18.3 to generate the intervals.
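The adjustment of the confidence level in (18.9) can be sketched in base R. This is a minimal illustration, not the implementation used in the `smooth` package: the helper name `adjustLevel()` is hypothetical, and the Gamma distribution with the shape and scale below is only an assumed distribution of demand sizes.

```r
# Hypothetical helper implementing formula (18.9):
# level for demand sizes given the level for demand and
# the probability of occurrence
adjustLevel <- function(level, pOccurrence) {
  levelSizes <- (level - (1 - pOccurrence)) / pOccurrence
  # If the level is below the probability of zeroes,
  # the quantile of demand is zero, so we truncate at 0
  max(levelSizes, 0)
}

levelDemand <- 0.95
pOccurrence <- 0.8
levelSizes <- adjustLevel(levelDemand, pOccurrence)

# Assumed Gamma distribution of demand sizes (illustrative parameters)
upperBound <- qgamma(levelSizes, shape = 2, scale = 3)

# Check via (18.8): the mixture CDF at the bound returns the original level
mixtureCDF <- pOccurrence * pgamma(upperBound, shape = 2, scale = 3) +
  (1 - pOccurrence)
```

With the values from the text, `levelSizes` reproduces the 0.9375 obtained above, and `mixtureCDF` recovers the original level of 0.95.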

### 18.4.2 One-sided prediction interval

In some cases, we might not need both bounds of the interval. For example, when we deal with intermittent demand, we know that the lower bound will be equal to zero in many cases. In fact, this will always happen when the significance level is lower than the probability of non-occurrence \(1-\hat{p}_{t+h|t}\): the quantile will be equal to zero because the probability of having zeroes is higher than the significance level. Another example is the safety stock calculation: we only need the upper bound of the interval, and we need to make sure that a specific proportion of demand is satisfied (e.g. 95% of it). In these cases, we can focus on the particular bound of the interval and drop the other one. Statistically speaking, this means that we cut only one tail of the assumed distribution. This has its implications and issues in several scenarios:

- When we are interested in the **upper bound** only and deal with a **positive distribution** of demand (for example, Gamma, Log-Normal or Inverse Gaussian), we know that the demand will always lie between zero and the constructed bound. In cases of low volume (or even intermittent) data, this makes sense because the original data might contain zeroes or have values close to it. The upper bound in this case will be lower than in the case of the two-sided prediction interval because we would not be splitting the probability into two parts (for the left and the right tails);
- The combination of the **lower bound** and a **positive distribution** implies that the demand will be greater than the specified bound in the pre-selected number of cases (defined by the confidence level). There is no natural bound from above, so from a theoretical point of view, this implies that the demand can be infinite;
- **Upper** or **lower** bound with a **real-valued distribution** (such as Normal, Laplace, S or Generalised Normal) implies that the demand is either below or above the specified level, respectively, without any natural limit on the other side. If the Normal distribution is used on positive low volume data, there is a natural lower bound, but the model itself will not be aware of it and will not restrict the space with the specific value.
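The first point above can be illustrated with a short base R sketch. The Gamma distribution and its parameters are assumptions chosen purely for illustration; any positive distribution would show the same effect.

```r
level <- 0.95
# Assumed parameters of a Gamma distribution of demand (illustrative only)
shape <- 2
scale <- 5

# One-sided interval: the whole 5% goes to the right tail
upperOneSided <- qgamma(level, shape = shape, scale = scale)

# Two-sided interval: 2.5% in each tail
lowerTwoSided <- qgamma((1 - level) / 2, shape = shape, scale = scale)
upperTwoSided <- qgamma(1 - (1 - level) / 2, shape = shape, scale = scale)

# Compare the upper bounds of the two intervals
c(oneSided = upperOneSided, twoSided = upperTwoSided)
```

For the same confidence level, the one-sided upper bound is lower than the two-sided one, because the probability is not split between the two tails.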

### 18.4.3 Cumulative over the horizon forecast

Another related thing to consider when producing forecasts in practice is that the point forecast is not needed in some contexts. Instead, the cumulative over the forecast horizon (or over the lead time) might be more suitable. The classic example is the safety stock calculation based on the lead time (time between the order of product and its delivery). In this situation, we need to make sure that while the product is being delivered, we do not run out of stock, thus still satisfying the selected level of demand (e.g. 95%), but now over the whole period of time rather than on every separate observation.

In the case of **pure additive ADAM**, there are analytical formulae for the conditional expectations and conditional variance for this case that can be used in forecasting. These formulae come directly from the recursive relation (5.9) (for derivations in a simpler case, see for example, Hyndman et al. (2008) and Svetunkov and Petropoulos (2018)):
\[\begin{equation}
\begin{aligned}
\mu_{Y,t,h} = \text{E}(Y_{c,t,h}|t) = & \sum_{j=1}^h \sum_{i=1}^d \left(\mathbf{w}_{m_i}' \mathbf{F}_{m_i}^{\lceil\frac{j}{m_i}\rceil-1} \right) \mathbf{v}_{t} \\
\sigma^2_{Y,h} = \text{V}(Y_{c,t,h}|t) = & \left(1 + \sum_{k=1}^{h-1} \left(1+ (h-k) \sum_{i=1}^d \left(\mathbf{w}_{m_i}' \sum_{j=1}^{\lceil\frac{k}{m_i}\rceil-1} \mathbf{F}_{m_i}^{j-1} \mathbf{g}_{m_i} \mathbf{g}'_{m_i} (\mathbf{F}_{m_i}')^{j-1} \mathbf{w}_{m_i} \right) \right) \right) \sigma^2
\end{aligned},
\tag{18.10}\end{equation}\]
where \(Y_{c,t,h}=\sum_{j=1}^h y_{t+j}\) is the cumulative actual value and all the other variables have been defined in Section 5.2. Based on this expectation and variance, we can construct the prediction interval as discussed in Section 18.3.

In cases of **multiplicative and mixed ADAM**, there are no closed forms for the conditional expectation and variance. As a result, simulations similar to the one discussed in Section 18.1 are needed to produce all possible paths for the next \(h\) steps ahead. The main difference would be that before taking the expectation or quantiles, the paths would need to be aggregated over the forecast horizon \(h\). This approach, together with the idea of a one-sided prediction interval, can be directly used to calculate the safety stock over the lead time.
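The simulation-based approach can be sketched in base R for the simplest multiplicative case. This is a minimal illustration under stated assumptions, not the code used by `adam()`: it assumes an ETS(M,N,N) model with a known last level, smoothing parameter and Log-Normal error term, and all the numeric values are hypothetical.

```r
set.seed(41)
nsim <- 10000  # number of simulated paths
h <- 7         # lead time
# Assumed values of the ETS(M,N,N) model (illustrative only)
alpha <- 0.1
levelLast <- 100
sigma <- 0.05

# 1 + epsilon follows Log-Normal distribution with expectation of one
errors <- matrix(rlnorm(nsim * h, -sigma^2 / 2, sigma), nsim, h)

paths <- matrix(NA, nsim, h)
levelNow <- rep(levelLast, nsim)
for (j in 1:h) {
  # Measurement equation: y_t = l_{t-1} (1 + epsilon_t)
  paths[, j] <- levelNow * errors[, j]
  # Transition equation: l_t = l_{t-1} (1 + alpha epsilon_t)
  levelNow <- levelNow * (1 + alpha * (errors[, j] - 1))
}

# Aggregate each path over the horizon before taking any statistics
cumulativePaths <- rowSums(paths)
cumulativeMean <- mean(cumulativePaths)
# One-sided upper bound for the 95% level
cumulativeUpper <- quantile(cumulativePaths, 0.95)
```

The important step is `rowSums(paths)`: the expectation and the quantile are taken from the aggregated paths, not from the individual steps ahead.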

### 18.4.4 Example in R

For demonstration purposes, we consider an artificial intermittent demand example, similar to the one from Section 13.4:

```
# Artificial intermittent demand with increasing intensity
y <- ts(c(rpois(20,0.25), rpois(20,0.5), rpois(20,1),
          rpois(20,2), rpois(20,3)))
```

For simplicity, we apply iETS(M,Md,N) model with odds ratio occurrence:

```
# adam() comes from the smooth package
library(smooth)
adamiETSy <- adam(y, "MMdN", occurrence="odds-ratio",
                  h=7, holdout=TRUE)
plot(adamiETSy,7)
```

To make this setting closer to a possible real-life situation, we assume that the lead time is seven days, and we need to satisfy the 99% of demand for the last seven observations based on our model. Thus we produce the upper bound for the cumulative values for the confidence level of 99%:

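```
# Upper bound for the cumulative demand over the lead time;
# level=0.99 matches the 99% stated in the text
adamiETSyForecast <- forecast(adamiETSy, h=7,
                              cumulative=TRUE,
                              interval="prediction",
                              level=0.99,
                              side="upper")
```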

Given that we deal with cumulative values, the basic plot will not be helpful, so we should produce a different one (see Figure 18.8):

```
# Cumulative demand in the holdout
plot(sum(adamiETSy$holdout), ylab="Cumulative demand",
     xlab="", xaxt="n", pch=16,
     ylim=range(c(0, sum(adamiETSy$holdout),
                  adamiETSyForecast$upper)))
# Cumulative point forecast
abline(h=adamiETSyForecast$mean, col="blue",
       lwd=2)
# Upper bound of the interval
abline(h=adamiETSyForecast$upper, col="grey",
       lwd=2, lty=2)
```

What Figure 18.8 demonstrates is that for the holdout period of 7 days, the cumulative demand was around 26 units, while the upper bound of the interval was approximately 24. Based on that upper bound, we could place an order (based on what we already have in stock) and have an appropriate safety stock.

To see if the approach is suitable, we would need to apply it in either a rolling origin fashion (Section 2.4) or to a set of products to collect the distribution of related error measures.

### 18.4.5 Confidence interval

Finally, we can construct a confidence interval for some statistics. As discussed in Section 5.2 of Svetunkov (2022a), it can be built for the mean, a parameter, fitted values, etc. In our context, we might be interested in the confidence interval for the conditional h steps ahead expectation. This implies that we are interested in the uncertainty of the line, not of the actual values. Such an interval can only be constructed for a model that takes the uncertainty of parameters into account (as discussed in Section 16). The construction of the confidence interval in this case relies on the Normal distribution, based on the Central Limit Theorem, as long as the basic assumptions for the model and the CLT are satisfied (see Section 4.2 and Chapter 12 of Svetunkov, 2022a). Technically speaking, the construction of the confidence interval comes down to capturing the model uncertainty discussed in Chapter 16.
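The difference between the uncertainty of the line and the uncertainty of the actual values can be illustrated with a simple analogy in base R. The sketch below uses a linear regression on artificial data rather than ADAM, so it only demonstrates the general idea; the data generating process and all variable names are assumptions made for this illustration.

```r
set.seed(42)
# Artificial data with a linear trend (illustrative only)
x <- 1:100
yLinear <- 10 + 0.5 * x + rnorm(100, 0, 5)
fit <- lm(yLinear ~ x)
newData <- data.frame(x = 101:110)

# Confidence interval: uncertainty of the conditional expectation
confInt <- predict(fit, newData, interval = "confidence", level = 0.95)
# Prediction interval: uncertainty of the actual values
predInt <- predict(fit, newData, interval = "prediction", level = 0.95)

confWidth <- confInt[, "upr"] - confInt[, "lwr"]
predWidth <- predInt[, "upr"] - predInt[, "lwr"]
```

Both intervals are centred around the same point forecast, but the confidence interval is narrower because it reflects only the uncertainty of the estimated parameters.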

#### 18.4.5.1 Example in R

The only way to construct the confidence interval for ADAM is via the `reforecast()` function. Consider the example with ADAM ETS(A,Ad,N) on `BJsales` data as in Section 18.3.8:

`adamETSBJ <- adam(BJsales, h=10, holdout=TRUE)`

The confidence interval for this model can be produced either directly via `reforecast()` or via `forecast()`, which will call it for you:

```
# Confidence interval for the conditional expectation
adamETSBJConfidence <- forecast(adamETSBJ, h=10,
                                interval="confidence", nsim=1000)
plot(adamETSBJConfidence)
```

Note that I have increased the number of iterations for the simulation to get a more accurate confidence interval around the conditional expectation. This will consume more memory, as the operation involves creating 1000 sample paths for the fitted values and another 1000 for the holdout sample forecasts.

Figure 18.9 shows the uncertainty around the point forecast based on the uncertainty of parameters of the model. As can be seen, the interval is narrow, demonstrating that the conditional expectation would not change much if the model's parameters varied slightly. The fact that the actual values are systematically above the forecast does not mean anything because the confidence interval does not consider the uncertainty of actual values.