
## 6.5 Distributional assumptions in pure additive ETS

While the conventional ETS assumes that the error term follows the Normal distribution, ADAM ETS offers more flexibility, implementing the following options for the error term distribution in the additive error models:

1. Normal: $$\epsilon_t \sim \mathcal{N}(0, \sigma^2)$$;
2. Laplace: $$\epsilon_t \sim \mathcal{Laplace}(0, s)$$;
3. S: $$\epsilon_t \sim \mathcal{S}(0, s)$$;
4. Generalised Normal: $$\epsilon_t \sim \mathcal{GN}(0, s, \beta)$$.

The conditional expectation and the stability / forecastability conditions do not change for the model under these assumptions. What changes is the scale of the distribution and, consequently, the width of the prediction intervals. Given that the scale of each of these distributions is a known function of the variance (e.g. $$\sigma^2 = 2 s^2$$ for Laplace), one can calculate the conditional variance as discussed earlier and then use the formulae from the theory of distributions section to obtain the respective scale. With the scale at hand, calculating the quantiles needed for the prediction intervals is straightforward.
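The route from the conditional variance to the interval bounds can be sketched in a few lines of Python. This is only an illustration, not the ADAM implementation: the function names and the input values are hypothetical, only the Normal and Laplace cases are covered, and the sketch assumes that the conditional distribution belongs to the same family as the error term.

```python
import math
from statistics import NormalDist


def normal_interval(mean, variance, level=0.95):
    """Prediction interval when the error term is Normal(0, variance)."""
    sigma = math.sqrt(variance)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return mean - z * sigma, mean + z * sigma


def laplace_interval(mean, variance, level=0.95):
    """Prediction interval when the error term is Laplace(0, s)."""
    # For Laplace, variance = 2 * s^2, so the scale is s = sqrt(variance / 2).
    s = math.sqrt(variance / 2)
    # Laplace quantile for p > 0.5: mean - s * log(2 * (1 - p)).
    p = 0.5 + level / 2
    half_width = -s * math.log(2 * (1 - p))
    return mean - half_width, mean + half_width


# Hypothetical inputs: point forecast of 100 with conditional variance of 25.
print(normal_interval(100, 25))
print(laplace_interval(100, 25))
```

Both functions take the same conditional mean and variance, so any difference between the intervals comes purely from the distributional assumption.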

The estimation of pure additive ETS models can be done via the maximisation of the likelihood of the assumed distribution, which in some cases coincides with the minimisation of popular losses (e.g. Normal and MSE, or Laplace and MAE).
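As a brief sketch of why the likelihood and the loss coincide in these two cases, consider the log-likelihoods concentrated with respect to the scale. The notation is an assumption made for this illustration: $$e_t$$ denotes the in-sample one-step-ahead residual and $$T$$ the number of observations. For the Normal distribution:

$$\begin{aligned} \ell(\sigma^2) &= -\frac{T}{2} \log\left(2 \pi \sigma^2\right) - \frac{1}{2 \sigma^2} \sum_{t=1}^T e_t^2 , \\ \hat{\sigma}^2 &= \frac{1}{T} \sum_{t=1}^T e_t^2 = \text{MSE} , \\ \ell(\hat{\sigma}^2) &= -\frac{T}{2} \left( \log\left(2 \pi \, \text{MSE}\right) + 1 \right) , \end{aligned}$$

and for the Laplace distribution:

$$\begin{aligned} \ell(s) &= -T \log(2 s) - \frac{1}{s} \sum_{t=1}^T |e_t| , \\ \hat{s} &= \frac{1}{T} \sum_{t=1}^T |e_t| = \text{MAE} , \\ \ell(\hat{s}) &= -T \left( \log\left(2 \, \text{MAE}\right) + 1 \right) . \end{aligned}$$

In both cases the concentrated log-likelihood is a decreasing function of the respective loss, so the parameters that maximise the likelihood are the same ones that minimise the loss.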

In addition, the following more exotic options for the additive error models are available in ADAM ETS:

1. Log Normal: $$\left(1+\frac{\epsilon_t}{\mu_{y,t}} \right) \sim \text{log}\mathcal{N}\left(-\frac{\sigma^2}{2}, \sigma^2\right)$$. Here $$\mu_{y,t} = \mathbf{w}' \mathbf{v}_{t-\mathbf{l}}$$, $$\sigma^2$$ is the variance of the error term in logarithms and the $$-\frac{\sigma^2}{2}$$ appears due to the restriction $$\text{E}(\epsilon_t)=0$$.
2. Inverse Gaussian: $$\left(1+\frac{\epsilon_t}{\mu_{y,t}} \right) \sim \mathcal{IG}(1, s)$$;
3. Gamma: $$\left(1+\frac{\epsilon_t}{\mu_{y,t}} \right) \sim \mathcal{\Gamma}(s^{-1}, s)$$.

The possibility of using these distributions arises from a reformulation of the original pure additive model (6.2) into:

$$\begin{aligned} {y}_{t} &= \mathbf{w}' \mathbf{v}_{t-\mathbf{l}}\left(1 + \frac{\epsilon_t}{\mathbf{w}' \mathbf{v}_{t-\mathbf{l}}}\right) \\ \mathbf{v}_{t} &= \mathbf{F} \mathbf{v}_{t-\mathbf{l}} + \mathbf{g} \epsilon_t \end{aligned} . \tag{6.19}$$

The connection between the two formulations becomes apparent after expanding the brackets in the measurement equation of (6.19), which gives back $$y_t = \mathbf{w}' \mathbf{v}_{t-\mathbf{l}} + \epsilon_t$$. Note that in this case the model assumes that the data is strictly positive. While it might be possible to fit the model to data with negative values, the calculation of the scale and the likelihood might become impossible. Using alternative losses (e.g. MSE) is a possible solution in this case.
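The positivity restriction can be illustrated with a small Python sketch for the Log Normal case. It is not the ADAM implementation: the function name `lognormal_loglik` and the data are hypothetical, and the fitted values $$\mu_{y,t}$$ are assumed to be given. The sketch treats any non-positive actual or fitted value as lying outside the support of the distribution.

```python
import math


def lognormal_loglik(actuals, fitted, sigma2):
    """Log-likelihood of y_t = mu_t * x_t, where x_t ~ logN(-sigma2/2, sigma2)."""
    loglik = 0.0
    for y, mu in zip(actuals, fitted):
        if mu <= 0 or y <= 0:
            # Outside the support of the Log Normal distribution:
            # the density is zero, so the log-likelihood is minus infinity.
            return -math.inf
        ratio = y / mu  # equals 1 + e_t / mu_t, because e_t = y_t - mu_t
        loglik += (
            -math.log(y)
            - 0.5 * math.log(2 * math.pi * sigma2)
            - (math.log(ratio) + sigma2 / 2) ** 2 / (2 * sigma2)
        )
    return loglik


# Hypothetical fitted values and two sets of actuals.
fitted = [10.0, 12.0, 11.0]
print(lognormal_loglik([11.0, 11.5, 12.0], fitted, sigma2=0.01))
print(lognormal_loglik([11.0, -0.5, 12.0], fitted, sigma2=0.01))
```

The first call returns a finite value, while a single negative observation in the second call makes the likelihood unusable for estimation. A loss such as MSE, which only needs the residuals, is still computable in that situation.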