# Chapter 11 Estimation of ADAM models

Now that we have discussed the properties of ETS, ARIMA, ETSX and ARIMAX models, we need to understand how to estimate them. As mentioned earlier, when we apply a model to the data, we assume that it is suitable and see how it fits the data and produces forecasts. In this case, all the parameters in the model are substituted by their estimates (obtained in sample), and the error term becomes an estimate of the true one. In general, this means that the state space model (7.1) is substituted by:

$$
\begin{aligned}
{y}_{t} = &w(\hat{\mathbf{v}}_{t-\boldsymbol{l}}) + r(\hat{\mathbf{v}}_{t-\boldsymbol{l}}) e_t \\
\hat{\mathbf{v}}_{t} = &f(\hat{\mathbf{v}}_{t-\boldsymbol{l}}) + \hat{g}(\hat{\mathbf{v}}_{t-\boldsymbol{l}}) e_t
\end{aligned}, \tag{11.1}
$$

implying that the initial values of components and the smoothing parameters of the model are estimated. An example is the ETS(A,A,A) model applied to the data:

$$
\begin{aligned}
y_{t} = & \hat{l}_{t-1} + \hat{b}_{t-1} + \hat{s}_{t-m} + e_t \\
\hat{l}_t = & \hat{l}_{t-1} + \hat{b}_{t-1} + \hat{\alpha} e_t \\
\hat{b}_t = & \hat{b}_{t-1} + \hat{\beta} e_t \\
\hat{s}_t = & \hat{s}_{t-m} + \hat{\gamma} e_t
\end{aligned}, \tag{11.2}
$$

where the initial values $$\hat{l}_0, \hat{b}_0$$ and $$\hat{s}_{-m+2}, \dots, \hat{s}_0$$ are estimated and then influence all the future values of components via the recursion (11.2), and $$e_t = y_t -\hat{y}_t$$ is the one-step-ahead in-sample forecast error, also known in statistics as the residual of the model.
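To see how the initial values and smoothing parameters propagate through the recursion (11.2), here is a minimal Python sketch of the ETS(A,A,A) filter. The function name `ets_aaa_filter` and its arguments are hypothetical, chosen only for illustration. The sketch also assumes that all $$m$$ initial seasonal values are supplied directly, without any normalisation of the seasonal component.

```python
def ets_aaa_filter(y, m, level0, trend0, seasonal0, alpha, beta, gamma):
    """Run the ETS(A,A,A) recursion (11.2) over the series y.

    Hypothetical helper for illustration only. seasonal0 holds m initial
    seasonal values, ordered so that seasonal0[0] is the one used for the
    first observation (an assumption of this sketch).
    Returns the one-step-ahead fitted values and the residuals e_t.
    """
    if len(seasonal0) != m:
        raise ValueError("seasonal0 must contain exactly m values")

    level, trend = level0, trend0
    seasonal = list(seasonal0)  # copy so that the caller's list is untouched
    fitted, residuals = [], []

    for t, observation in enumerate(y):
        s_lagged = seasonal[t % m]             # seasonal value from m steps back
        y_hat = level + trend + s_lagged       # one-step-ahead forecast
        e = observation - y_hat                # residual e_t = y_t - y_hat_t

        # Both updates use the previous level and trend, as in (11.2)
        level, trend = level + trend + alpha * e, trend + beta * e
        seasonal[t % m] = s_lagged + gamma * e

        fitted.append(y_hat)
        residuals.append(e)

    return fitted, residuals


# A series generated by the model itself with zero errors (m = 2)
y_example = [13.0, 13.0, 17.0, 17.0]
fitted_example, residuals_example = ets_aaa_filter(
    y_example, m=2, level0=10.0, trend0=2.0, seasonal0=[1.0, -1.0],
    alpha=0.3, beta=0.1, gamma=0.2,
)
```

Changing any of the initial values in this call changes every subsequent fitted value, which is why they need to be estimated together with the smoothing parameters.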

The estimation does not happen on its own: it involves minimising or maximising a pre-selected loss function by changing the values of the parameters. Typically, there is no analytical solution for the parameters of ADAM because of the recursive nature of the model, so numerical optimisation is used to obtain the estimates. The results of the estimation will differ depending on:

1. the assumed distribution,
2. the loss function used,
3. the initial values of parameters that are fed to the optimiser,
4. the parameters of the optimiser (such as sensitivity, number of iterations, etc.),
5. the sample size,
6. the number of parameters to estimate, and
7. the restrictions imposed on the parameters.

The aspects above are covered in this chapter.
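As a rough illustration of how the loss function and the optimiser settings enter the estimation, the sketch below fits the smoothing parameter of a much simpler model, ETS(A,N,N), numerically. Everything here is an assumption made for the example: the function names are hypothetical, the initial level is fixed rather than estimated, the smoothing parameter is restricted to $$[0, 1]$$, and a golden-section search stands in for whatever optimiser a real implementation would use.

```python
import math


def ses_residuals(y, alpha, level0):
    """One-step-ahead residuals of ETS(A,N,N) for a given alpha and initial level."""
    level = level0
    residuals = []
    for observation in y:
        e = observation - level      # the forecast is the previous level
        level = level + alpha * e    # level update
        residuals.append(e)
    return residuals


def mse_loss(y, alpha, level0):
    """Mean squared one-step-ahead error, used here as the loss function."""
    residuals = ses_residuals(y, alpha, level0)
    return sum(e * e for e in residuals) / len(residuals)


def golden_section_minimise(loss, lower, upper, tol=1e-6, max_iter=200):
    """Minimise a one-dimensional loss on [lower, upper] by golden-section search.

    tol and max_iter play the role of the optimiser's sensitivity and
    iteration limit. The search assumes the loss has a single minimum
    on the interval.
    """
    inv_phi = (math.sqrt(5) - 1) / 2
    a, b = lower, upper
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    fc, fd = loss(c), loss(d)
    for _ in range(max_iter):
        if b - a < tol:
            break
        if fc < fd:
            # The minimum lies in [a, d]: shrink from the right
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = loss(c)
        else:
            # The minimum lies in [c, b]: shrink from the left
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = loss(d)
    return (a + b) / 2


# Toy series with a level shift; the initial level is fixed at 10 (an assumption)
y_toy = [10.0, 10.0, 10.0, 10.0, 20.0, 20.0, 20.0, 20.0]


def toy_loss(alpha):
    return mse_loss(y_toy, alpha, level0=10.0)


alpha_hat = golden_section_minimise(toy_loss, 0.0, 1.0)
# The same problem with only five iterations allowed gives a cruder estimate
alpha_rough = golden_section_minimise(toy_loss, 0.0, 1.0, max_iter=5)
```

On this toy series the loss keeps decreasing as the smoothing parameter grows, so the estimate ends up at the upper bound of the restriction, and stopping the optimiser early yields a noticeably different value. Swapping `mse_loss` for another loss, changing `level0`, or altering the bounds would similarly change the result.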