10.1 ADAMX: Model formulation

As discussed previously, there are two types of errors in ADAM:

  1. Additive, discussed in Chapter 5 in the case of ETS and in Chapter 9 for ARIMA (see also Hyndman et al., 2008);
  2. Multiplicative, covered in Chapter 6 for ETS and in Subsection 9.1.4 for ARIMA.

The inclusion of explanatory variables in ADAMX is determined by the error type, so that in the case of (1) the measurement equation of the model is: \[\begin{equation} {y}_{t} = a_{0,t} + a_{1,t} x_{1,t} + a_{2,t} x_{2,t} + \dots + a_{n,t} x_{n,t} + \epsilon_t , \tag{10.1} \end{equation}\] where \(a_{0,t}\) is the point value based on all ETS components (for example, \(a_{0,t}=l_{t-1}\) in the case of ETS(A,N,N)), \(x_{i,t}\) is the \(i\)-th explanatory variable, \(a_{i,t}\) is its parameter, and \(n\) is the number of explanatory variables. We will denote the estimated parameters of such a model by \(\hat{a}_{i,t}\). In the simplest case, the transition equation for such a model implies that the parameters \(a_{i,t}\) do not change over time: \[\begin{equation} a_{i,t} = a_{i,t-1} \text{ for all } i = 1, \dots, n . \tag{10.2} \end{equation}\] More complex mechanisms for the update of states can be proposed instead of (10.2), but we do not discuss them at this point. Typically, the initial values of the parameters would be estimated at the optimisation stage, based either on the likelihood or on some other loss function, so the index \(t\) can be dropped, setting \(a_{i,t}=a_{i}\) for all \(i=1,\dots,n\).
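To make this concrete, here is a minimal simulation sketch of the additive ADAMX in (10.1) and (10.2) for the ETSX(A,N,N) case, written in Python with NumPy. The parameter values (\(\alpha\), the initial level, and the regression coefficients) are assumed purely for illustration; this is not the implementation of any package, just the recursions written out explicitly:

```python
import numpy as np

rng = np.random.default_rng(42)

# Simulate data from an ETSX(A,N,N) model, eq. (10.1)-(10.2):
# y_t = l_{t-1} + a_1 x_{1,t} + a_2 x_{2,t} + eps_t,
# l_t = l_{t-1} + alpha * eps_t,  a_{i,t} = a_{i,t-1}.
T = 200
X = rng.normal(size=(T, 2))          # two explanatory variables
a = np.array([1.5, -0.7])            # regression parameters (assumed values)
alpha, l0, sigma = 0.3, 10.0, 1.0    # smoothing parameter, initial level, error sd

y = np.empty(T)
l = l0
for t in range(T):
    eps = rng.normal(scale=sigma)
    y[t] = l + X[t] @ a + eps        # measurement equation (10.1)
    l = l + alpha * eps              # transition equation for the level

# Given known parameters, the same recursion produces the one-step-ahead
# fitted values and residuals:
fitted = np.empty(T)
l = l0
for t in range(T):
    fitted[t] = l + X[t] @ a
    e = y[t] - fitted[t]
    l = l + alpha * e                # a_{i,t} = a_{i,t-1}: nothing to update for a
print("residual sd:", np.std(y - fitted).round(3))
```

Note how the transition equation (10.2) manifests itself in the code: the vector `a` is never updated inside the loop, while the level is.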

The multiplicative error model should be formulated differently. The most straightforward way is to formulate the model in logarithms in order to linearise it: \[\begin{equation} \log {y}_{t} = \log a_{0,t} + a_{1,t} x_{1,t} + a_{2,t} x_{2,t} + \dots + a_{n,t} x_{n,t} + \log(1+ \epsilon_t). \tag{10.3} \end{equation}\]

Remark. If a log-log model is required, all that needs to be done is to substitute \(x_{i,t}\) with \(\log x_{i,t}\).
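The effect of the log-linearisation in (10.3) can be checked numerically. The sketch below assumes a multiplicative error model with a constant level (no state updates, to keep the example short) and assumed parameter values, and shows that a linear fit on the log-transformed data recovers the parameter of the explanatory variable:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustration of eq. (10.3) with a constant level l_0:
# y_t = l_0 * exp(a_1 x_{1,t}) * (1 + eps_t) becomes linear in logs:
# log y_t = log l_0 + a_1 x_{1,t} + log(1 + eps_t).
T = 100
x = rng.normal(size=T)
l0, a1 = 50.0, 0.4                    # assumed level and parameter values
eps = rng.normal(scale=0.05, size=T)  # multiplicative error, small relative noise

y = l0 * np.exp(a1 * x) * (1 + eps)

# In logs the model is additive, so a simple linear fit recovers a1:
coef = np.polyfit(x, np.log(y), 1)
print("estimated a1:", coef[0].round(3))   # close to 0.4
```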

The compact form of the ADAMX model implies that the explanatory variables \(x_{i,t}\) are included in the measurement vector \(\mathbf{w}_{t}\), making it change over time. The parameters are then moved to the state vector, and a diagonal matrix is added to the existing transition matrix. Finally, the persistence vector contains zeroes for the parameters of the explanatory variables. The state space model in that case can be represented as: \[\begin{equation} \begin{aligned} & {y}_{t} = \mathbf{w}'_t \mathbf{v}_{t-\mathbf{l}} + \epsilon_t \\ & \mathbf{v}_t = \mathbf{F} \mathbf{v}_{t-\mathbf{l}} + \mathbf{g} \epsilon_t \end{aligned} \tag{10.4} \end{equation}\] for the pure additive model and \[\begin{equation} \begin{aligned} {y}_{t} = & \exp\left(\mathbf{w}'_t \log \mathbf{v}_{t-\mathbf{l}} + \log(1 + \epsilon_t)\right) \\ \log \mathbf{v}_t = & \mathbf{F} \log \mathbf{v}_{t-\mathbf{l}} + \log(\mathbf{1}_k + \mathbf{g} \epsilon_t) \end{aligned} \tag{10.5} \end{equation}\] for the pure multiplicative one. So, the only thing that changes in these models is the time-varying measurement vector \(\mathbf{w}'_t\) instead of the fixed one. For example, in the case of ETSX(A,Ad,A) we will have: \[\begin{equation} \begin{aligned} \mathbf{F} = \begin{pmatrix} 1 & \phi & 0 & 0 & \dots & 0 \\ 0 & \phi & 0 & 0 & \dots & 0 \\ 0 & 0 & 1 & 0 & \dots & 0 \\ 0 & 0 & 0 & 1 & \dots & 0 \\ \vdots & \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & 0 & \dots & 1 \end{pmatrix}, & \mathbf{w}_t = \begin{pmatrix} 1 \\ \phi \\ 1 \\ x_{1,t} \\ \vdots \\x_{n,t} \end{pmatrix}, & \mathbf{g} = \begin{pmatrix} \alpha \\ \beta \\ \gamma \\ 0 \\ \vdots \\ 0 \end{pmatrix}, \\ & \mathbf{v}_{t} = \begin{pmatrix} l_t \\ b_t \\ s_t \\ a_{1,t} \\ \vdots \\ a_{n,t} \end{pmatrix}, & \mathbf{l} = \begin{pmatrix} 1 \\ 1 \\ m \\ 1 \\ \vdots \\ 1 \end{pmatrix} \end{aligned}, \tag{10.6} \end{equation}\] which is equivalent to the combination of equations (10.1) and (10.2): \[\begin{equation} \begin{aligned} & y_{t} = l_{t-1} + \phi b_{t-1} + s_{t-m} + a_{1,t} x_{1,t} + \dots + a_{n,t} x_{n,t} + \epsilon_t \\ & l_t = l_{t-1} + \phi b_{t-1} + \alpha \epsilon_t \\ & b_t = \phi b_{t-1} + \beta \epsilon_t \\ & s_t = s_{t-m} + \gamma \epsilon_t \\ & a_{1,t} = a_{1,t-1} \\ & \vdots \\ & a_{n,t} = a_{n,t-1} \end{aligned}. \tag{10.7} \end{equation}\] Alternatively, the state, measurement, and persistence vectors and the transition matrix can each be split into two parts, separating the ETS and X parts in the state space equations: \[\begin{equation} \begin{aligned} & {y}_{t} = \mathbf{w}' \mathbf{v}_{t-\mathbf{l}} + \mathbf{x}'_{t} \mathbf{a}_{t-1} + \epsilon_t \\ & \mathbf{v}_{t} = \mathbf{F} \mathbf{v}_{t-\mathbf{l}} + \mathbf{g} \epsilon_t \\ & \mathbf{a}_{t} = \mathbf{a}_{t-1} \end{aligned} , \tag{10.8} \end{equation}\] where \(\mathbf{w}\), \(\mathbf{F}\), \(\mathbf{g}\), and \(\mathbf{v}_{t}\) contain the elements of the conventional components of ADAM, and \(\mathbf{a}_{t}\) is the vector of parameters for the explanatory variables.
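The following sketch writes out the system (10.6) and (10.7) for a hypothetical ETSX(A,Ad,A) with quarterly seasonality (\(m=4\)) and \(n=2\) explanatory variables. All parameter values are assumed for illustration, and the lagged seasonal state \(s_{t-m}\) is handled via a circular buffer rather than carrying the full lag vector \(\mathbf{l}\):

```python
import numpy as np

# A minimal sketch of the ETSX(A,Ad,A) recursions in eq. (10.7), with m=4
# (quarterly seasonality), n=2 explanatory variables and assumed parameter
# values. The m most recent seasonal states are kept in a circular buffer,
# so s[t % m] plays the role of s_{t-m}.
rng = np.random.default_rng(1)
m, n, T = 4, 2, 60
alpha, beta, gamma, phi = 0.2, 0.05, 0.1, 0.95

l, b = 100.0, 1.0                       # initial level and trend
s = np.array([5.0, -2.0, -4.0, 1.0])    # initial seasonal indices, sum to zero
a = np.array([1.5, -0.7])               # regression parameters (never updated)
X = rng.normal(size=(T, n))

y = np.empty(T)
for t in range(T):
    eps = rng.normal(scale=1.0)
    s_tm = s[t % m]                      # s_{t-m}
    # measurement equation, first line of (10.7):
    y[t] = l + phi * b + s_tm + X[t] @ a + eps
    # transition equations, remaining lines of (10.7):
    l = l + phi * b + alpha * eps
    b = phi * b + beta * eps
    s[t % m] = s_tm + gamma * eps        # seasonal update with lag m
    # a_{i,t} = a_{i,t-1}: the persistence for the X part is zero
print("first two years of simulated data:", y[:8].round(2))
```

The zero elements of \(\mathbf{g}\) in (10.6) correspond to the absence of any update for `a` inside the loop, mirroring the last \(n\) lines of (10.7).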

When all the smoothing parameters of the ETS part of the model are equal to zero, ETSX reverts to a deterministic model, directly related to multiple linear regression. For example, in the case of ETSX(A,N,N) with \(\alpha=0\) we get: \[\begin{equation} \begin{aligned} & y_{t} = l_{t-1} + a_{1,t} x_{1,t} + \dots + a_{n,t} x_{n,t} + \epsilon_t \\ & l_t = l_{t-1} \\ & a_{1,t} = a_{1,t-1} \\ & \vdots \\ & a_{n,t} = a_{n,t-1} \end{aligned}, \tag{10.9} \end{equation}\] where \(l_t=a_0\) is the intercept of the model. Equation (10.9) can be rewritten in the conventional way, dropping the transition part of the state space model: \[\begin{equation} y_{t} = a_0 + a_{1} x_{1,t} + \dots + a_{n} x_{n,t} + \epsilon_t . \tag{10.10} \end{equation}\] In the case of models with trend and/or seasonal components, the model becomes equivalent to regression with a deterministic trend and/or seasonality. This means that, in general, ADAMX implies that we are dealing with a regression with a time-varying intercept, where the way the intercept varies is defined by the ADAM components (e.g. the intercept can vary seasonally). Similar properties hold for the multiplicative error model. The main difference is that the impact of the explanatory variables on the response variable varies together with the changes in the intercept. The model in this case combines the strengths of multiplicative regression and the dynamic model, where the variability of the response variable changes together with the baseline model (ADAM ETS and/or ADAM ARIMA in this case).
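As a quick numerical check of the step from (10.9) to (10.10): with \(\alpha=0\) the level never moves, so \(l_t = l_0 = a_0\), and estimating the model by least squares is the same as running OLS on (10.10). The sketch below, with assumed true parameter values, demonstrates the parameter recovery:

```python
import numpy as np

# ETSX(A,N,N) with alpha = 0 collapses to eq. (10.10): a constant level l_0
# plays the role of the intercept a_0, so OLS recovers all the parameters.
rng = np.random.default_rng(7)
T = 300
X = rng.normal(size=(T, 2))
y = 10.0 + X @ np.array([1.5, -0.7]) + rng.normal(size=T)  # a_0 = 10

# OLS on (10.10):
Z = np.column_stack([np.ones(T), X])
coef, *_ = np.linalg.lstsq(Z, y, rcond=None)
print(coef.round(3))   # approximately [10, 1.5, -0.7] = (a_0, a_1, a_2)
```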

References

• Hyndman, R.J., Koehler, A.B., Ord, J.K., Snyder, R.D., 2008. Forecasting with Exponential Smoothing: The State Space Approach. Springer Berlin Heidelberg.