
## 10.3 Dynamic X in ADAMX

**Note**: the model discussed in this section assumes very specific dynamics for the parameters (they are correlated with the other states of the model), aligning with what the conventional ETS assumes. It does not treat parameters as independent, as the MSOE state space models do. Still, this type of model works well with categorical variables, as I show later in this section.

As discussed earlier in this chapter, the parameters of the explanatory variables in ADAMX can be assumed to stay constant over time or can be assumed to vary according to some mechanism. The most reasonable mechanism in the SSOE framework is the one relying on the same error term for the different components of the model. Osman and King (2015) proposed one such mechanism, relying on the differences of the data. The main motivation of that research was to make the dynamic ADAMX model stable, which is a challenging task. However, their mechanism relies on the assumption of non-stationarity of the explanatory variables, which does not always make sense (for example, it is not reasonable in the case of promotional data). An alternative approach, which we discuss in this section, is the one originally proposed by Svetunkov (1985) based on the stochastic approximation mechanism and further developed in Svetunkov and Svetunkov (2014).

In this method, we consider the following regression model: \[\begin{equation} y_{t} = a_{0,t-1} + a_{1,t-1} x_{1,t} + \dots + a_{n,t-1} x_{n,t} + \epsilon_t , \tag{10.13} \end{equation}\] where all parameters vary over time and \(a_{0,t}\) represents the value from the conventional pure additive ETS model. The updating mechanism for the parameters is straightforward and relies on the ratio of the error term and the respective explanatory variables: \[\begin{equation} a_{i,t} = a_{i,t-1} + \left \lbrace \begin{aligned} &\delta_i \frac{\epsilon_t}{x_{i,t}} \text{ for each } i \in \{1, \dots, n\}, \text{ if } x_{i,t}\neq 0 \\ &0 \text{ otherwise } \end{aligned} \right. , \tag{10.14} \end{equation}\] where \(\delta_i\) is the smoothing parameter of the \(i\)-th explanatory variable. The same model can be represented in the state space form, based on equations similar to (10.3): \[\begin{equation} \begin{aligned} {y}_{t} = & \mathbf{w}'_t \mathbf{v}_{t-\boldsymbol{l}} + \epsilon_t \\ \mathbf{v}_t = & \mathbf{F} \mathbf{v}_{t-\boldsymbol{l}} + \mathrm{diag}\left(\mathbf{w}_t\right)^{-1} \mathbf{g} \epsilon_t \end{aligned} \tag{10.15} \end{equation}\] where \(\mathrm{diag}\left(\mathbf{w}_t\right)^{-1}=\mathbf{I}_{k+n} \odot (\mathbf{w}_t \mathbf{1}_{k+n})\) (where \(\mathbf{I}_{k+n}\) is the identity matrix for \(k\) ADAM components and \(n\) explanatory variables and \(\odot\) is the Hadamard product for element-wise multiplication). This is the inverse of the diagonal matrix based on the measurement vector, in which the values that cannot be inverted (due to division by zero) are substituted with zeroes in order to reflect the condition in (10.14). In addition to what (10.3) contained, we add the smoothing parameters \(\delta_i\) to the persistence vector \(\mathbf{g}\), one for each explanatory variable.
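To make the updating rule (10.14) concrete, here is a minimal sketch in Python/NumPy; the function name and example values are illustrative and not part of ADAM itself:

```python
import numpy as np

def update_parameters(a, x_t, epsilon_t, delta):
    """One step of the dynamic update (10.14):
    a_{i,t} = a_{i,t-1} + delta_i * epsilon_t / x_{i,t} when x_{i,t} != 0,
    and a_{i,t} = a_{i,t-1} otherwise."""
    a = np.asarray(a, dtype=float).copy()
    x_t = np.asarray(x_t, dtype=float)
    delta = np.asarray(delta, dtype=float)
    nonzero = x_t != 0                      # skip the update where x_{i,t} = 0
    a[nonzero] += delta[nonzero] * epsilon_t / x_t[nonzero]
    return a

# Two explanatory variables; the second is zero at time t,
# so its parameter stays unchanged
a_new = update_parameters([0.5, 1.2], [2.0, 0.0], 0.4, [0.1, 0.1])
# a_new[0] = 0.5 + 0.1 * 0.4 / 2.0 = 0.52; a_new[1] = 1.2
```

The zero check reflects the "otherwise" branch of (10.14): a parameter is only corrected on observations where its variable is active, which is exactly what makes this mechanism convenient for categorical (dummy) variables.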

If the error term is multiplicative, then the model changes to: \[\begin{equation} \begin{aligned} y_{t} = & \exp \left(a_{0,t-1} + a_{1,t-1} x_{1,t} + \dots + a_{n,t-1} x_{n,t} + \log(1+ \epsilon_t) \right) \\ a_{i,t} = & a_{i,t-1} + \left \lbrace \begin{aligned} &\delta_i \frac{\log(1+\epsilon_t)}{x_{i,t}} \text{ for each } i \in \{1, \dots, n\}, \text{ if } x_{i,t}\neq 0 \\ &0 \text{ otherwise } \end{aligned} \right. \end{aligned} . \tag{10.16} \end{equation}\] The formulation (10.16) differs from the conventional pure multiplicative ETS model because the smoothing parameter \(\delta_i\) is not included inside the error term \(1+\epsilon_t\), which simplifies some derivations and makes the model easier to work with. Mixed ETS models can also have explanatory variables, but we suggest aligning the type of the explanatory variables part of the model with the error term.
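The only change relative to the additive case is that the update in (10.16) uses \(\log(1+\epsilon_t)\). A hedged sketch of one fitting step, treating the intercept as just another parameter for simplicity (in the full model its update would go through the ETS level; the names and numbers are illustrative):

```python
import numpy as np

def multiplicative_step(a, x_t, y_t, delta):
    """One step of the multiplicative-error model (10.16):
    recover log(1 + epsilon_t) from the fitted value, then update."""
    a = np.asarray(a, dtype=float).copy()
    x_full = np.concatenate(([1.0], np.asarray(x_t, dtype=float)))
    delta = np.asarray(delta, dtype=float)
    fitted = np.exp(a @ x_full)             # exp(a_{0,t-1} + sum a_{i,t-1} x_{i,t})
    log_err = np.log(y_t / fitted)          # log(1 + epsilon_t)
    nonzero = x_full != 0
    a[nonzero] += delta[nonzero] * log_err / x_full[nonzero]
    return a, np.expm1(log_err)             # updated parameters and epsilon_t

# Actual value 10% above the fitted one, i.e. epsilon_t = 0.1
a_new, eps_t = multiplicative_step([2.0, 0.5], [1.0],
                                   float(np.exp(2.5)) * 1.1, [0.1, 0.05])
```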

Finally, in order to distinguish the ADAMX with static parameters from the ADAMX with dynamic ones, we will use the letters “S” and “D” in the names of models. So, the model (10.7) can be called ETSX(A,N,N){S}, while the model (10.16), assuming that \(a_{0,t-1}=l_{t-1}\), would be called ETSX(M,N,N){D}. We use curly brackets to separate the ETS states from the type of X. Furthermore, given that the model with static regressors is assumed in many contexts to be the default one, the ETSX(*,*,*){S} model can also be denoted simply as ETSX(*,*,*).

### 10.3.1 Conditional moments of dynamic ADAMX

Similar to the previous section, we can have two cases in the dynamic model: (1) when the explanatory variables are assumed to be known; (2) when they are assumed to be random. For illustrative purposes, we will use a non-seasonal model, for which the lag vector contains only ones, keeping in mind that other pure additive models can easily be used instead. The cases of other ETS models are not discussed here in detail; the moments for those models need to be calculated based on simulations. So, as discussed previously, the model can be written in the following general way, assuming that all \(\boldsymbol{l}=1\): \[\begin{equation} \begin{aligned} {y}_{t} = & \mathbf{w}'_t \mathbf{v}_{t-1} + \epsilon_t \\ \mathbf{v}_t = & \mathbf{F} \mathbf{v}_{t-1} + \mathrm{diag}\left(\mathbf{w}_t\right)^{-1} \mathbf{g} \epsilon_t \end{aligned} . \tag{10.17} \end{equation}\] Based on this model, we can obtain the recursive relation for \(h\) steps ahead, similar to how it was done in one of the previous sections: \[\begin{equation} \begin{aligned} {y}_{t+h} = & \mathbf{w}'_{t+h} \mathbf{v}_{t+h-1} + \epsilon_{t+h} \\ \mathbf{v}_{t+h-1} = & \mathbf{F} \mathbf{v}_{t+h-2} + \mathrm{diag}\left(\mathbf{w}_{t+h-1}\right)^{-1} \mathbf{g} \epsilon_{t+h-1} \end{aligned} , \tag{10.18} \end{equation}\] where the second equation can be represented based on the values available on observation \(t\): \[\begin{equation} \mathbf{v}_{t+h-1} = \mathbf{F}^{h-1} \mathbf{v}_{t} + \sum_{j=1}^{h-1} \mathbf{F}^{h-1-j} \mathrm{diag}\left(\mathbf{w}_{t+j}\right)^{-1} \mathbf{g} \epsilon_{t+j} . \tag{10.19} \end{equation}\] Substituting equation (10.19) into the measurement equation of (10.18) leads to the final recursion: \[\begin{equation} {y}_{t+h} = \mathbf{w}'_{t+h} \mathbf{F}^{h-1} \mathbf{v}_{t} + \mathbf{w}'_{t+h} \sum_{j=1}^{h-1} \mathbf{F}^{h-1-j} \mathrm{diag}\left(\mathbf{w}_{t+j}\right)^{-1} \mathbf{g} \epsilon_{t+j} + \epsilon_{t+h} . \tag{10.20} \end{equation}\]
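The equivalence of the step-by-step recursion (10.18) and the closed form (10.20) can be checked numerically. A minimal sketch for an ETSX(A,N,N) model with one explanatory variable; the matrices and values are illustrative:

```python
import numpy as np

def y_recursive(w_seq, F, g, v_t, eps_seq):
    """Iterate the transition equation of (10.18) for h steps;
    w_seq[j] and eps_seq[j] correspond to w_{t+j+1} and epsilon_{t+j+1}."""
    v = v_t.copy()
    for j in range(len(eps_seq) - 1):
        inv_w = np.where(w_seq[j] != 0, 1.0 / w_seq[j], 0.0)  # diag(w)^{-1}
        v = F @ v + inv_w * g * eps_seq[j]
    return w_seq[-1] @ v + eps_seq[-1]

def y_closed_form(w_seq, F, g, v_t, eps_seq):
    """The closed-form recursion (10.20)."""
    h = len(eps_seq)
    y = w_seq[-1] @ np.linalg.matrix_power(F, h - 1) @ v_t
    for j in range(1, h):
        inv_w = np.where(w_seq[j - 1] != 0, 1.0 / w_seq[j - 1], 0.0)
        y += w_seq[-1] @ np.linalg.matrix_power(F, h - 1 - j) @ (inv_w * g * eps_seq[j - 1])
    return y + eps_seq[-1]

F = np.eye(2)                       # transition for level + one regressor parameter
g = np.array([0.3, 0.1])            # alpha and delta_1
v_t = np.array([10.0, 0.5])         # level l_t and parameter a_{1,t}
w_seq = [np.array([1.0, 2.0]), np.array([1.0, 0.0]), np.array([1.0, 3.0])]
eps_seq = [0.5, -0.2, 0.1]
y1 = y_recursive(w_seq, F, g, v_t, eps_seq)
y2 = y_closed_form(w_seq, F, g, v_t, eps_seq)   # the two agree
```

Note that the second element of \(\mathbf{w}_{t+2}\) is zero in this example, so the regressor's parameter is not updated on that step, exactly as (10.14) requires.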

### 10.3.2 Known explanatory variables

Based on this recursion, we can calculate the conditional mean and variance of the model. First, we assume that the explanatory variables are controlled by an analyst, so that they are not random: \[\begin{equation} \begin{aligned} & \mu_{y,t+h} = \text{E}(y_{t+h}|t) = \mathbf{w}'_{t+h} \mathbf{F}^{h-1} \mathbf{v}_{t} \\ & \text{V}(y_{t+h}|t) = \sum_{j=1}^{h-1} \left(\mathbf{w}'_{t+h} \mathbf{F}^{h-1-j} \mathrm{diag}\left(\mathbf{w}_{t+j}\right)^{-1} \mathbf{g} \right)^2 \sigma^2 + \sigma^2 \end{aligned} . \tag{10.21} \end{equation}\] Because the future error terms are independent, each coefficient in the sum is squared separately. The formulae for the conditional moments look similar to the ones from the pure additive ETS model, with the only difference being the interaction with the time varying measurement vector.
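These moments translate into a short function; it reuses the scalar coefficients from the recursion, and, since the future errors \(\epsilon_{t+j}\) are independent, the variance accumulates the squares of those coefficients (the values below are illustrative):

```python
import numpy as np

def conditional_moments(w_seq, F, g, v_t, sigma2):
    """Conditional mean and variance for known explanatory variables;
    w_seq = [w_{t+1}, ..., w_{t+h}]."""
    h = len(w_seq)
    mu = w_seq[-1] @ np.linalg.matrix_power(F, h - 1) @ v_t
    var = sigma2                                    # the epsilon_{t+h} term
    for j in range(1, h):
        inv_w = np.where(w_seq[j - 1] != 0, 1.0 / w_seq[j - 1], 0.0)
        # scalar coefficient of epsilon_{t+j} in the recursion
        c_j = w_seq[-1] @ np.linalg.matrix_power(F, h - 1 - j) @ (inv_w * g)
        var += c_j ** 2 * sigma2
    return mu, var

mu_h, var_h = conditional_moments(
    [np.array([1.0, 2.0]), np.array([1.0, 0.0]), np.array([1.0, 3.0])],
    np.eye(2), np.array([0.3, 0.1]), np.array([10.0, 0.5]), sigma2=1.0)
```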

### 10.3.3 Random explanatory variables

In the case of random explanatory variables, the conditional expectation is straightforward and is similar to the one in the static ADAMX model: \[\begin{equation} \mu_{y,t+h} = \text{E}(y_{t+h}|t) = \boldsymbol{\mu}'_{w,t+h} \mathbf{F}^{h-1} \mathbf{v}_{t} , \tag{10.22} \end{equation}\] where \(\boldsymbol{\mu}'_{w,t+h}\) is the vector of conditional \(h\)-steps-ahead expectations for each element of \(\mathbf{w}_{t+h}\). For the ETS components, this vector contains ones. The conditional variance, however, is more complicated, because it involves complex interactions between the variances of the explanatory variables and the error term. As a result, it is easier to obtain the correct variance via simulations, assuming that the explanatory variables and the error term change according to some assumed distributions.
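The simulation route can be sketched as follows: draw future values of the explanatory variable and the error term, roll the state space equations forward, and take the empirical moments of the simulated \(y_{t+h}\). This is only an illustration under assumed distributions (here \(x \sim U(1, 3)\) and Gaussian errors), not the exact procedure used in ADAM:

```python
import numpy as np

def simulate_moments(F, g, v_t, sigma, h=3, n_sim=10000, seed=42):
    """Monte Carlo conditional moments of y_{t+h} for a model with
    one random explanatory variable, drawn from U(1, 3) for illustration."""
    rng = np.random.default_rng(seed)
    ys = np.empty(n_sim)
    for s in range(n_sim):
        v = v_t.copy()
        for _ in range(h):
            w = np.array([1.0, rng.uniform(1.0, 3.0)])  # [1, x_{t+j}]
            eps = rng.normal(0.0, sigma)
            y = w @ v + eps                             # measurement equation
            inv_w = np.where(w != 0, 1.0 / w, 0.0)
            v = F @ v + inv_w * g * eps                 # transition equation
        ys[s] = y
    return ys.mean(), ys.var()

mu_sim, var_sim = simulate_moments(np.eye(2), np.array([0.3, 0.1]),
                                   np.array([10.0, 0.5]), sigma=0.5)
```

With these values the simulated mean should sit close to the analytical expectation (10.22), i.e. \([1, 2] \times [10, 0.5]' = 11\), while the simulated variance absorbs the interactions between the random regressor and the error term that have no simple closed form.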