
## 10.2 Conditional expectation and variance of ADAMX

### 10.2.1 The ADAMX with known explanatory variables

ETS models have a serious limitation, which we will discuss in one of the later chapters: they assume that the parameters of the model are known, i.e. that there is no variability in them and that the in-sample values are fixed no matter what. This limitation also affects the ETSX part. While this is not an issue for point forecasts, it does impact the conditional variance and prediction intervals. As a result, the conditional mean and variance of the conventional ADAMX assume that the parameters \(a_0, \dots, a_n\) are also known, leading to the following formulae in the case of the pure additive model, based on what was discussed in the section on pure additive models:
\[\begin{equation}
\begin{aligned}
\mu_{y,t+h} = \text{E}(y_{t+h}|t) = & \sum_{i=1}^d \left(\mathbf{w}_{m_i,t}' \mathbf{F}_{m_i}^{\lceil\frac{h}{m_i}\rceil-1} \right) \mathbf{v}_{t} \\
\text{V}(y_{t+h}|t) = & \left( \sum_{i=1}^d \left(\mathbf{w}_{m_i,t}' \sum_{j=1}^{\lceil\frac{h}{m_i}\rceil-1} \mathbf{F}_{m_i}^{j-1} \mathbf{g}_{m_i} \mathbf{g}'_{m_i} (\mathbf{F}_{m_i}')^{j-1} \mathbf{w}_{m_i,t} \right) + 1 \right) \sigma^2
\end{aligned},
\tag{10.9}
\end{equation}\]
where the main difference from the conventional model is the index \(t\) in the measurement vector. As an example, here is how the two statistics look in the case of ETSX(A,N,N):
\[\begin{equation}
\begin{aligned}
\mu_{y,t+h} = \text{E}(y_{t+h}|t) = & l_{t} + \sum_{i=1}^n a_i x_{i,t+h} \\
\text{V}(y_{t+h}|t) = & \left((h-1) \alpha^2 + 1 \right) \sigma^2
\end{aligned},
\tag{10.10}
\end{equation}\]
where the variance ignores the potential variability arising from the explanatory variables because of the ETS limitations. As a result, the prediction and confidence intervals for the ADAMX model would typically be narrower than needed and would only be adequate for large samples, where the Law of Large Numbers starts working, reducing the variance of the parameters (assuming that the typical assumptions of the model hold).
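To make the ETSX(A,N,N) formulae above more tangible, here is a minimal Python sketch that evaluates the conditional mean and variance for a given horizon. The function name `etsx_ann_moments` and all the numbers are hypothetical and chosen purely for illustration; the parameters are treated as known, exactly as the conventional model assumes.

```python
from typing import Sequence, Tuple


def etsx_ann_moments(level: float, alpha: float, sigma2: float,
                     a: Sequence[float], x_future: Sequence[float],
                     h: int) -> Tuple[float, float]:
    """Conditional mean and variance of ETSX(A,N,N) h steps ahead.

    Hypothetical helper: all parameters and the future values of the
    explanatory variables are assumed to be known.
    """
    if h < 1:
        raise ValueError("The horizon h must be a positive integer")
    if len(a) != len(x_future):
        raise ValueError("a and x_future must have the same length")
    # Mean: the last level plus the regression part at t+h
    mean = level + sum(a_i * x_i for a_i, x_i in zip(a, x_future))
    # Variance: ((h-1) alpha^2 + 1) sigma^2, which ignores any
    # variability coming from the explanatory variables
    variance = ((h - 1) * alpha ** 2 + 1) * sigma2
    return mean, variance


# Made-up values: l_t = 100, alpha = 0.5, sigma^2 = 4, two regressors
mean, variance = etsx_ann_moments(level=100.0, alpha=0.5, sigma2=4.0,
                                  a=[2.0, -1.0], x_future=[3.0, 5.0], h=3)
print(mean, variance)
```

Note that the variance depends only on \(h\), \(\alpha\) and \(\sigma^2\): changing `x_future` shifts the mean but leaves the variance untouched, which is the limitation discussed above.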

### 10.2.2 ADAMX with random explanatory variables

Note that the ADAMX works well when the future values of \(x_{i,t+h}\) are known, which is not always the case. This is a realistic assumption when we have control over the explanatory variables (e.g. prices and promotions for our product). But when the variables are out of our control, they need to be forecasted somehow. In this case, we assume that each \(x_{i,t}\) is a random variable with some dynamic conditional one step ahead expectation \(\mu_{x_{i,t}}\) and a one step ahead variance \(\sigma^2_{x_{i,1}}\). **Note** that in this case we treat the available explanatory variables as models on their own, not just as values given to us from above. This assumption of randomness changes the conditional moments of the model. Here is what we have in the case of ETSX(A,N,N) (given that the typical assumptions hold):
\[\begin{equation}
\begin{aligned}
\mu_{y,t+h} = \text{E}(y_{t+h}|t) = & l_{t} + \sum_{i=1}^n a_i \mu_{x_{i,t+h}} \\
\text{V}(y_{t+h}|t) = & \left((h-1) \alpha^2 + 1 \right) \sigma^2 + \sum_{i=1}^n a^2_i \sigma^2_{x_{i,h}} + 2 \sum_{i=1}^{n-1} \sum_{j=i+1}^n a_i a_j \sigma_{x_{i,h},x_{j,h}}
\end{aligned},
\tag{10.11}
\end{equation}\]
where \(\sigma^2_{x_{i,h}}\) is the variance of \(x_{i}\) \(h\) steps ahead and \(\sigma_{x_{i,h},x_{j,h}}\) is the \(h\) steps ahead covariance between the explanatory variables \(x_{i,h}\) and \(x_{j,h}\), both conditional on the information available at observation \(t\). Similarly, if we are interested in the one step ahead point forecast from the model, it should take the randomness of the explanatory variables into account and become:
\[\begin{equation}
\begin{aligned}
\mu_{y,t} = & \mathrm{E}\left( \left. l_{t-1} + \sum_{i=1}^n a_i x_{i,t} + \epsilon_{t} \right| t-1 \right) = \\
= & l_{t-1} + \sum_{i=1}^n a_i \mu_{x_{i,t}}
\end{aligned}.
\tag{10.12}
\end{equation}\]
So, in the case of ADAMX with random explanatory variables, the model should be constructed based on the expectations of those variables, not on the random values themselves. This issue does not appear in the context of the classical linear regression, because it does not rely on the one step ahead recursion. But it explains, for example, why Athanasopoulos et al. (2011) found that some models with predicted explanatory variables work better than the models with the variables themselves. This becomes important when estimating a model such as ETSX(A,N,N), for which the following is constructed:
\[\begin{equation}
\begin{aligned}
\hat{y}_{t} = & \hat{l}_{t-1} + \sum_{i=1}^n \hat{a}_{i,t} \hat{x}_{i,t} \\
e_t = & y_t - \hat{y}_{t} \\
\hat{l}_{t} = & \hat{l}_{t-1} + \hat{\alpha} e_t \\
\hat{a}_{i,t} = & \hat{a}_{i,t-1} \text{ for each } i \in \{1, \dots, n\}
\end{aligned},
\tag{10.13}
\end{equation}\]
where \(\hat{x}_{i,t}\) is the in-sample conditional one step ahead mean for the explanatory variable \(x_i\).
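The recursion above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `etsx_ann_fit` is hypothetical, the initial level, the smoothing parameter and the coefficients are taken as given rather than estimated, and the values in `x_hat` stand for the in-sample conditional one step ahead means \(\hat{x}_{i,t}\), produced by some separate model for each explanatory variable.

```python
from typing import List, Sequence, Tuple


def etsx_ann_fit(y: Sequence[float], x_hat: Sequence[Sequence[float]],
                 level0: float, alpha: float,
                 a: Sequence[float]) -> Tuple[List[float], List[float], float]:
    """One step ahead recursion of ETSX(A,N,N) with fixed coefficients.

    Hypothetical helper: x_hat[t] holds the conditional one step ahead
    means of the explanatory variables for observation t.
    """
    level = level0
    fitted: List[float] = []
    errors: List[float] = []
    for y_t, x_t in zip(y, x_hat):
        # One step ahead fitted value based on the expected regressors
        y_hat = level + sum(a_i * x_it for a_i, x_it in zip(a, x_t))
        e_t = y_t - y_hat
        # Only the level is updated; the coefficients a_i stay constant
        level = level + alpha * e_t
        fitted.append(y_hat)
        errors.append(e_t)
    return fitted, errors, level


# Made-up data with one explanatory variable
fitted, errors, last_level = etsx_ann_fit(
    y=[10.0, 12.0, 11.0], x_hat=[[1.0], [2.0], [1.0]],
    level0=8.0, alpha=0.5, a=[1.0])
print(fitted, errors, last_level)
```

In a real estimation routine, the errors returned by such a recursion would be passed to a loss function, and `level0`, `alpha` and `a` would be optimised. That step is omitted here.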

Summarising this section, an adequate ADAMX model needs to be able to work in at least two regimes: (1) assuming that the explanatory variables are known, (2) assuming that the explanatory variables are random.
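As a rough illustration of the two regimes for ETSX(A,N,N), the Python sketch below computes the conditional moments with and without the covariance matrix of the explanatory variables. The function `etsx_ann_forecast` and its arguments are hypothetical. In particular, `x_cov` is assumed to be the \(h\) steps ahead conditional covariance matrix of the explanatory variables, which in practice would have to come from the models fitted to them.

```python
from typing import Optional, Sequence, Tuple


def etsx_ann_forecast(level: float, alpha: float, sigma2: float,
                      a: Sequence[float], x_mean: Sequence[float], h: int,
                      x_cov: Optional[Sequence[Sequence[float]]] = None
                      ) -> Tuple[float, float]:
    """Conditional moments of ETSX(A,N,N) in the two regimes.

    Hypothetical helper. If x_cov is None, x_mean is treated as the known
    future values. Otherwise x_mean holds the conditional expectations and
    x_cov the h steps ahead covariance matrix of the regressors.
    """
    mean = level + sum(a_i * m_i for a_i, m_i in zip(a, x_mean))
    variance = ((h - 1) * alpha ** 2 + 1) * sigma2
    if x_cov is not None:
        n = len(a)
        # Quadratic form a' Sigma a: the sum of a_i^2 sigma_i^2 plus
        # twice the sum of a_i a_j sigma_ij over i < j
        variance += sum(a[i] * a[j] * x_cov[i][j]
                        for i in range(n) for j in range(n))
    return mean, variance


# Made-up values, the same parameters in both regimes
known = etsx_ann_forecast(100.0, 0.5, 4.0, [2.0, -1.0], [3.0, 5.0], h=3)
random_x = etsx_ann_forecast(100.0, 0.5, 4.0, [2.0, -1.0], [3.0, 5.0], h=3,
                             x_cov=[[1.0, 0.5], [0.5, 4.0]])
print(known, random_x)
```

The point forecasts coincide in the two regimes as long as the expectations equal the assumed known values, while the variance in the second regime is larger by the contribution of the explanatory variables.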

Finally, as discussed previously, the conditional moments for the pure multiplicative and mixed models do not have closed forms in general, implying that simulations need to be carried out. The situation becomes more challenging in the case of random explanatory variables, because that randomness needs to be introduced in the model itself and propagated through the time series. This is not a trivial task, which we will discuss later in this textbook.