## 10.2 Conditional expectation and variance of ADAMX

### 10.2.1 ADAMX with deterministic explanatory variables

ETS models have a severe limitation, which will be discussed in Chapter 16: they assume that the model’s parameters are known, i.e. that there is no variability in them and that the in-sample estimates stay fixed no matter how the sample size changes. This limitation also affects the ETSX part. While this is not important for point forecasts, it does affect the conditional variance and prediction intervals. As a result, the conditional mean and variance of the conventional ADAMX assume that the parameters \(a_0, \dots, a_n\) are also known, leading to the following formulae in the case of the pure additive model, based on what was discussed in Section 5.3:
\[\begin{equation}
\begin{aligned}
\mu_{y,t+h} = \text{E}(y_{t+h}|t) = & \sum_{i=1}^d \left(\mathbf{w}_{m_i,t}' \mathbf{F}_{m_i}^{\lceil\frac{h}{m_i}\rceil-1} \right) \mathbf{v}_{t} \\
\text{V}(y_{t+h}|t) = & \left( \sum_{i=1}^d \left(\mathbf{w}_{m_i,t}' \sum_{j=1}^{\lceil\frac{h}{m_i}\rceil-1} \mathbf{F}_{m_i}^{j-1} \mathbf{g}_{m_i} \mathbf{g}'_{m_i} (\mathbf{F}_{m_i}')^{j-1} \mathbf{w}_{m_i,t} \right) + 1 \right) \sigma^2
\end{aligned},
\tag{10.11}
\end{equation}\]
the main difference from the moments of the conventional model (from Section 5.3) being the index \(t\) in the measurement vector \(\mathbf{w}_t\). As an example, here is how the two statistics look in the case of ETSX(A,N,N):
\[\begin{equation}
\begin{aligned}
\mu_{y,t+h} = \text{E}(y_{t+h}|t) = & l_{t} + \sum_{i=1}^n a_i x_{i,t+h} \\
\text{V}(y_{t+h}|t) = & \left((h-1) \alpha^2 + 1 \right) \sigma^2
\end{aligned},
\tag{10.12}
\end{equation}\]
where the variance ignores the potential variability arising from the explanatory variables because of the ETS limitations. This assumes that the future values of the explanatory variables \(x_{i,t}\) are known (the variables are deterministic).
As a result, the prediction and confidence intervals for the ADAMX model would typically be narrower than expected and would only be adequate in cases of large samples, where the law of large numbers would start working (Section 4.2 of Svetunkov, 2022a), reducing the variance of parameters (this is assuming that the typical assumptions of the model from Section 1.4.1 hold).
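
To make formula (10.12) concrete, the following minimal Python sketch evaluates the conditional mean and variance of ETSX(A,N,N) with deterministic explanatory variables. The function name `etsx_ann_moments` and all numerical values are hypothetical and chosen only for illustration; the parameters are treated as known, exactly as the formula assumes.

```python
def etsx_ann_moments(level, alpha, sigma2, a, x_future, h):
    """Conditional mean and variance of ETSX(A,N,N), following (10.12).

    level    -- last available level l_t
    alpha    -- smoothing parameter
    sigma2   -- variance of the error term
    a        -- coefficients a_1, ..., a_n of the explanatory variables
    x_future -- known values x_{1,t+h}, ..., x_{n,t+h}
    h        -- forecast horizon (h >= 1)
    """
    if h < 1:
        raise ValueError("The horizon h must be at least 1.")
    if len(a) != len(x_future):
        raise ValueError("a and x_future must have the same length.")
    # Mean: the level plus the effect of the known explanatory variables
    mean = level + sum(a_i * x_i for a_i, x_i in zip(a, x_future))
    # Variance: depends only on alpha, sigma2 and h, not on x
    variance = ((h - 1) * alpha ** 2 + 1) * sigma2
    return mean, variance


# Illustrative (made-up) values
mean, variance = etsx_ann_moments(
    level=100.0, alpha=0.5, sigma2=4.0,
    a=[2.0, -1.0], x_future=[3.0, 5.0], h=3,
)
print(mean, variance)  # 101.0 6.0
```

Note that the explanatory variables shift the mean but leave the variance untouched, which is the reason why the resulting intervals tend to be too narrow.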

### 10.2.2 ADAMX with stochastic explanatory variables

Note that the ADAMX works well in cases when the future values of \(x_{i,t+h}\) are known. It is a realistic assumption when we control the explanatory variables (e.g. prices and promotions for our product). But when the variables are out of our control, they need to be forecasted somehow. In this case we are assuming that each \(x_{i,t}\) is a stochastic variable with some dynamic conditional one step ahead expectation \(\mu_{x_{i,t}}\) and a one step ahead variance \(\sigma^2_{x_{i,1}}\). **Note** that in this case, we treat the available explanatory variables as models on their own, not just as values given to us from above. This assumption of randomness will change the conditional moments of the model. Here is what we will have in the case of ETSX(A,N,N) (given that the typical assumptions from Section 1.4.1 hold):
\[\begin{equation}
\begin{aligned}
\mu_{y,t+h} = \text{E}(y_{t+h}|t) = & l_{t} + \sum_{i=1}^n a_i \mu_{x_{i,t+h}} \\
\text{V}(y_{t+h}|t) = & \left((h-1) \alpha^2 + 1 \right) \sigma^2 + \sum_{i=1}^n a^2_i \sigma^2_{x_{i,h}} + 2 \sum_{i=1}^{n-1} \sum_{j=i+1}^n a_i a_j \sigma_{x_{i,h},x_{j,h}}
\end{aligned},
\tag{10.13}
\end{equation}\]
where \(\sigma^2_{x_{i,h}}\) is the conditional variance of \(x_{i}\) \(h\) steps ahead, and \(\sigma_{x_{i,h},x_{j,h}}\) is the \(h\) steps ahead covariance between the explanatory variables \(x_{i,h}\) and \(x_{j,h}\), both conditional on the information available at the observation \(t\). Similarly, if we are interested in the one step ahead point forecast from the model, it should take the randomness of explanatory variables into account and become:
\[\begin{equation}
\begin{aligned}
\mu_{y,t |t-1} = & \left. \mathrm{E}\left(l_{t-1} + \sum_{i=1}^n a_i x_{i,t} + \epsilon_{t} \right| t-1 \right) = \\
= & l_{t-1} + \sum_{i=1}^n a_i \mu_{x_{i,t}}
\end{aligned}.
\tag{10.14}
\end{equation}\]
So, in the case of ADAMX with random explanatory variables, the model should be constructed based on the expectations of those variables, not the random values themselves. This explains, for example, why Athanasopoulos et al. (2011) found that some models with predicted explanatory variables work better than the model with the variables themselves. This means that, when estimating a model such as ETSX(A,N,N), the following should be constructed:
\[\begin{equation}
\begin{aligned}
& \hat{y}_{t} = \hat{l}_{t-1} + \sum_{i=1}^n \hat{a}_{i,t} \hat{x}_{i,t} \\
& e_t = y_t -\hat{y}_{t} \\
& \hat{l}_{t} = \hat{l}_{t-1} + \hat{\alpha} e_t \\
& \hat{a}_{i,t} = \hat{a}_{i,t-1} \text{ for each } i \in \{1, \dots, n\}
\end{aligned},
\tag{10.15}
\end{equation}\]
where \(\hat{x}_{i,t}\) is the in-sample conditional one step ahead mean for the explanatory variable \(x_i\).
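
The recursion (10.15) can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the function name `etsx_ann_recursion` is hypothetical, the values of \(\hat{\alpha}\), \(\hat{a}_i\) and the initial level are taken as given rather than estimated, and the one step ahead means \(\hat{x}_{i,t}\) are assumed to have been produced beforehand by some model for each explanatory variable.

```python
def etsx_ann_recursion(y, x_hat, level0, a, alpha):
    """Run the ETSX(A,N,N) recursion (10.15) over the sample.

    y      -- actual values y_1, ..., y_T
    x_hat  -- x_hat[t][i] is the one step ahead conditional mean of x_i
              for observation t (not the realised value of x_i)
    level0 -- initial level
    a      -- coefficients of the explanatory variables (kept constant)
    alpha  -- smoothing parameter
    """
    level = level0
    fitted, errors = [], []
    for y_t, x_hat_t in zip(y, x_hat):
        # Fitted value uses the expectations of the explanatory variables
        y_hat = level + sum(a_i * x_it for a_i, x_it in zip(a, x_hat_t))
        e_t = y_t - y_hat
        # Only the level is updated; a_{i,t} = a_{i,t-1}
        level = level + alpha * e_t
        fitted.append(y_hat)
        errors.append(e_t)
    return fitted, errors, level


# Illustrative (made-up) data with one explanatory variable
fitted, errors, last_level = etsx_ann_recursion(
    y=[11.0, 12.0, 11.0],
    x_hat=[[1.0], [2.0], [1.0]],
    level0=8.0, a=[2.0], alpha=0.5,
)
print(fitted)      # [10.0, 12.5, 10.25]
print(errors)      # [1.0, -0.5, 0.75]
print(last_level)  # 8.625
```

In practice, the parameters would be chosen by minimising a loss function built from the errors `errors`, with `x_hat` held fixed during that optimisation.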

Finally, as discussed previously, the conditional moments for the pure multiplicative and mixed models do not generally have closed forms, implying that the simulations need to be carried out. The situation becomes more challenging in the case of random explanatory variables because that randomness needs to be introduced in the model itself and propagated throughout the time series. This is not a trivial task, which we will discuss in Section 16.4.
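
Although the pure additive case does not require simulations, it provides a convenient check of the idea, because the simulated moments can be compared with the closed form (10.13). The Python sketch below does this for ETSX(A,N,N) with two explanatory variables. It rests on assumptions that are not part of the text above: the error term and the \(h\) steps ahead values of the explanatory variables are drawn from Normal distributions, the moments of the explanatory variables are taken as known, and all function names and numbers are hypothetical. It is not the procedure for multiplicative and mixed models.

```python
import math
import random


def analytic_moments(level, alpha, sigma2, a, mu_x, var_x, cov_x, h):
    """Closed-form mean and variance from (10.13) for n = 2."""
    mean = level + a[0] * mu_x[0] + a[1] * mu_x[1]
    variance = (((h - 1) * alpha ** 2 + 1) * sigma2
                + a[0] ** 2 * var_x[0] + a[1] ** 2 * var_x[1]
                + 2 * a[0] * a[1] * cov_x)
    return mean, variance


def simulated_moments(level, alpha, sigma2, a, mu_x, var_x, cov_x, h,
                      n_paths=100_000, seed=42):
    """Monte Carlo estimate of the same two moments."""
    rng = random.Random(seed)
    sigma = math.sqrt(sigma2)
    s1, s2 = math.sqrt(var_x[0]), math.sqrt(var_x[1])
    rho = cov_x / (s1 * s2)
    draws = []
    for _ in range(n_paths):
        # Propagate the level for h-1 steps: l_{t+j} = l_{t+j-1} + alpha * eps
        path_level = level
        for _ in range(h - 1):
            path_level += alpha * rng.gauss(0.0, sigma)
        # Draw correlated explanatory variables for the step t+h
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        x1 = mu_x[0] + s1 * z1
        x2 = mu_x[1] + s2 * (rho * z1 + math.sqrt(1 - rho ** 2) * z2)
        # Measurement equation at t+h
        draws.append(path_level + a[0] * x1 + a[1] * x2
                     + rng.gauss(0.0, sigma))
    mean = sum(draws) / n_paths
    variance = sum((d - mean) ** 2 for d in draws) / (n_paths - 1)
    return mean, variance


# Illustrative (made-up) values
settings = dict(level=100.0, alpha=0.5, sigma2=4.0, a=(2.0, -1.0),
                mu_x=(3.0, 5.0), var_x=(1.0, 0.25), cov_x=0.2, h=3)
exact_mean, exact_var = analytic_moments(**settings)
sim_mean, sim_var = simulated_moments(**settings)
print(exact_mean, exact_var)
print(sim_mean, sim_var)
```

With a large enough number of simulated paths, the two pairs of values should be close to each other. For multiplicative and mixed models, only the simulated pair would be available.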