## 14.3 Model specification: Transformations

Choosing appropriate transformations for the variables in a model is challenging, because it is difficult to decide what sort of transformation is needed, if any. In many cases, this comes down to selecting between an additive linear model and a multiplicative one. This implies that we compare the model:

\begin{equation} y_t = a_0 + a_1 x_{1,t} + \dots + a_n x_{n,t} + \epsilon_t, \tag{14.1} \end{equation}

and

\begin{equation} y_t = \exp\left(a_0 + a_1 x_{1,t} + \dots + a_n x_{n,t} + \epsilon_t\right) . \tag{14.2} \end{equation}

Model (14.2) is equivalent to the so-called “log-linear” model, but can also include logarithms of explanatory variables instead of the variables themselves if needed.
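To see why (14.2) is referred to as log-linear, it helps to take logarithms of both sides. The step below follows directly from (14.2) and uses only the symbols already defined there:

\begin{equation*} \log y_t = a_0 + a_1 x_{1,t} + \dots + a_n x_{n,t} + \epsilon_t . \end{equation*}

In this form, a one-unit increase in $$x_{i,t}$$ changes $$\log y_t$$ by $$a_i$$, which means that $$y_t$$ is multiplied by $$\exp(a_i)$$. For small $$a_i$$, this is approximately a $$100 a_i$$% change in $$y_t$$, in contrast to the fixed change of $$a_i$$ units implied by (14.1).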

There are different ways to diagnose the problem with wrong transformations. The first one is the actuals vs fitted plot (Figure 14.7):

plot(adamSeat03, 1, main="")

Figure 14.7: Actuals vs Fitted for Model 3.

The grey dashed line on the plot in Figure 14.7 corresponds to the situation when actuals and fitted coincide (100% fit). The red line on the plot is the LOWESS line (Cleveland, 1979), produced by the lowess() function in R, smoothing the scatterplot to reflect the potential tendencies in the data. This red line should coincide with the grey line in the ideal situation. In addition, the variability around the line should not change with the increase of fitted values. In our case, there is a slight U-shape in the red line and a slight rise in variability around the middle of the data. This could either be due to pure randomness, in which case it should be ignored, or indicate a slight non-linearity in the data. After all, we have constructed a pure additive model on the data that exhibits seasonality with multiplicative characteristics, which becomes especially apparent at the end of the series, where the drop in level is accompanied by the decrease of the variability of the data (Figure 14.8):

plot(adamSeat03, 7, main="")

Figure 14.8: Actuals and Fitted values for Model 3.
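The logic of the actuals vs fitted diagnostic can be reproduced by hand in base R. The sketch below is only an illustration: it uses a plain linear regression from lm() as a stand-in for the ADAM ETSX model discussed in the text, so the picture will not match Figure 14.7. The names seat, fitLinear and smoothLine are introduced here for the example.

```r
# Stand-in model (assumption): ordinary linear regression, not the ADAM ETSX model
seat <- as.data.frame(Seatbelts)
fitLinear <- lm(drivers ~ PetrolPrice + kms + law, data = seat)

actuals <- seat$drivers
fittedValues <- as.numeric(fitted(fitLinear))

# Scatterplot of actuals against fitted values
plot(fittedValues, actuals, xlab = "Fitted", ylab = "Actuals")
# Grey dashed line: the case when actuals and fitted coincide
abline(a = 0, b = 1, col = "grey", lty = 2)
# Red line: LOWESS smoother of the scatterplot
smoothLine <- lowess(fittedValues, actuals)
lines(smoothLine, col = "red")
```

If the red line bends away from the grey one in a systematic way, this hints at a non-linearity that the model does not capture.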

To diagnose this properly, we might use other instruments. One of these is the analysis of standardised residuals.

The formula for the standardised residuals $$u_t$$ differs depending on the assumed distribution, and for some distributions it corresponds to the value inside the “$$\exp$$” part of the probability density function:

1. Normal, $$\epsilon_t \sim \mathcal{N}(0, \sigma^2)$$: $$u_t = \frac{e_t -\bar{e}}{\hat{\sigma}}$$;
2. Laplace, $$\epsilon_t \sim \mathcal{Laplace}(0, s)$$: $$u_t = \frac{e_t -\bar{e}}{\hat{s}}$$;
3. S, $$\epsilon_t \sim \mathcal{S}(0, s)$$: $$u_t = \frac{e_t -\bar{e}}{\hat{s}^2}$$;
4. Generalised Normal, $$\epsilon_t \sim \mathcal{GN}(0, s, \beta)$$: $$u_t = \frac{e_t -\bar{e}}{\hat{s}^{\frac{1}{\beta}}}$$;
5. Inverse Gaussian, $$1+\epsilon_t \sim \mathcal{IG}(1, \sigma^2)$$: $$u_t = \frac{1+e_t}{\bar{e}}$$;
6. Gamma, $$1+\epsilon_t \sim \mathcal{\Gamma}(\sigma^{-2}, \sigma^2)$$: $$u_t = \frac{1+e_t}{\bar{e}}$$;
7. Log Normal, $$1+\epsilon_t \sim \mathrm{log}\mathcal{N}\left(-\frac{\sigma^2}{2}, \sigma^2\right)$$: $$u_t = \frac{e_t -\bar{e} +\frac{\hat{\sigma}^2}{2}}{\hat{\sigma}}$$.

Here $$\bar{e}$$ is the mean of residuals, which is typically assumed to be zero, and $$u_t$$ is the value of standardised residuals. Note that the scales in the formulae above should be calculated via the formula with the bias correction, i.e. with the division by degrees of freedom, not the number of observations. Also, note that in the case of $$\mathcal{IG}$$, $$\Gamma$$ and $$\mathrm{log}\mathcal{N}$$ and additive error models, the formulae for the standardised residuals will be the same, only the assumptions will change (see Section 5.5).
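As a minimal sketch of the first two formulae, the code below standardises a vector of residuals under the Normal and the Laplace assumptions. Everything in it is hypothetical: the residuals are simulated, the number of estimated parameters k is made up, and the bias-corrected Laplace scale is taken to be the sum of absolute deviations divided by the degrees of freedom, which is my reading of the note above rather than a confirmed implementation.

```r
set.seed(41)
e <- rnorm(100, mean = 0, sd = 5)   # simulated residuals (hypothetical)
k <- 3                              # hypothetical number of estimated parameters
degreesOfFreedom <- length(e) - k
eBar <- mean(e)

# Normal: bias-corrected standard deviation
sigmaHat <- sqrt(sum((e - eBar)^2) / degreesOfFreedom)
uNormal <- (e - eBar) / sigmaHat

# Laplace: bias-corrected mean absolute deviation (assumed form of the scale)
sHat <- sum(abs(e - eBar)) / degreesOfFreedom
uLaplace <- (e - eBar) / sHat

summary(uNormal)
summary(uLaplace)
```

Because the scale of the original data is removed, both series are centred at zero and can be compared across models fitted to differently scaled data.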

Here is an example of a plot of fitted vs standardised residuals in R (Figure 14.9):

plot(adamSeat03, 2, main="")

Figure 14.9: Standardised residuals vs Fitted for pure additive ETSX model.

Given that the scale of the original variable is now removed in the standardised residuals, it might be easier to spot the non-linearity. In our case, in Figure 14.9, it is still not apparent, but there is a slight curvature in the LOWESS line and a slight change in the variance. Another plot that we have already used before is standardised residuals over time (Figure 14.10):

plot(adamSeat03, 8, main="")

Figure 14.10: Standardised residuals vs Time for pure additive ETSX model.

The plot in Figure 14.10 does not show any apparent non-linearity in the residuals, so it is not clear whether any transformations are needed or not.

However, based on my judgment and understanding of the problem, I would expect the number of injuries and deaths to change proportionally to the change of the level of the data. If, after some external interventions, the overall level of injuries and deaths decreased, we would expect a percentage decline, not a unit decline with a change of already existing variables in the model. This is why I will try a multiplicative model next (transforming explanatory variables as well):

adamSeat05 <- adam(Seatbelts, "MNM",
                   formula=drivers~log(PetrolPrice)+log(kms)+law)
plot(adamSeat05, 2, main="")

Figure 14.11: Standardised residuals vs Fitted for pure multiplicative ETSX model.

The plot in Figure 14.11 shows that the variability is now slightly more uniform across all fitted values, but the difference between Figures 14.9 and 14.11 is not very prominent. One of the potential solutions in this situation is to compare the models in terms of information criteria:

setNames(c(AICc(adamSeat03), AICc(adamSeat05)),
c("Additive model", "Multiplicative model"))
##       Additive model Multiplicative model
##             2424.123             2406.366

Based on this, we would be inclined to select the multiplicative model. My judgment in this specific case agrees with the information criterion.
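To make the comparison more tangible, here is a rough base R sketch of the same idea with plain regressions standing in for the ADAM models, so the numbers will differ from the output above. It assumes Normal errors, computes AICc by hand, and moves the log-likelihood of the log model back to the original scale of the data by subtracting the sum of log(y), since likelihoods of y and log(y) are not directly comparable. The helper aiccByHand() and the object names are introduced here for illustration only.

```r
# Stand-in models (assumption): linear regressions instead of ADAM ETSX
seat <- as.data.frame(Seatbelts)
fitAdd <- lm(drivers ~ PetrolPrice + kms + law, data = seat)
fitMult <- lm(log(drivers) ~ log(PetrolPrice) + log(kms) + law, data = seat)

# AICc = -2 logL + 2k + 2k(k+1)/(n-k-1)
aiccByHand <- function(logLikValue, k, n) {
  -2 * logLikValue + 2 * k + 2 * k * (k + 1) / (n - k - 1)
}

n <- nrow(seat)
k <- attr(logLik(fitAdd), "df")   # coefficients plus the variance

llAdd <- as.numeric(logLik(fitAdd))
# Jacobian adjustment: express the likelihood in terms of drivers, not log(drivers)
llMult <- as.numeric(logLik(fitMult)) - sum(log(seat$drivers))

aiccValues <- setNames(c(aiccByHand(llAdd, k, n), aiccByHand(llMult, k, n)),
                       c("Additive model", "Multiplicative model"))
aiccValues
```

The model with the lower value would be preferred, with the usual caveat that information criteria only rank the candidates under the assumed distributions.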

We could also investigate the need for transformations of explanatory variables, but the interested reader is encouraged to do this analysis on their own.

Finally, non-linear transformations are not limited to the logarithm. There are many others, some of which are discussed in Chapter 11 of Svetunkov (2022a).

### References

• Cleveland, W.S., 1979. Robust Locally Weighted Regression and Smoothing Scatterplots. Journal of the American Statistical Association. 74, 829–836. https://doi.org/10.2307/2286407
• Svetunkov, I., 2022a. Statistics for business analytics. https://openforecast.org/sba/ (version: 31.03.2022)