Chapter 14 Model diagnostics

In this chapter, we investigate how ADAM models can be diagnosed and improved. Most topics build upon the typical model assumptions discussed in Subsection 1.4.1 and in Section 12 of the Svetunkov (2021b) textbook. Some of the assumptions cannot be diagnosed properly, but there are well-established instruments for the others. We will consider the following assumptions and discuss how to check whether they are violated:

  1. Model is correctly specified:
     1. No omitted variables;
     2. No redundant variables;
     3. The necessary transformations of the variables are applied;
     4. No outliers in the model.
  2. Residuals are i.i.d.:
     1. They are not autocorrelated;
     2. They are homoscedastic;
     3. The expectation of residuals is zero, no matter what;
     4. The residuals follow the specified distribution;
     5. The distribution of residuals does not change over time.
  3. The explanatory variables are not correlated with anything but the response variable:
     1. No multicollinearity;
     2. No endogeneity (not discussed in the context of ADAM).

All model diagnostics are aimed at spotting patterns in residuals. If there are patterns, then something is probably missing in the model. In this chapter, we will discuss which instruments can be used to diagnose violations of the different assumptions.

Note that the proposed analysis is based mainly on visual inspection of various plots. While there are statistical tests for some assumptions, we do not discuss them here. This is because human judgment is typically more reliable than just p-values, and people tend to misuse the latter.
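As a minimal sketch of what "a pattern in residuals" looks like, the following example uses simulated data and the base R lm() function rather than adam(), so that it runs without additional packages. The data, the object names and the coefficients are all hypothetical: the true relation is quadratic, and the first model deliberately omits the squared term.

```r
# Simulated data (hypothetical): the true relation is quadratic in x
set.seed(41)
x <- runif(200, 0, 10)
y <- 5 + 2 * x + 0.5 * x^2 + rnorm(200, 0, 2)

# Misspecified model: the necessary transformation (x^2) is omitted
modelLinear <- lm(y ~ x)
# Model that includes the missing term
modelQuadratic <- lm(y ~ x + I(x^2))

# Residuals vs fitted values for both models, side by side
oldPar <- par(mfrow = c(1, 2))
plot(fitted(modelLinear), resid(modelLinear),
     xlab = "Fitted", ylab = "Residuals", main = "Omitted x^2")
# The LOWESS line makes the curvature in the residuals easier to see
lines(lowess(fitted(modelLinear), resid(modelLinear)), col = "red")
abline(h = 0, lty = 2)
plot(fitted(modelQuadratic), resid(modelQuadratic),
     xlab = "Fitted", ylab = "Residuals", main = "With x^2")
lines(lowess(fitted(modelQuadratic), resid(modelQuadratic)), col = "red")
abline(h = 0, lty = 2)
par(oldPar)
```

In the first panel, the smoothed line should bend away from zero, hinting that something is missing in the model. In the second panel, it should stay close to the horizontal line at zero.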

To make this more actionable, we will consider a conventional regression model on the Seatbelts data, discussed in Section 10.6. We start with a pure regression, which can be estimated equally well with the adam() function from the smooth package or with alm() from the greybox package in R. In general, I recommend using alm() when no dynamic elements are present in the model. Otherwise, use adam() in the following way:

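library(smooth)
# Pure regression: "NNN" switches off the ETS components
adamModelSeat01 <- adam(Seatbelts, "NNN",
                        formula=drivers~PetrolPrice+kms)
# which=7: actuals and fitted values over time
plot(adamModelSeat01, 7, main="")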

Figure 14.1: Basic regression model for the data of Road Casualties in Great Britain 1969–84.

This model has several issues, and in this chapter, we will discuss how to diagnose and fix them.
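For the static case mentioned above, the same regression could be estimated with alm(). This is only a sketch: the object names seatbeltsData and almModelSeat01 are hypothetical, and converting the time series matrix to a data frame is a precaution rather than a documented requirement.

```r
library(greybox)

# Convert the multivariate ts object to a data frame (a precaution, see above)
seatbeltsData <- as.data.frame(Seatbelts)

# Static regression with the same formula as in the adam() call
almModelSeat01 <- alm(drivers ~ PetrolPrice + kms, seatbeltsData)

# Coefficients and their standard errors
summary(almModelSeat01)
```

Because this model has no dynamic elements, it should suffer from the same issues as the adam() version.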

References

• Svetunkov, I., 2021b. Statistics for business analytics. https://openforecast.org/sba/ (version: 01.10.2021)