smooth in python: Non-normal distributions in ETS/ARIMA

So, you know quite well that the normal distribution is one of the most popular distributions in statistics. The reasons are manifold, including convenience for the academic community and the fact that it is taught in every single statistics course in the world. But what if we don’t want to be normal?

There are situations where non-normal distributions fit considerably better. The main candidate for substitution is the conditional distribution of the response variable. For example, sales of engines cannot follow the normal distribution by definition: they are intermittent and integer-based (you cannot sell 1.78 engines). More generally, while demand can be fractional, it cannot be negative. In these situations, it is only logical to use distributions that support positive values only. Examples include Log-Normal, Gamma, and Inverse Gaussian, among many others.
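To make the point about negative values concrete, here is a minimal sketch using scipy (not the smooth package). The demand figures (mean 2, standard deviation 1.5) are made up for illustration, and the four distributions are moment-matched so that they share the same mean and variance.

```python
import math
from scipy import stats

# Hypothetical low-volume demand: mean 2 units, standard deviation 1.5 units.
mean, sd = 2.0, 1.5
var = sd ** 2

# Parameters chosen so that every candidate has the same mean and variance.
sigma2 = math.log(1.0 + var / mean ** 2)  # squared log-scale sd of the Log-Normal
lam = mean ** 3 / var                     # shape parameter of the Inverse Gaussian

candidates = {
    "Normal": stats.norm(loc=mean, scale=sd),
    "Log-Normal": stats.lognorm(
        s=math.sqrt(sigma2), scale=math.exp(math.log(mean) - sigma2 / 2)
    ),
    "Gamma": stats.gamma(a=mean ** 2 / var, scale=var / mean),
    "Inverse Gaussian": stats.invgauss(mu=mean / lam, scale=lam),
}

# Probability mass that each distribution places on impossible negative demand.
neg_prob = {name: float(dist.cdf(0.0)) for name, dist in candidates.items()}
for name, p in neg_prob.items():
    print(f"{name:17s} P(demand < 0) = {p:.4f}")
```

The normal distribution assigns a noticeable probability to negative demand here, while the three positive distributions assign none.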

In my last paper with John Boylan (this one), we discussed how ETS can be extended to use these three distributions instead of the normal one. I implemented this functionality (together with others, such as Laplace and Generalised Normal) in ADAM. It supports any of these distributions with any ETS/ARIMA/regression model, for both additive and multiplicative error terms. There is some maths involved, which you can find here and here.

Why bother? The main benefit is in the predictive distribution. If the data is not normal but we assume that it is, we may end up with poorly calibrated forecasts and misleading prediction intervals. Using a more appropriate distribution can resolve this.
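As a rough illustration of what "misleading prediction intervals" can look like, the sketch below assumes that the true predictive distribution is a log-normal with arbitrary parameters and compares it with a normal approximation that has the same mean and standard deviation. It uses scipy only and is not how ADAM constructs its intervals.

```python
from scipy import stats

# Assumed "true" predictive distribution: skewed and strictly positive.
true_dist = stats.lognorm(s=1.0, scale=1.0)

# Normal approximation with the same mean and standard deviation.
normal_approx = stats.norm(loc=true_dist.mean(), scale=true_dist.std())

# Nominal 95% prediction interval from each distribution.
normal_lower, normal_upper = normal_approx.ppf([0.025, 0.975])
true_lower, true_upper = true_dist.ppf([0.025, 0.975])

# Actual probability (under the true distribution) of missing the normal bounds.
miss_below = float(true_dist.cdf(normal_lower))
miss_above = float(true_dist.sf(normal_upper))

print(f"Normal interval:     [{normal_lower:.2f}, {normal_upper:.2f}]")
print(f"Log-Normal interval: [{true_lower:.2f}, {true_upper:.2f}]")
print(f"Miss below: {miss_below:.3f}, miss above: {miss_above:.3f} (nominal 0.025 each)")
```

The normal lower bound is negative, so it is never violated, while the upper bound is exceeded more often than the nominal 2.5%: the interval is too wide on one side and too narrow on the other.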

But how do we choose the right distribution for our data?

A possible solution (similar to selecting ETS components) is to fit models with different distributions and pick the one with the lowest information criterion. This is implemented in the ADAM function from the smooth package. We can do this manually, or use AutoADAM (called auto.adam in R) to select the most suitable distribution based on AICc automatically:

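from fcompdata import AirPassengers
from smooth import AutoADAM

# lags=[12]: monthly data with a yearly seasonal cycle
# orders=None: do not search over ARIMA orders
model = AutoADAM(lags=[12], h=12, holdout=True, orders=None, verbose=True)
model.fit(AirPassengers.y)
model.summary()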

The orders=None argument stops the function from trying different ARIMA orders – something we will come back to in a future post. For this example, the output is:

Model estimated using ADAM() function: ETS(MAM)
Response variable: y
Distribution used in the estimation: Normal
Loss function type: likelihood; Loss function value: 523.2756
Coefficients:
       Estimate  Std. Error  Lower 2.5%  Upper 97.5%   
alpha    0.7575      0.0895      0.5807       0.9343  *
beta     0.0000      0.0080      0.0000       0.0158   
gamma    0.0000      0.0503      0.0000       0.0994   
Error standard deviation: 0.0358
Sample size: 144
Number of estimated parameters: 4
Number of degrees of freedom: 140
Information criteria:
      AIC     AICc       BIC      BICc
1054.5512 1054.839 1066.4305 1067.1455

Boring… the function found that the Normal distribution has the lowest AICc among those tested – the Air Passengers data is too well-behaved.
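The selection principle itself (fit several distributions, compare AICc, keep the smallest) can be sketched by hand. The example below is only an illustration under simplifying assumptions: it fits plain distributions to a simulated i.i.d. sample with scipy rather than fitting full ETS models as ADAM does, and the data, the seed and the helper aicc() are made up for this purpose.

```python
import numpy as np
from scipy import stats

# Simulated positive, skewed "demand" (an assumption, not real data).
rng = np.random.default_rng(42)
y = rng.lognormal(mean=1.0, sigma=1.0, size=500)


def aicc(log_lik, k, n):
    """Corrected AIC for log-likelihood log_lik, k parameters, n observations."""
    aic = 2 * k - 2 * log_lik
    return aic + 2 * k * (k + 1) / (n - k - 1)


# Candidate families; the location of the positive ones is fixed at zero,
# so each candidate has two estimated parameters.
families = {
    "Normal": (stats.norm, {}),
    "Log-Normal": (stats.lognorm, {"floc": 0}),
    "Gamma": (stats.gamma, {"floc": 0}),
    "Inverse Gaussian": (stats.invgauss, {"floc": 0}),
}

scores = {}
for name, (family, fixed) in families.items():
    params = family.fit(y, **fixed)                # maximum likelihood estimates
    log_lik = family.logpdf(y, *params).sum()      # log-likelihood at the optimum
    scores[name] = aicc(log_lik, k=2, n=len(y))

best = min(scores, key=scores.get)
for name, value in sorted(scores.items(), key=lambda item: item[1]):
    print(f"{name:17s} AICc = {value:.2f}")
print("Selected:", best)
```

On data this skewed, the normal distribution should not come out on top. As a sanity check of the formula, aicc(-523.2756, k=4, n=144) reproduces the AICc of roughly 1054.839 reported in the summary above.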

Oh, and don’t forget to produce the forecasts:

model.predict(h=18, interval="prediction")

Smooth forecasting!

Install smooth: pip install smooth
