What’s wrong with ARIMA?

Have you heard of ARIMA? It is one of the benchmark forecasting models used in different academic experiments, although it is not always popular among practitioners. But why? What’s wrong with ARIMA?

ARIMA has been a standard forecasting model in statistics for ages. It gained popularity with the famous Box & Jenkins (1970) book and was considered the best forecasting model by statisticians for a couple of decades without any strong evidence to support this.

It represents one of the two fundamental approaches to time series modeling (the second being the state space approach): it captures the relation between the variable and its own past values. This has a solid rationale in technical areas. For example, the quantity of CO2 in a furnace at this moment in time will depend on the quantity of CO2 five minutes ago. Such processes can be efficiently modeled and then forecasted using ARIMA. In demand forecasting, making sense of ARIMA is more challenging: it is hard to argue that the demand for shoes on Monday can impact the demand on Tuesday. So, when we apply ARIMA to such data, we sort of rely on a spurious relation. Still, demand data often exhibits autocorrelation, and ARIMA has been used effectively in that context.
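
To make the "variable depends on its own past" idea concrete, here is a minimal sketch of the simplest case, an AR(1) process, written in plain Python. The coefficient 0.8, the series length and the helper names `simulate_ar1` and `estimate_phi` are assumptions made purely for illustration; they do not come from any specific package. The sketch simulates a series where each value is a fraction of the previous one plus noise, and then recovers that fraction by least squares.

```python
import random


def simulate_ar1(phi, n, sigma=1.0, seed=42):
    """Simulate y_t = phi * y_{t-1} + e_t with Gaussian noise e_t."""
    rng = random.Random(seed)
    y = [0.0]  # arbitrary starting value
    for _ in range(n - 1):
        y.append(phi * y[-1] + rng.gauss(0.0, sigma))
    return y


def estimate_phi(y):
    """Least-squares estimate of phi from regressing y_t on y_{t-1} (no intercept)."""
    numerator = sum(cur * prev for prev, cur in zip(y[:-1], y[1:]))
    denominator = sum(prev * prev for prev in y[:-1])
    return numerator / denominator


# Hypothetical "furnace-like" process: today's level is 80% of the previous one plus noise.
series = simulate_ar1(phi=0.8, n=5000)
phi_hat = estimate_phi(series)
print(f"estimated phi: {phi_hat:.3f}")
```

With a long enough series, the estimate should land close to the value used in the simulation. A full ARIMA adds differencing and moving average terms on top of this, which is where the estimation becomes harder.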

Over the years, ARIMA has not performed well in forecasting competitions (see my post about that), but this was mainly due to the wrong assumptions in the Box-Jenkins methodology, not because the model is fundamentally bad. After Hyndman & Khandakar (2008) implemented their version with automatic order selection based on information criteria, ARIMA started producing much more accurate forecasts.

But if I were to summarize what the problem with the model is, I would outline these points:

  1. It is hard to explain ARIMA to people who are not comfortable with statistics. Here is an example of how seasonal ARIMA(1,0,1)(1,0,1)_4 is written mathematically:
    y_t (1 -\phi_{4,1} B^4)(1 -\phi_{1} B) = \epsilon_t (1 + \theta_{4,1} B^4) (1 + \theta_{1} B).
    Good luck explaining this to a demand planner who does not know mathematics.
  2. It is hard to estimate, especially for models with seasonality. It is typically estimated via numerical optimisation, and reaching the global maximum of the likelihood (or the global minimum of a loss function) is not guaranteed.
  3. It is hard to select the appropriate order of the model, as there can be thousands of potential models to choose from. Yes, there are heuristic approaches that simplify the problem and select a reasonable model (e.g. Hyndman & Khandakar, 2008; or Svetunkov & Boylan, 2017), but they do not guarantee that you will get the best possible model.
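
To get a feeling for the "thousands of potential models" in the last point, here is a rough sketch that simply counts the candidate seasonal ARIMA(p,d,q)(P,D,Q) specifications on a grid. The upper bounds on the orders below are assumptions picked for illustration, not the defaults of any particular implementation.

```python
from itertools import product

# Hypothetical upper bounds on the orders (illustrative, not any package's defaults).
max_p, max_d, max_q = 5, 2, 5  # non-seasonal AR order, differences, MA order
max_P, max_D, max_Q = 2, 1, 2  # seasonal AR order, differences, MA order

# Every combination of orders, each with and without a constant term.
candidates = list(
    product(
        range(max_p + 1), range(max_d + 1), range(max_q + 1),
        range(max_P + 1), range(max_D + 1), range(max_Q + 1),
        (False, True),
    )
)

print(len(candidates))  # 6 * 3 * 6 * 3 * 2 * 3 * 2 = 3888
```

Each of these candidates would need its own numerical optimisation, which is why practical implementations search only a small part of the grid using heuristics rather than fitting everything.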

Nonetheless, ARIMA is a strong contender that can outperform other models if implemented well. Furthermore, it has become one of the standard benchmarks in forecasting-related experiments. So, if you are a data scientist comfortable with mathematics and want to see how your machine learning approach performs, you should consider including ARIMA as a benchmark.

P.S. Check out a post by Nicolas Vandeput on LinkedIn – he had a discussion about ARIMA and raised good points as well.
