1.2 Forecasting principles
If you have decided that you need to forecast something, then it makes sense to keep several important forecasting principles in mind.
First, as discussed earlier, you need to understand why the forecast is needed, how it will be used and by whom. The answers to these questions will guide you in deciding what technique to use, how exactly to produce the forecasts and what should be reported. For example, if a client does not know machine learning, it might be unwise to use Neural Networks for forecasting: they will not trust the technique, and thus will not trust the forecasts, reverting to simpler methods instead. If the final decision is to order some number of units, then it would be more reasonable to produce cumulative forecasts over the lead time (the time between the order and the product delivery) and to form safety stock based on the model and an assumed distribution.
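The order-quantity case above can be sketched in a few lines. Everything here is illustrative: the point forecasts, the error standard deviation, the assumption of independent normally distributed lead-time errors and the 95% service level are all made up for the example, not taken from the text.

```python
import math

# Hypothetical point forecasts for each period of a 5-period lead time
forecasts = [100, 102, 98, 105, 101]
sigma = 12.0  # assumed standard deviation of one-period forecast errors
lead_time = len(forecasts)

# Cumulative forecast over the lead time
cumulative = sum(forecasts)

# Safety stock under the assumed normal distribution with independent errors:
# the 95% quantile of the standard normal times the lead-time std deviation
z = 1.645  # approximately the 95% quantile of the standard normal
safety_stock = z * sigma * math.sqrt(lead_time)

order_up_to = cumulative + safety_stock
print(round(cumulative), round(safety_stock, 1), round(order_up_to, 1))
```

The decision (how many units to order) is then driven directly by the cumulative forecast plus the safety stock, rather than by the per-period point forecasts themselves.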
When you understand what to forecast and how, the second principle comes into play: select the relevant error measures. You need to decide how to measure the accuracy of forecasting methods, and that measure needs to be aligned as closely as possible with the final decision. For example, if you need to decide on the number of nurses for a specific day in the A&E department based on patient attendance, then it would be more reasonable to compare models in terms of their quantile performance (see Section 2.2) rather than the expectation or the median. Thus, it would be more appropriate to calculate the pinball loss instead of MAE or RMSE (see details in Section 2).
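To make the pinball loss concrete, here is a minimal sketch. The attendance figures and the two candidate forecasts are invented for illustration; the function itself is the standard quantile loss, averaged over observations.

```python
def pinball_loss(actuals, forecasts, quantile):
    """Average pinball (quantile) loss; lower is better.
    With quantile=0.5 this is half of MAE, i.e. it targets the median."""
    total = 0.0
    for y, f in zip(actuals, forecasts):
        if y >= f:
            total += quantile * (y - f)
        else:
            total += (1 - quantile) * (f - y)
    return total / len(actuals)

# Hypothetical daily A&E attendance and two candidate 95th-percentile forecasts
actuals = [210, 195, 230, 250, 205]
model_a = [240, 240, 240, 240, 240]  # a flat, conservative forecast
model_b = [215, 200, 225, 245, 210]  # a forecast tracking the level closely

print(pinball_loss(actuals, model_a, 0.95))
print(pinball_loss(actuals, model_b, 0.95))
```

Note the asymmetry: at the 0.95 quantile, under-forecasting is penalised 19 times more heavily than over-forecasting, which matches a staffing decision where running out of nurses is far costlier than having a spare one.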
Third, you should always test your models on data they have not seen. Train your model on one part of the sample and test it on another. This way you have some guarantee that the model does not overfit the data and that, when you need to produce a final forecast, it will be reasonable. Yes, there are cases when you do not have enough data to do that. All you can do in those situations is use simpler, robust models (such as damped trend exponential smoothing by Roberts (1982) and Gardner and McKenzie (1985) or Theta by Assimakopoulos and Nikolopoulos (2000)) and use judgment in deciding whether the final forecasts are reasonable or not. But in all other cases, you should test models on data they are not aware of. The recommended approach here is rolling origin, discussed in more detail in Section 2.4.
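The rolling origin idea can be sketched as follows: starting from some origin, fit on the observed part of the series, forecast ahead, record the error, then roll the origin forward and repeat. The series and the use of the Naive method as the forecasting function are illustrative assumptions.

```python
def rolling_origin_mae(series, forecast_fn, origin, horizon=1):
    """Rolling-origin evaluation: from `origin` onwards, repeatedly
    train on series[:t], forecast `horizon` steps ahead, roll forward."""
    errors = []
    for t in range(origin, len(series) - horizon + 1):
        train = series[:t]
        actual = series[t:t + horizon]
        forecast = forecast_fn(train, horizon)
        errors.extend(abs(a - f) for a, f in zip(actual, forecast))
    return sum(errors) / len(errors)

# Illustrative forecasting function: the Naive method repeats the last value
def naive(train, horizon):
    return [train[-1]] * horizon

series = [112, 118, 132, 129, 121, 135, 148, 148, 136, 119]
print(rolling_origin_mae(series, naive, origin=5))
```

Because every forecast is made only from data observed before the origin, the resulting error estimate reflects genuine out-of-sample performance rather than in-sample fit.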
Fourth, you need to have benchmark models. Always compare forecasts from your favourite approach with those from Naïve, the global average and/or regression, depending on what you do specifically. If your fancy Neural Network performs worse than Naïve, then it does not bring value and should not be used in practice. Comparing one Neural Network with another is also not a good idea, because Simple Exponential Smoothing (see Section 4.1), being a much simpler model, might beat both networks, and you would never find out. If possible, also compare forecasts from the proposed approach with forecasts from other well-established benchmarks, such as ETS (Hyndman et al., 2008), ARIMA (Box and Jenkins, 1976) and Theta (Assimakopoulos and Nikolopoulos, 2000).
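A benchmark comparison might look like the sketch below. The series is made up, the SES smoothing parameter is an arbitrary assumption rather than an estimate, and the "fancy" model's forecasts are invented stand-ins for whatever approach is being evaluated.

```python
def mae(actuals, forecasts):
    return sum(abs(a - f) for a, f in zip(actuals, forecasts)) / len(actuals)

# Hypothetical series split into training and holdout parts
train = [120, 124, 119, 130, 126, 128, 133, 129]
test = [135, 131]

# Benchmark 1 - Naive: repeat the last training observation
naive = [train[-1]] * len(test)

# Benchmark 2 - Simple Exponential Smoothing with an assumed parameter
alpha = 0.3
level = train[0]
for y in train[1:]:
    level = alpha * y + (1 - alpha) * level
ses = [level] * len(test)

# Stand-in forecasts from the "fancy" model under evaluation (made up here)
fancy = [137, 128]

for name, f in [("Naive", naive), ("SES", ses), ("Fancy", fancy)]:
    print(name, round(mae(test, f), 2))
```

Only if the proposed model beats both benchmarks on the holdout does it make a case for itself; losing to Naïve would mean it adds complexity without adding value.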
Finally, when comparing forecasts from different models, you might end up with several approaches performing very similarly. If the difference between them is not significant, then the general recommendation is to select the one that is faster and simpler. This is because simpler models are harder to break, and faster ones are more attractive in practice due to reduced energy consumption (let's save the planet!).
These forecasting principles do not guarantee that you will end up with the most accurate forecasts, but at least you will not end up with unreasonable ones.