Good news, everyone! The “smooth” package is now available on CRAN, so it is time to look into what this package can do and why it is needed at all. The package itself contains some documentation that you can use as a starting point. For example, there are vignettes, which show the included functions and what they allow you to do, but do not go into the details of what happens inside those functions. If you don’t have a personal life and are ready to spend some time reading semi-scientific materials with formulae, then I have something especially for you: the smooth documentation, which describes in detail why the models used in the package make sense, how they are optimised and what this leads to. Here we will look into some of the functions and their application to real-life forecasting problems.
This is the first post in a series on “smooth” functions. We start with es(), Exponential Smoothing.
What is es() and why do we need it?
“ES” stands for “Exponential Smoothing”. The function is an implementation of the ETS model, an alternative to Rob Hyndman’s ets() function from the “forecast” package. The natural question is: “Why bother if an exponential smoothing function already exists?” There are several reasons for that, some of which are illustrated with code sketches after the list:
- ets() does not allow constructing some of the mixed models. For example, the classical Holt-Winters model, which corresponds to ETS(A,A,M), is unavailable. This is not a big loss, but such models may be of interest to researchers. So the es() function has all 30 of them;
- ets() does not support exogenous variables. es() allows providing either a vector or a matrix of exogenous variables. Note, however, that currently only additive ETS models are guaranteed to work well with them; the other models will most probably work fine too, but I cannot guarantee stable performance;
- There are quite a few papers in the forecasting field showing that combining forecasts from different models increases forecasting accuracy. Stephan Kolassa applied one neat combination method, described in the Burnham and Anderson book, to exponential smoothing and showed that it improves forecast accuracy. So I took that method and implemented it in the es() function;
- The ets() function restricts the number of seasonal coefficients to 24. The motivation is that with a higher number of coefficients the optimiser will do a lousy job of finding optimal values of the parameters. es() doesn’t have this restriction, on the grounds that users should be responsible for their own actions. Every action has consequences, you know? So be smart and responsible!
- The ets() function does not allow defining initial values of the state vector. In general, who cares, right? But they may become important in cases when optimisation of the initial states is not an option. So I have implemented two methods of initialisation (optimisation and backcasting) and allowed users to provide their own initial values if they really need to. This also helps with the problem of a large number of parameters caused by the number of seasonal coefficients: if you deal with weekly data, you can switch to backcasting or provide your own values for the seasonal indices;
- MSE as the only cost function looked too boring, so I have introduced MAE, cost functions based on multiple-steps-ahead forecasting errors and the exotic “Half Absolute Moment”. Why? Well, just to make life more exciting! (Although there is a rationale for this, which is discussed in the documentation.)
- There are several ways of constructing prediction intervals. es() allows selecting between parametric, semi-parametric and non-parametric ones. This additional flexibility helps in cases when, for some reason, the basic assumptions of ETS do not hold;
- The underlying statistical model derived for es() differs slightly from the one in ets(). This becomes especially evident when models with multiplicative errors are used: there we assume that the error term is distributed log-normally, in contrast to the assumption of normality in ets();
- All the functions of the “smooth” package can deal with intermittent data. This is based on very recent research done in collaboration with John Boylan. So the es() function can produce forecasts using Croston’s model (not method, this is not a typo!) and some other intermittent demand models. We can even select between intermittent and conventional models using this approach. This is still ongoing research, so more models with more advanced mechanisms will follow;
- Last but not least, there is a fancy holdout parameter, which splits the provided time series into two parts, fits the model on the first one and assesses forecast accuracy on the second.
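To give a flavour of how the points above translate into code, here is a minimal sketch of the basic calls: an explicitly specified model, automatic model selection and the combination of models, all with a holdout for accuracy evaluation. The argument names follow the es() documentation at the time of writing; check ?es if your installed version differs.

```r
library(smooth)

# Classical Holt-Winters, ETS(A,A,M), withholding the last 12 observations
fitAAM <- es(AirPassengers, model="AAM", h=12, holdout=TRUE)

# Automatic selection among the ETS models
fitAuto <- es(AirPassengers, model="ZZZ", h=12, holdout=TRUE)

# Combination of the forecasts from different ETS models using information criteria weights
fitComb <- es(AirPassengers, model="CCC", h=12, holdout=TRUE)
```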
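Exogenous variables, the alternative initialisation and the additional cost functions are switched on in a similar way. Treat this as a sketch only: the exogenous variable here is an artificial one made up for illustration, and argument names such as cfType have changed in later versions of the package (e.g. to loss), so consult ?es for your version.

```r
# An artificial exogenous variable covering both the in-sample and the holdout parts
x <- 1:length(AirPassengers)
fitX <- es(AirPassengers, model="AAN", xreg=x, h=12, holdout=TRUE)

# Backcasting of the initial states instead of their optimisation
fitBack <- es(AirPassengers, model="AAA", initial="backcasting", h=12, holdout=TRUE)

# MAE instead of MSE as the cost function ("HAM" is another option)
fitMAE <- es(AirPassengers, model="ANN", cfType="MAE", h=12, holdout=TRUE)
```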
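Finally, prediction intervals and intermittent data are controlled by separate arguments. Again, this is only a sketch: the Poisson series below is artificial and is only there to produce something intermittent, and the exact argument names (intervals, intermittent) may differ between versions of the package, so check the help page.

```r
# Semi-parametric prediction intervals for a multiplicative model
fitInt <- es(AirPassengers, model="MAM", h=12, holdout=TRUE,
             intervals="semiparametric", level=0.95)

# Intermittent data: Croston-style model or automatic selection
set.seed(41)
y <- ts(rpois(120, 0.3))
fitCroston <- es(y, model="ANN", intermittent="croston", h=12)
fitIAuto   <- es(y, model="ANN", intermittent="auto", h=12)
```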
Having said that, I am not claiming that the es() function always produces more accurate forecasts than ets(). From what I have seen so far, it is sometimes more accurate and sometimes less accurate than ets(). However, the main point of the function is not accuracy but flexibility: you can do much more with it than with ets(), and the only restriction is your imagination! If you don’t have imagination and / or just need an efficient exponential smoothing function, don’t waste your time with es() and use Rob’s ets().