
3.2 Classical Seasonal Decomposition

3.2.1 How to do it?

One of the classical textbook methods for decomposing a time series into unobservable components is called “Classical Seasonal Decomposition” (Warren M. Persons, 1919). It assumes either a pure additive or a pure multiplicative model, is done using centred moving averages and is focused on approximation, not on forecasting. The idea of the method can be summarised in the following steps:

  1. Decide which of the models to use based on the type of seasonality in the data: additive (3.1) or multiplicative (3.2);
  2. Smooth the data using a centred moving average (CMA) of order equal to the periodicity of the data \(m\). If \(m\) is an odd number then the formula is: \[\begin{equation} d_t = \frac{1}{m}\sum_{i=-(m-1)/2}^{(m-1)/2} y_{t+i}, \tag{3.4} \end{equation}\] which means that, for example, the value on Thursday is the average of values from Monday to Sunday. If \(m\) is an even number then a different weighting scheme is typically used, involving the inclusion of an additional value: \[\begin{equation} d_t = \frac{1}{m}\left(\frac{1}{2}\left(y_{t+m/2}+y_{t-m/2}\right) + \sum_{i=-(m-2)/2}^{(m-2)/2} y_{t+i}\right), \tag{3.5} \end{equation}\] which means that, for monthly data, we use half of the December of the previous year and half of the December of the current year in order to calculate the centred moving average in June. The values \(d_t\) are placed in the middle of the window going through the series (e.g. on Thursday the average will contain values from Monday to Sunday).

The resulting series is deseasonalised. When we average, for example, sales over a year, we automatically remove the potential seasonality, which can be observed individually in each month. A drawback of using CMA is that we inevitably lose observations at the beginning and the end of the series: \(\frac{m}{2}\) on each side for even \(m\) and \(\frac{m-1}{2}\) on each side for odd \(m\).

In R, the ma() function from the forecast package implements CMA.
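To see what the two weighting schemes in (3.4) and (3.5) do, here is a minimal sketch in base R. The helper cma() is hypothetical (it is not part of any package) and is built on filter() from the stats package. The toy linear series is assumed purely for illustration.

```r
# Hypothetical helper (not from any package): centred moving average of order m
cma <- function(y, m) {
  if (m %% 2 == 1) {
    # Odd m: equal weights 1/m over a window of m observations, as in (3.4)
    w <- rep(1 / m, m)
  } else {
    # Even m: window of m + 1 observations with half weights at both ends
    w <- c(0.5, rep(1, m - 1), 0.5) / m
  }
  # sides = 2 centres the window around each observation
  stats::filter(y, filter = w, sides = 2)
}

# Toy linear series, assumed for illustration only
y <- ts(1:48, frequency = 12)
dOdd <- cma(y, 7)
dEven <- cma(y, 12)

# Observations lost at both ends taken together
sum(is.na(dOdd))    # 6, i.e. (7 - 1) / 2 on each side
sum(is.na(dEven))   # 12, i.e. 12 / 2 on each side
```

Because the weights are symmetric and sum to one, the CMA of a straight line reproduces the line itself wherever the window fits, so only the first and last observations are lost.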

  3. De-trend the data:
  • For the additive decomposition this is done using: \({y^\prime}_t = y_t - d_t\);
  • For the multiplicative decomposition, it is: \({y^\prime}_t = \frac{y_t}{d_t}\);
  4. If the data is seasonal, then the average value for each period is calculated based on the de-trended series. For example, we produce average seasonal indices for each January, February, etc. This will give us the set of seasonal indices \(s_t\);
  5. Calculate the residuals based on what you assume in the model:
  • additive seasonality: \(e_t = y_t - d_t - s_t\);
  • multiplicative seasonality: \(e_t = \frac{y_t}{d_t s_t}\);
  • no seasonality: \(e_t = {y^\prime}_t\).

Note that the functions in R typically allow you to select between additive and multiplicative seasonality. There is no option for “none” and so even if the data is not seasonal you will nonetheless get values for \(s_t\) in the output. Also, notice that the classical decomposition assumes that there is a deseasonalised series \(d_t\) but does not make any further split of this variable into level \(l_t\) and trend \(b_t\).
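The steps above can be traced by hand in base R. The following is a sketch for the additive case on the AirPassengers data, not a replacement for the existing functions. The variable names are chosen for illustration, and the centring of the seasonal indices around zero is an extra step that is not in the list above: it is included because decompose() from the stats package does it, which makes the two results comparable.

```r
y <- AirPassengers
m <- frequency(y)                        # 12 for monthly data

# Step 2: centred moving average for even m (half weights at both ends)
d <- stats::filter(y, c(0.5, rep(1, m - 1), 0.5) / m, sides = 2)

# Step 3: de-trend the data (additive case)
yPrime <- y - d

# Step 4: average the de-trended values for each month
period <- as.integer(cycle(y))
indices <- as.numeric(tapply(as.numeric(yPrime), period, mean, na.rm = TRUE))
# Assumption: centre the indices around zero, as decompose() does
indices <- indices - mean(indices)
s <- ts(indices[period], start = start(y), frequency = m)

# Step 5: residuals for additive seasonality
e <- y - d - s

# Compare with the built-in function; all three should return TRUE
ref <- decompose(y, type = "additive")
all.equal(as.numeric(d), as.numeric(ref$trend))
all.equal(as.numeric(s), as.numeric(ref$seasonal))
all.equal(as.numeric(e), as.numeric(ref$random))
```

For the multiplicative case the subtractions become divisions, and decompose() rescales the indices so that they average to one instead of zero.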

3.2.2 A couple of examples

An example of the classical decomposition in R is the decompose() function from the stats package. Here is an example with a pure multiplicative model and the AirPassengers data:

ourDecomposition <- decompose(AirPassengers,
                              type="multiplicative")
plot(ourDecomposition)

We can see that the function has smoothed the original series and produced the seasonal indices. The trend component has gaps at the beginning and at the end because the method relies on CMA (see above). The error term still contains some seasonal elements, which is a downside of such a simple decomposition procedure. However, the lack of precision in this method is compensated by the simplicity and speed of calculation. Note again that the trend component in the decompose() function is in fact \(d_t = l_{t}+b_{t}\).
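The same observations can be checked numerically rather than read off the plot. The sketch below inspects the object returned by decompose(). The names checkDecomposition and errorACF are introduced here for illustration only, and the autocorrelation at lag 12 is just one rough indicator of leftover seasonality.

```r
checkDecomposition <- decompose(AirPassengers, type = "multiplicative")

# Gaps in the trend: m/2 = 6 missing values at each end of the monthly series
sum(is.na(checkDecomposition$trend))

# The twelve seasonal indices, one per month; decompose() scales them
# so that they average to one in the multiplicative case
round(checkDecomposition$figure, 3)
mean(checkDecomposition$figure)

# Autocorrelation of the error term; na.omit() drops the gaps at the ends
errorACF <- acf(na.omit(checkDecomposition$random), lag.max = 24, plot = FALSE)
# Element 13 corresponds to lag 12, i.e. the same month one year earlier
round(errorACF$acf[13], 2)
```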

Here is an example of decomposition of non-seasonal data (we assume a pure additive model in this example):

y <- ts(c(1:100)+rnorm(100,0,10),frequency=12)
ourDecomposition <- decompose(y, type="additive")
plot(ourDecomposition)

As you can see, the original data is not seasonal but the decomposition assumes that it is and proceeds with the default approach returning a seasonal component. You get what you ask for.

3.2.3 Other techniques

There are other techniques that decompose series into error, trend and seasonal components but make different assumptions about each component. The general procedure, however, always remains the same: (1) smooth the original series, (2) extract the seasonal components, (3) smooth them out. The methods differ in the smoother they use (LOESS, for example, uses a bisquare function instead of CMA), and in some cases multiple rounds of smoothing are performed to make sure that the components are split correctly.

There are many functions in R that implement seasonal decomposition. Here is a small selection:

  • decomp() from the tsutils package does classical decomposition and fills in the tail and head of the smoothed trend with forecasts from exponential smoothing;
  • stl() from the stats package uses a different approach: seasonal decomposition via LOESS. It is an iterative algorithm that smooths the states and allows them to evolve over time. So, for example, the seasonal component in STL can change;
  • mstl() from the forecast package does the STL for data with several seasonalities;
  • msdecompose() from the smooth package does a classical decomposition for multiple seasonal series.
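As an illustration of how the seasonal component can evolve in STL, here is a minimal sketch using stl() from the stats package. The log transformation and the value s.window = 13 are assumptions made for this example: stl() is additive, so taking logarithms is one way to handle the multiplicative seasonality of AirPassengers, and the seasonal window is picked arbitrarily.

```r
# stl() fits an additive decomposition, hence the logarithms (an assumption
# made for this sketch); s.window = 13 is an arbitrary odd smoothing window
ourSTL <- stl(log(AirPassengers), s.window = 13)

# The components are stored as columns of a multivariate time series
components <- ourSTL$time.series
colnames(components)

# Seasonal pattern in the first and in the last year of the data
seasonalFirst <- as.numeric(components[1:12, "seasonal"])
seasonalLast <- as.numeric(components[133:144, "seasonal"])
round(cbind(seasonalFirst, seasonalLast), 3)

# The three components add up to the (logged) original series
all.equal(as.numeric(rowSums(components)), as.numeric(log(AirPassengers)))
```

Calling plot(ourSTL) produces a figure similar to the one from decompose(). Setting s.window = "periodic" instead forces the seasonal pattern to be the same in every year.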

3.2.4 “Why bother?”

“Why decompose?” you may wonder at this point. Understanding the idea behind decompositions and how to perform them helps in understanding ETS, which relies on it. From a practical point of view, it can be useful if you want to see if there is a trend in the data and whether the residuals contain outliers or not. It will not show you if the data is seasonal, as the seasonality is assumed in the decomposition (I stress this because many students think otherwise). Additionally, when seasonality cannot be added to the model under consideration, decomposing the series, predicting the trend and then reseasonalising can be a viable solution. Finally, the values from the decomposition can be used as starting points for the estimation of components in ETS or other dynamic models relying on the error-trend-seasonality.
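The "decompose, predict the trend, reseasonalise" idea can be sketched in a few lines of base R. Everything beyond decompose() here is an assumption made for illustration: the straight line fitted with lm() stands in for whatever model cannot handle seasonality itself, the horizon h is arbitrary, and the variable names are hypothetical.

```r
y <- AirPassengers
m <- frequency(y)
h <- 12                                  # forecast horizon, chosen for illustration

seasonalParts <- decompose(y, type = "multiplicative")

# Deseasonalise by dividing the data by the seasonal indices
yDeseasonalised <- y / seasonalParts$seasonal

# Assumption: a straight line in time stands in for the non-seasonal model
trendData <- data.frame(value = as.numeric(yDeseasonalised), t = seq_along(y))
trendModel <- lm(value ~ t, data = trendData)
trendForecast <- predict(trendModel, newdata = data.frame(t = length(y) + 1:h))

# Reseasonalise: pick the indices of the periods that follow the last observation.
# This indexing assumes that the series starts in the first period of the cycle,
# which holds for AirPassengers (January 1949)
lastPeriod <- as.integer(cycle(y))[length(y)]
nextPeriods <- (lastPeriod + 0:(h - 1)) %% m + 1
ourForecast <- ts(as.numeric(trendForecast) * seasonalParts$figure[nextPeriods],
                  start = tsp(y)[2] + 1 / m, frequency = m)
ourForecast
```

The linear trend is a deliberately crude choice. The point is only the order of operations: remove the seasonal indices, forecast what is left, and put the indices back.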

References

• Warren M. Persons, 1919. General Considerations and Assumptions. The Review of Economics and Statistics. 1, 5–107. https://doi.org/10.2307/1928754