Frankly speaking, I didn’t see the point in discussing MAPE when I wrote recent posts on error measures. However, I’ve received several comments and messages from data scientists and demand planners asking for clarification. So, here it is.
TL;DR: Avoid using MAPE!
MAPE, or Mean Absolute Percentage Error, is a still-very-popular-in-practice error measure, which is calculated by taking the absolute difference between the actual value and the forecast and dividing it by the actual value:
\begin{equation*}
\mathrm{MAPE} = \mathrm{mean} \left(\frac{|actual - forecast|}{actual} \right) \times 100\% .
\end{equation*}
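To make the definition concrete, here is a minimal sketch in Python (the `mape()` function and the numbers in the example are mine, purely for illustration):

```python
import numpy as np

def mape(actual, forecast):
    """Mean Absolute Percentage Error, reported in percent."""
    actual, forecast = np.asarray(actual, dtype=float), np.asarray(forecast, dtype=float)
    return np.mean(np.abs(actual - forecast) / actual) * 100

# A five-step holdout with errors of 2-10 units on sales of roughly 100
print(mape([100, 120, 90, 110, 105], [102, 115, 95, 100, 110]))  # ~5.1
```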
The rationale is clear: we need to get rid of scale, and we want something that measures accuracy well and is easy to calculate and interpret. Unfortunately, MAPE does none of these things well, and here is why.
1. It is sensitive to the scale of the data: if you have sales in thousands of units, the actual values in the denominator will drag the overall measure down, and you will get a very low number even if the model is not doing well. Similarly, very low volumes will inflate the measure to hundreds of percent, even if the model does a very good job (both effects are shown in the sketch after this list).
2. It is well known that MAPE favours forecasts that underestimate the actuals (Fildes, 1992): it is not symmetric and can be misleading. BTW, the “symmetric” MAPE is no better and is not symmetric either (Goodwin & Lawton, 1999).
3. It cannot be calculated on intermittent demand: whenever an actual value is zero, the division produces an infinite (or undefined) value.
4. Okay, it is easy to interpret, fair enough. But the value itself does not tell you anything about the performance of your model (see point 1 above).
5. And it is not clear what point forecast minimises it (remember this post?).
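To see issues (1) and (3) in numbers, here is a toy sketch, again using a home-made `mape()`, so the exact values are only illustrative:

```python
import numpy as np

def mape(actual, forecast):
    actual, forecast = np.asarray(actual, dtype=float), np.asarray(forecast, dtype=float)
    return np.mean(np.abs(actual - forecast) / actual) * 100

# Issue (1): similar absolute errors, very different volumes.
print(mape([1000, 1050, 980], [950, 1100, 1030]))  # ~5%  -- looks great on high volumes
print(mape([3, 2, 4], [5, 4, 2]))                  # ~72% -- looks awful on low volumes

# Issue (3): a single zero actual blows the measure up.
print(mape([0, 5, 3], [1, 4, 3]))                  # inf (numpy warns about division by zero)
```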
In fact, anything that has “APE” in it will have similar issues.
Right. So, how can we fix that?
The main problem of MAPE arises because the forecast error is divided by the actual values from the holdout sample. If we change the denominator, we solve problems (1) and (2).
Hyndman & Koehler (2006) proposed a solution: take the Mean Absolute Error (MAE) of the forecast and divide it by the mean absolute first differences of the in-sample data. The latter step is done purely for scaling reasons, and we end up with something called “MASE”, which does not have issues (1), (2) and (5), but is not easy to interpret.
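A minimal sketch of this idea in Python (the function name and arguments are mine; `insample` is the training series, used only for scaling):

```python
import numpy as np

def mase(actual, forecast, insample):
    """MAE of the holdout errors, scaled by the in-sample mean absolute first differences."""
    actual, forecast, insample = (np.asarray(x, dtype=float) for x in (actual, forecast, insample))
    mae = np.mean(np.abs(actual - forecast))
    scale = np.mean(np.abs(np.diff(insample)))  # MAE of the one-step naive forecast in-sample
    return mae / scale

# Training series for scaling, then a three-step holdout
print(mase(actual=[110, 105, 98], forecast=[104, 104, 104], insample=[100, 96, 105, 102]))  # ~0.81
```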
The problem with MASE is that it is minimised by the median and, as a result, is not appropriate for intermittent demand. But there is a good alternative based on the Root Mean Squared Error (RMSE), called RMSSE (Makridakis et al., 2022), which uses the same logic as MASE: take the RMSE and divide it by the in-sample root mean squared differences. It is still hard to interpret, but at least it ticks the other four boxes.
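The same sketch, adapted to RMSSE (again, the naming is mine):

```python
import numpy as np

def rmsse(actual, forecast, insample):
    """RMSE of the holdout errors, scaled by the in-sample root mean squared first differences."""
    actual, forecast, insample = (np.asarray(x, dtype=float) for x in (actual, forecast, insample))
    mse = np.mean((actual - forecast) ** 2)
    scale = np.mean(np.diff(insample) ** 2)  # mean squared in-sample naive error
    return np.sqrt(mse / scale)

print(rmsse(actual=[110, 105, 98], forecast=[104, 104, 104], insample=[100, 96, 105, 102]))  # ~0.83
```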
If you really need the “interpretation” bit in your error measure, consider dividing MAE/RMSE by the in-sample mean of the data (Petropoulos & Kourentzes, 2015). This might not fix issue (1) completely, but at least it would solve the other four problems.
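For the MAE-based version, a possible sketch (the name `scaled_mae` is mine):

```python
import numpy as np

def scaled_mae(actual, forecast, insample):
    """MAE of the holdout errors divided by the in-sample mean of the data."""
    actual, forecast, insample = (np.asarray(x, dtype=float) for x in (actual, forecast, insample))
    return np.mean(np.abs(actual - forecast)) / np.mean(insample)

# Reads as "the average error is about 4% of the average level of the data"
print(scaled_mae(actual=[110, 105, 98], forecast=[104, 104, 104], insample=[100, 96, 105, 102]))  # ~0.043
```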
If you want to learn more about error measures, check out Section 2.1 of my monograph or read an old post of mine, “Naughty APEs and the quest for the holy grail”.
And here is a depiction of Mean APEs, inspired by my old post (thanks to Stephan Kolassa for the idea of the image):