Forecast quality is about more than just accuracy

Stefan de Kok, Principal Data Science Architect

Forecast quality

There are two dominant camps on how the quality of a forecast should be judged. The traditional camp, which originated in and is still favored by academia, looks only at the intrinsic qualities of a forecast, such as accuracy and bias. The second camp, which has emerged more recently from industry, objects that academic metrics do not fully reflect the impact on the business. Its argument is that we forecast not for the sake of forecasting but to improve the business. This camp promotes metrics such as Forecast Value Add (FVA) and more exotic correlations to impacts on downstream processes, aiming to measure that improvement, for example the effect of the forecast on inventory levels or expediting.

Here we make the case that while the approaches of both camps may be necessary, they are not sufficient in practice. This blog highlights two additional aspects of a forecast that are often ignored but are critical to a business: stability and desirability.

Forecast stability

None of the metrics that either camp uses to judge forecast quality considers change over time. If we generate a forecast this month, it may be accurate. If we generate another forecast next month, it may also be accurate. But how much did it change? Maybe both had an error of around 20%, but one was positively biased and the other negatively biased. In that case, the forecasted quantities could differ by as much as 40% when measured against the same actual. Figure 1 shows an example of two such forecasts with equal error but opposite bias.

Figure 1: Forecast instability
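
To make the arithmetic concrete, here is a minimal sketch in Python. The actual value of 100 and the names `abs_pct_error`, `forecast_this_month`, and `forecast_next_month` are assumptions chosen for illustration; they are not taken from Figure 1.

```python
# Hypothetical numbers for illustration only; not taken from Figure 1.
actual = 100.0

forecast_this_month = actual * 1.20  # 20% above the actual (positive bias)
forecast_next_month = actual * 0.80  # 20% below the actual (negative bias)


def abs_pct_error(forecast, actual):
    """Absolute percentage error of a forecast, relative to the actual."""
    return abs(forecast - actual) / actual


error_this = abs_pct_error(forecast_this_month, actual)
error_next = abs_pct_error(forecast_next_month, actual)

# Change between the two forecasts, expressed relative to the same actual.
change = abs(forecast_this_month - forecast_next_month) / actual

print(f"error this month:         {error_this:.0%}")
print(f"error next month:         {error_next:.0%}")
print(f"change between forecasts: {change:.0%}")
```

Both forecasts score the same on error, yet anyone planning against them sees the quantity swing by twice that error from one cycle to the next.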

Forecast instability occurs when consecutive forecasts for the same period have dramatically different values. It is not uncommon for forecasts to change upwards of 10% when measured across entire years and entire portfolios from one forecast cycle to the next. Obviously, for individual time series, the changes are even greater. This is a significant contributing factor to the bullwhip effect that businesses experience.
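
One possible way to put a number on instability is to compare consecutive forecasts for the same target period, both per item and for the portfolio total. The sketch below is only an illustration under assumed data: the function `cycle_change`, the item names, and all quantities are hypothetical, and measuring the change relative to the previous forecast is one choice among several.

```python
def cycle_change(previous, current):
    """Absolute change between two consecutive forecasts for the same
    period, relative to the previous forecast."""
    return abs(current - previous) / previous


# Hypothetical forecasts for the same target period from two consecutive cycles.
previous_cycle = {"item_a": 100.0, "item_b": 200.0, "item_c": 50.0}
current_cycle = {"item_a": 140.0, "item_b": 170.0, "item_c": 75.0}

# Change per individual time series.
item_changes = {
    item: cycle_change(previous_cycle[item], current_cycle[item])
    for item in previous_cycle
}

# Change of the portfolio total; increases and decreases partly cancel out.
portfolio_change = cycle_change(
    sum(previous_cycle.values()), sum(current_cycle.values())
)

for item, change in item_changes.items():
    print(f"{item}: {change:.0%}")
print(f"portfolio: {portfolio_change:.0%}")
```

In this made-up data, the portfolio total moves far less than any single item because upward and downward revisions partly offset each other, which is why item-level instability tends to look worse than the aggregate suggests.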

Forecast desirability

While stability can be measured objectively, a subjective aspect of forecast quality is its desirability. When measured at agreed-upon lags, a generated forecast may have all the right metric values but not satisfy the customer’s expectations. Figure 2 below shows a few examples.

Figure 2: The expected forecast by a customer and three common undesirable forecasts

The examples in this case are clearly undesirable given the smooth historical pattern, even over a short time range. Most automated forecasts will not pick up on seasonality with such a brief history and could produce any of the example outputs. While this is understandable, even justifiable, to an expert, the customer is unlikely to agree. Automated forecasts are an area where the risk of overfitting (finding patterns where there are none) and the risk of underfitting (failing to find existing patterns) clash.


The examples included are anecdotal. In practice, the situation is typically not so clear. But the examples highlight that the quality of a forecast should not just be taken at face value. In business, each forecast is but one iteration in an endless cycle of iterations. What came before and what comes after matters, and ignoring that turns most quality assessments into academic exercises, even those that measure impact on the business.

Similarly, not all aspects of quality can be easily measured numerically. Often a visual inspection will expose issues hidden by the numbers. That is not to say such issues cannot be mitigated by better algorithms, but there will always be cases where the customer expects more. Often, they will be right. We humans perceive patterns even when they are pure coincidence. Forecast algorithms tend to err on the safe side. For desirability, maybe they should err on the daring side?


About the author

Stefan de Kok consults with o9 Solutions as a Principal Data Science Architect. He has 25 years of experience in supply chain planning, optimization, and forecasting across a wide range of industries. As an applied mathematician, he works mainly on model design and remediation, algorithm development, and metric design for quality measurement. He is recognized as an innovator on these topics, especially in the area of probabilistic planning and forecasting. He is an active blogger and is currently writing a textbook on these topics.

