Time Series Forecasting

Time Series

Example of a monthly time series with a stable seasonal pattern.

A time series describes the temporal development of a variable such as sales, stock prices, inventory levels, or even temperature. The observation periods of a time series are regular: the values are recorded annually, monthly, daily, etc. Time series serve as the basis for analyzing past values and for forecasting future developments.

Granularity

The granularity of a time series is the temporal frequency at which observations are recorded. A time series has monthly granularity when one observation is recorded per month, and daily granularity when one observation is recorded per day.

Seasonality

Seasonal time series on a monthly basis with a season length of 12 months: the first half of the year (Jan-Jun) is generally stronger than the second half (Jul-Dec), with a pronounced dip in December.

Seasonality refers to a typical structural component in time series: seasonality occurs when cyclical, repeating patterns are found in the time series. The length of the period after which these seasonal patterns repeat is referred to as the season length. Monthly data, for example, often exhibit seasonality with a season length of 12 months.

Trend

Time series on a monthly basis with an upward trend starting in July 1990.

In addition to seasonality, the trend of a time series is another important structural characteristic: a trend exists when there is a clear directional development within the time series, either upward (positive growth) or downward (negative growth). Different types of trends can be modeled: linear trend, parabolic trend, exponential trend, etc. In practice, the focus is often on linear trends and dampened linear trends. While a linear trend extrapolates a recognized trend linearly into the future, a dampened trend models trend saturation, where the trend weakens over time.

Indicator

To model a time series, in addition to intrinsic structural components like trend and seasonality, external context information and influencing factors are often relevant. If an external influencing factor provides relevant information with a temporal lead, it is referred to as an indicator. An indicator anticipates future developments in the time series to be forecasted. The associated temporal shift between the indicator and the time series to be forecasted is called the lag of the indicator. The forecast horizon of a forecasting model with a leading indicator typically corresponds to the lag of the indicator.

Shifting the external influencing factor three months into the future reveals that the upward and downward movements of the lagged influencing factor closely align with those of the time series to be forecasted (black). With a three-month lead, the influencing factor anticipates the expected development of the time series and thus adds value for forecasting. In a forecasting procedure, it can therefore be used effectively as an indicator with a lag of 3.

Outliers

The usually weak December value of the time series is significantly higher in 2008 than in other years: an outlier in the seasonal pattern.

Outliers are extraordinary values in the time series history that disrupt the usual structure of the series. Outliers may arise from exceptional situations or data errors. Such exceptional values can distort the model estimation and therefore the forecasts. It is important to identify outliers based on the data and, if necessary, replace them with an appropriate substitute value.

Classification of Time Series

Time series can differ significantly in quality. Different types of time series require different forecasting methods to detect existing patterns in the historical data and appropriately account for them in the forecast. It is therefore advisable to first analyze the type of time series at hand and make an initial selection of suitable methods, which can later be examined in more detail and ranked. A simple classification based on Syntetos, Boylan & Croston (2005) distinguishes four types, according to how regularly demand occurs and how strongly the demand sizes vary:

  1. Smooth: regular demand occurrences, low variability of the demand sizes
  2. Erratic: regular demand occurrences, high variability of the demand sizes
  3. Intermittent: sporadic demand occurrences, low variability of the demand sizes
  4. Lumpy: sporadic demand occurrences, high variability of the demand sizes

In addition, there are other "trivial" types in which no patterns can typically be detected.

Autocorrelation

Autocorrelation means that a time-varying quantity is correlated with itself, shifted by a fixed time unit. For example, the maximum temperatures of a day are positively autocorrelated with the maximum temperatures of the previous day. A very hot day is often followed by another day with a similarly high temperature.

The time series is gradually shifted one month into the future, and its correlation with the original, unshifted version is determined at each step.
The correlation values are plotted against the respective time shift (lag). The relatively high correlation at lag 1 indicates that the individual time series values are not independent of each other: each value reacts to its predecessor.
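The shift-and-correlate procedure described above can be sketched in a few lines of Python (the sample data here are purely illustrative):

```python
def autocorrelation(series, lag):
    """Pearson correlation of a series with itself, shifted by `lag`."""
    n = len(series) - lag
    x = series[:n]        # original values
    y = series[lag:]      # values shifted by `lag` periods
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

# Smoothly varying daily maximum temperatures: neighboring values are
# similar, so the lag-1 autocorrelation is clearly positive.
temps = [20, 22, 23, 25, 24, 22, 21, 19, 18, 20, 21, 23]
print(round(autocorrelation(temps, 1), 2))
```

Evaluating the function over a range of lags and plotting the results yields exactly the autocorrelation plot described above.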

Cross-correlation

Cross-correlation measures how strongly two time series correlate with each other at different time shifts. It essentially examines at which time offset the two time series align best. It is important to note that trends and seasonality in the time series can distort the results and should potentially be removed beforehand.

The influencing factor (yellow) is gradually shifted one month into the future (or past), and its correlation with the reference time series (black) is determined at each step.
The correlation values are plotted against the respective time shift (lag). The highest correlation between the two series occurs at a shift of three months into the future.
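The same shifting idea can be sketched in code. In this illustrative example, the indicator leads the target series by exactly three months, so the cross-correlation peaks at lag 3:

```python
def cross_correlation(x, y, lag):
    """Correlation between y and x shifted `lag` periods into the future."""
    x, y = x[:len(x) - lag], y[lag:]
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

indicator = [1, 3, 2, 5, 4, 6, 5, 7, 6, 8, 7, 9, 8, 10, 9]
target = [0, 0, 0] + indicator[:-3]   # target follows the indicator 3 months later

best_lag = max(range(1, 7), key=lambda lag: cross_correlation(indicator, target, lag))
print(best_lag)   # → 3
```

In line with the caveat above, real data would typically be detrended and deseasonalized before this calculation, since shared trend or seasonality inflates the correlations at many lags.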

Forecasting

(Point) forecasts are estimates for the values of a time series for a future period. Statistical forecasts are made using so-called forecasting models.

Plot of a time series with forecast for the next 12 months

Forecast Period / Forecast Horizon

The forecast period (also called the forecast horizon) refers to the period for which predictions are to be made, e.g., 5 days, 6 months, 7 years.

Plot of a time series with forecast over a forecast period of 12 months

Forecast Step

The forecast period describes the length of the future time for which forecasts are provided. Depending on the granularity of the time series, different numbers of forecast steps are needed. For example, for a monthly time series, a forecast period of twelve months will involve twelve forecast steps. For a daily time series, the same period of twelve months will involve about 12 x 30 = 360 forecast steps. As the number of forecast steps increases, the uncertainty typically rises, and thus, the quality of the forecast decreases.

Prediction Interval

A (point) forecast will rarely hit the future actual value exactly. The forecast is always associated with some degree of uncertainty. This uncertainty can be quantified using a prediction interval. The prediction interval describes a range of values around the statistical point forecast, within which the actual future value will fall with a given probability, the prediction confidence level.

Prediction Confidence Level

To quantify the uncertainty of a statistical forecast, the forecast value is accompanied by a prediction interval and an associated prediction confidence level. The confidence level for the interval measures the accuracy of the prediction interval, indicating how likely it is that the interval will cover the future value. The higher the prediction confidence level, the more likely the interval will cover the future value. For example, a prediction confidence level of 95% means that out of 100 prediction intervals calculated in a specific way, 95 will contain the true (future) values of the time series. In approximately 5% of the cases, however, the true values will lie outside of the intervals.
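As a minimal sketch, assuming normally distributed forecast errors, a 95% prediction interval can be derived from the point forecast and the standard deviation of historical forecast errors (all numbers are illustrative):

```python
# Sketch: 95% prediction interval under the assumption of normally
# distributed forecast errors; sigma is estimated from historical
# one-step forecast errors (illustrative values).
errors = [1.2, -0.8, 0.5, -1.5, 0.9, -0.3, 1.1, -0.7]
n = len(errors)
mean_err = sum(errors) / n
sigma = (sum((e - mean_err) ** 2 for e in errors) / (n - 1)) ** 0.5

point_forecast = 100.0
z = 1.96  # standard normal quantile for a 95% confidence level
lower = point_forecast - z * sigma
upper = point_forecast + z * sigma
print(f"95% prediction interval: [{lower:.2f}, {upper:.2f}]")
```

A higher confidence level (e.g., 99%, z ≈ 2.58) widens the interval; a lower one narrows it.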

Plot of forecasts with prediction interval for 95% confidence level

Forecasting Method

A forecasting method refers to a data-based procedure used to identify certain structures in a time series and make them useful for a forecast. It essentially defines the rules for creating the forecast and estimates the most appropriate forecasting model for the given time series. There are various classical statistical forecasting methods. Regression or machine learning techniques can also be used as forecasting methods. Different methods focus on different structural components of the time series (trend, seasonality, external influences, adaptability, etc.). Important examples of statistical forecasting methods include the moving average and exponential smoothing. Finding the most suitable method for a given time series, with appropriate settings, is the subject of model selection.

Naive Forecast

The naive forecast is a simple, intuitive forecasting method. For the forecast, the most recent value of the given time series is projected forward into the future as a constant.

Moving Average

The moving average is a simple, intuitive forecasting method. To create a forecast, the arithmetic mean of the most recent data points of the given time series (e.g., all data points of the last quarter) is calculated and projected into the future as a constant. The number of recent data points included in the averaging is referred to as the order of the moving average and must be defined a priori. A special case of the moving average is the naive forecast (a moving average of order 1).
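Both methods can be stated in a few lines; the naive forecast falls out as the moving average of order 1 (sample data illustrative):

```python
def moving_average_forecast(series, order, horizon):
    """Mean of the last `order` values, projected forward as a constant."""
    level = sum(series[-order:]) / order
    return [level] * horizon

history = [100, 104, 98, 102, 106, 110]
print(moving_average_forecast(history, 3, 2))   # → [106.0, 106.0]
print(moving_average_forecast(history, 1, 2))   # naive forecast → [110.0, 110.0]
```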

Autoregressive Integrated Moving Average (ARIMA)

An ARIMA model (ARIMA = Autoregressive Integrated Moving Average) is a model for the analysis and forecasting of time series, which includes past values of the time series itself as well as past error terms. The analysis can be performed on the raw data or on (repeatedly) differenced data. Seasonality and exogenous factors can also be included in ARIMA models.
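In practice an ARIMA model would be fitted with a statistical library. Purely to illustrate the building blocks, the following sketch fits only the autoregressive part: an AR(1) model estimated by least squares on first-differenced data (the "I" of order 1); the data are illustrative:

```python
# Sketch of the AR and I components of ARIMA: difference the series
# once, fit an AR(1) model by least squares, forecast one step ahead,
# then undo the differencing. A full ARIMA would also model the MA part.
series = [100, 102, 105, 104, 108, 111, 110, 114, 117, 116]
diffs = [b - a for a, b in zip(series, series[1:])]   # first differencing

x = diffs[:-1]          # change at time t-1
y = diffs[1:]           # change at time t
mx, my = sum(x) / len(x), sum(y) / len(y)
phi = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
c = my - phi * mx       # intercept

next_diff = c + phi * diffs[-1]      # one-step forecast of the change
forecast = series[-1] + next_diff    # integrate: undo the differencing
print(round(forecast, 2))
```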

Exponential Smoothing with Covariates (ESCov)

Exponential smoothing is a well-established method for the analysis and forecasting of time series, which can take into account level, trend, and multiple seasonal components. In this method, earlier time series values are usually weighted less than the recent history. The extension "Exponential Smoothing with Covariates" (ESCov) can additionally handle exogenous influences.
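A minimal sketch of simple exponential smoothing (without trend, seasonality, or covariates) illustrates the weighting idea; the smoothing parameter alpha controls how quickly older values lose influence (data illustrative):

```python
def exponential_smoothing(series, alpha):
    """Simple exponential smoothing: recent values weigh more.
    Returns the final smoothed level, used as a constant forecast."""
    level = series[0]
    for value in series[1:]:
        level = alpha * value + (1 - alpha) * level
    return level

history = [100, 104, 98, 102, 106, 110]
print(round(exponential_smoothing(history, 0.3), 2))   # → 104.64
```

Unrolling the recursion shows that a value k periods in the past receives the weight alpha * (1 - alpha)^k, i.e., the weights decay exponentially.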

TBATS

TBATS is an extension of exponential smoothing developed by De Livera, Hyndman & Snyder (2011), which is particularly advantageous for complex seasonal patterns. Seasonality modeling is carried out using Fourier analysis and trigonometric functions. The name TBATS is an acronym that summarizes the capabilities of the method: trigonometric functions for modeling multiple seasonality (T), Box-Cox transformation (B), ARMA error modeling (A), trend (T), and seasonality (S).

Croston Method (Croston)

The Croston method is a forecasting technique for intermittent time series. The method was proposed by Croston in 1972 for forecasting sporadic demand for items. The method separately models the size of demand events (= non-zero values of the time series) and the duration between two consecutive events (= zero intervals), usually using exponential smoothing, and then derives a forecast from this.
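A sketch of the classic version, using exponential smoothing for both components (the smoothing parameter and demand history are illustrative):

```python
def croston(series, alpha=0.1):
    """Croston's method: smooth demand sizes and inter-demand
    intervals separately; forecast per period = size / interval."""
    size = None               # smoothed demand size
    interval = None           # smoothed interval between demand events
    periods_since_demand = 1
    for value in series:
        if value > 0:
            if size is None:  # initialize on the first demand event
                size, interval = value, periods_since_demand
            else:
                size = alpha * value + (1 - alpha) * size
                interval = alpha * periods_since_demand + (1 - alpha) * interval
            periods_since_demand = 1
        else:
            periods_since_demand += 1
    return size / interval

demand = [0, 0, 5, 0, 0, 0, 4, 0, 6, 0, 0, 3]
print(round(croston(demand), 2))   # → 1.61
```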

In addition to the classic version of Croston, there are now several extensions and variations of the method, such as the Teunter-Syntetos-Babai method (TSB).

Teunter-Syntetos-Babai Method (TSB)

TSB is an extension of Croston developed by Teunter, Syntetos, and Babai (2011), which addresses and overcomes two disadvantageous aspects of the original version:

  1. Positive bias in the forecasts
  2. Slow reaction to demand that is dying out (obsolescence)

Essentially, this is achieved by TSB shifting from modeling the duration between two events (zero intervals) to modeling the probabilities of their occurrence.
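This shift can be sketched as follows: unlike Croston, the occurrence probability is updated in every period, so the forecast decays toward zero when demand dries up. The starting probability and the smoothing parameters are illustrative assumptions:

```python
def tsb(series, alpha=0.1, beta=0.1):
    """TSB: smooth the demand probability in EVERY period and the
    demand size only in demand periods; forecast = probability * size."""
    prob, size = 0.5, None   # neutral starting probability (assumption)
    for value in series:
        occurred = 1.0 if value > 0 else 0.0
        prob = beta * occurred + (1 - beta) * prob   # updated every period
        if value > 0:
            size = value if size is None else alpha * value + (1 - alpha) * size
    return prob * size

demand = [0, 0, 5, 0, 0, 0, 4, 0, 6, 0, 0, 3]
print(round(tsb(demand), 2))
# Trailing zero-demand periods push the forecast toward zero (obsolescence):
print(round(tsb(demand + [0] * 24), 2))
```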

Model Selection and Validation

Covariate / Influencing Factor

A covariate in a (statistical) forecasting model refers to an influencing factor that acts as a predictor, i.e., it potentially influences the dependent variable being forecasted and is therefore considered in the forecasting model. For example, the daily maximum temperature might be a covariate for modeling and forecasting the daily electricity consumption of a city.

Forecast Error

A forecast error refers to the difference between the predicted and the actual value that occurred.

Comparison of forecasts and actual values in time series forecasting

Goodness Measure

To evaluate the quality of a model, different goodness measures can be constructed or applied. Most of these goodness measures are based on an evaluation of the forecast errors. Examples of such criteria include MAE (Mean Absolute Error), MAPE (Mean Absolute Percentage Error), MSE (Mean Squared Error), and PIS (Periods in Stock).
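Three of these measures, written out directly from their definitions (numbers illustrative):

```python
def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def mse(actual, predicted):
    """Mean squared error: penalizes large errors more strongly."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

def mape(actual, predicted):
    """Mean absolute percentage error; actual values must be nonzero."""
    return 100 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

actual = [100, 110, 120, 130]
predicted = [98, 113, 118, 135]
print(mae(actual, predicted), round(mape(actual, predicted), 2), mse(actual, predicted))
# → 3.0 2.56 10.5
```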

Backtesting

Backtesting refers to a strategy for evaluating the quality of a forecast model. In this process, the model's forecasts for a past period (e.g., the past year) are simulated. These are then compared with the already known actual values for that period.
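A rolling-origin backtest can be sketched as follows, here with a naive forecast standing in for the model (any forecasting method could be plugged in; data illustrative):

```python
# Rolling-origin backtest: for each simulated forecast origin, use only
# the data known up to that point, forecast one step ahead, and compare
# against the actual value.
series = [100, 104, 98, 102, 106, 110, 108, 112, 116, 114, 118, 122]
holdout = 4   # simulate forecasts for the last 4 known periods

errors = []
for origin in range(len(series) - holdout, len(series)):
    train = series[:origin]      # only data known at the forecast origin
    forecast = train[-1]         # naive one-step-ahead forecast
    errors.append(abs(series[origin] - forecast))

print(sum(errors) / len(errors))   # backtest MAE → 3.5
```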

Model Selection

In model selection, the most appropriate forecast model for a given time series is automatically identified, and the corresponding model parameters are optimally adjusted.

Ensemble Methods

Ensemble methods combine the individual forecasts of various base models (e.g., ARIMA, exponential smoothing, ...) into a single overall forecast. The core idea behind an ensemble is that by combining the different models, individual tendencies are balanced out, leading to a forecast with higher quality. The selection and weighting of the base models for the ensemble can be done based on the results of each model from the backtesting.
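One common weighting scheme, sketched here with two base forecasts and weights proportional to the inverse backtest MAE of each model (all numbers illustrative):

```python
# Sketch: weighted ensemble of two base forecasts. The model with the
# smaller backtest error receives the larger weight.
forecast_a = [105.0, 106.0, 107.0]   # e.g., from exponential smoothing
forecast_b = [101.0, 102.0, 103.0]   # e.g., from an ARIMA model
mae_a, mae_b = 2.0, 4.0              # backtest MAE of each base model

w_a = (1 / mae_a) / (1 / mae_a + 1 / mae_b)   # = 2/3
w_b = 1 - w_a                                 # = 1/3
ensemble = [w_a * a + w_b * b for a, b in zip(forecast_a, forecast_b)]
print([round(v, 2) for v in ensemble])   # → [103.67, 104.67, 105.67]
```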

Aggregation

Hierarchical Aggregation

In many cases, time series are hierarchically organized or can be grouped and aggregated into different levels using context attributes. For example, when looking at monthly sales data for items, the total sales of all items, the total sales of all items per region, the sales of each individual item, or even the sales of each individual item per customer might be considered.

When identifying an optimal aggregation level for modeling and forecasting, the specific application goal plays a central role, as well as the question of which level allows the best identification and learning of patterns, structures, and relationships in the data.

With hierarchical forecasts, multiple hierarchical levels can be linked, and consistent forecasts across these levels can be generated.
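The simplest way to link two levels is bottom-up aggregation, where item-level series are summed to the total level so that both levels stay consistent (item names and numbers are illustrative):

```python
# Bottom-up aggregation: summing item-level monthly sales yields the
# total-level series, so forecasts made per item can be rolled up
# consistently to the total.
item_sales = {
    "item_A": [10, 12, 11],
    "item_B": [5, 4, 6],
    "item_C": [8, 9, 7],
}
total = [sum(month) for month in zip(*item_sales.values())]
print(total)   # → [23, 25, 24]
```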

Temporal Aggregation

Through temporal aggregation, a time series is transformed into a new time series with coarser granularity. For example, a monthly time series of monthly sales can be aggregated into a yearly time series of annual sales by summing the twelve monthly sales. In this example, summation is used as the aggregation function; depending on the specific question, other functions, such as the mean, median, or maximum, might be considered.
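The monthly-to-yearly example with summation as the aggregation function can be sketched directly (24 illustrative monthly values aggregated into 2 yearly values):

```python
# Temporal aggregation: 24 monthly sales values summed into 2 yearly values.
monthly = [10, 12, 11, 9, 14, 13, 15, 16, 12, 11, 10, 9,
           11, 13, 12, 10, 15, 14, 16, 17, 13, 12, 11, 10]
yearly = [sum(monthly[i:i + 12]) for i in range(0, len(monthly), 12)]
print(yearly)   # → [142, 154]
```

Swapping `sum` for `max` or a mean function yields the other aggregation variants mentioned above.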

For forecasting, it is usually appropriate to choose the granularity that corresponds to the forecasting goal. If the goal is to forecast the sales for the next few months, the monthly time series of monthly sales should be used as the data basis. The alternative approach of forecasting daily sales based on daily data and then aggregating the forecasts temporally to obtain monthly sales predictions usually leads to less accurate forecasts. The same holds true for the calculation of monthly sales forecasts from a predicted annual sales total (by dividing by the number of months). However, for long forecast horizons, the usual monthly-based forecasts can often be improved by combining them with the latter approach.

