Monash University

Optimal forecasts for hierarchical and grouped time series

posted on 2017-05-23, 05:12 authored by Shanika L Wickramasuriya
Large collections of time series often have aggregation constraints due to product or geographical groupings. When the time series disaggregation is unique, we refer to it as a hierarchical structure, whereas when it involves different paths of disaggregation, we refer to it as a grouped structure. In both of these structures, forecasts for the most disaggregated series are usually required to add up exactly to the forecasts of the aggregated series, a constraint known as aggregation consistency. Common approaches to obtaining a set of aggregate-consistent forecasts include the "bottom-up", "top-down" and "middle-out" approaches. The bottom-up approach forecasts all series at the most disaggregated level and sums the forecasts to form the forecasts for higher levels. In contrast, the top-down approach forecasts the most aggregated series and distributes these forecasts according to a set of proportions to form the forecasts for the disaggregated series. The middle-out approach combines the two: it forecasts the series at an intermediate level, then sums upwards as in the bottom-up approach and distributes downwards as in the top-down approach.
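
As a minimal sketch of the bottom-up and top-down approaches described above, consider a hypothetical two-level hierarchy in which Total = A + B. The summing matrix `S`, the function names and the forecast values are illustrative assumptions and are not taken from the thesis.

```python
import numpy as np

# Hypothetical two-level hierarchy: Total = A + B.
# Rows of S are ordered Total, A, B; columns are the bottom-level series A, B.
S = np.array([[1, 1],
              [1, 0],
              [0, 1]], dtype=float)

def bottom_up(bottom_forecasts):
    """Sum bottom-level forecasts to obtain forecasts for every series."""
    return S @ np.asarray(bottom_forecasts, dtype=float)

def top_down(total_forecast, proportions):
    """Split a forecast of the total across the bottom level, then sum back up."""
    proportions = np.asarray(proportions, dtype=float)
    if not np.isclose(proportions.sum(), 1.0):
        raise ValueError("proportions must sum to one")
    return S @ (proportions * total_forecast)

bu = bottom_up([30.0, 70.0])          # ordered Total, A, B
td = top_down(120.0, [0.25, 0.75])    # ordered Total, A, B
```

Both results are aggregate consistent by construction, because every forecast vector of the form `S @ b` satisfies Total = A + B.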

Considering the drawbacks of these approaches, Hyndman, Ahmed, Athanasopoulos, and Shang (2011) proposed forecasting all disaggregated and aggregated series independently, and then reconciling the forecasts so they become aggregate consistent. Their approach is based on a generalized least squares estimator, which requires an estimate of the covariance matrix of the reconciliation errors that arise due to aggregate inconsistency. In this thesis I show that this covariance matrix is impossible to estimate in practice due to identifiability conditions.

To overcome this issue, I propose a new forecast reconciliation approach that minimizes the mean squared error of aggregate-consistent forecasts across the entire collection of time series under the assumption of unbiasedness. The minimization problem has a closed-form solution similar to that of Hyndman et al. (2011), except that the required covariance matrix is estimable, leading to a feasible solution. I make this solution scalable by providing a computationally less demanding alternative representation. I refer to this approach as MinT.
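
The abstract does not state the closed-form solution. The sketch below assumes the form commonly given for MinT, S (S' W^-1 S)^-1 S' W^-1 ŷ, where S is the summing matrix, ŷ the vector of base forecasts and W the covariance matrix of the base forecast errors. The hierarchy, the function name `reconcile` and the numbers are hypothetical, and the scalable alternative representation mentioned above is not reproduced here.

```python
import numpy as np

def reconcile(S, W, base_forecasts):
    """Return S (S' W^-1 S)^-1 S' W^-1 y_hat, assuming W is symmetric positive definite."""
    y_hat = np.asarray(base_forecasts, dtype=float)
    W_inv_S = np.linalg.solve(W, S)      # W^-1 S, without forming the inverse
    A = S.T @ W_inv_S                    # S' W^-1 S
    b = W_inv_S.T @ y_hat                # S' W^-1 y_hat (uses symmetry of W)
    bottom = np.linalg.solve(A, b)       # reconciled bottom-level forecasts
    return S @ bottom

# Hypothetical hierarchy Total = A + B, rows ordered Total, A, B.
S = np.array([[1, 1],
              [1, 0],
              [0, 1]], dtype=float)

# Base forecasts that are not aggregate consistent: 40 + 50 != 100.
y_hat = np.array([100.0, 40.0, 50.0])

# With W equal to the identity, the formula reduces to ordinary least squares.
rec = reconcile(S, np.eye(3), y_hat)

# A diagonal W that trusts the bottom-level forecasts more than the total.
rec_weighted = reconcile(S, np.diag([4.0, 1.0, 1.0]), y_hat)
```

Whatever W is used, forecasts that are already aggregate consistent pass through unchanged, which is the unbiasedness property the derivation relies on.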

I evaluate the performance of MinT compared to alternative approaches in the literature using a series of simulation designs that take into account various features of the collection of time series. This is followed by an empirical application using Australian domestic tourism data. The results indicate that MinT performs well with both simulated and real data.

Many of the applications encountered in practice have fewer observations per series than the number of series in the structure. In such situations, the sample variance-covariance matrix is known to perform poorly, which makes the estimation of the covariance matrix challenging and requires the use of special strategies. I mainly focus on estimators that belong to the Stein-type shrinkage, sparse, and low rank + sparse classes. For applications with a hierarchical structure, it is reasonable to assume that the covariance matrix is a representation of some lower dimension; for example, a block diagonal structure where the series with the same parental node tend to behave more similarly than the series that are less closely related. Hence, I consider modifications for some of these estimators to incorporate prior knowledge about important relationships resulting from the aggregation constraints. In addition, I explore several possibilities for reducing the computational burden of low rank + sparse estimators.
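
To illustrate why the sample covariance matrix struggles when there are fewer observations than series, the sketch below shrinks it towards its own diagonal. This is one simple member of the Stein-type shrinkage class, not necessarily the estimator used in the thesis. The shrinkage intensity `lam` is fixed by hand here as an assumption; in practice it would be estimated from the data. The residuals are simulated.

```python
import numpy as np

def shrink_to_diagonal(residuals, lam):
    """Convex combination of the sample covariance and its diagonal.

    residuals: array of shape (n_observations, n_series)
    lam: shrinkage intensity in [0, 1]; 0 keeps the sample covariance.
    """
    sample_cov = np.cov(residuals, rowvar=False)
    target = np.diag(np.diag(sample_cov))
    return lam * target + (1.0 - lam) * sample_cov

rng = np.random.default_rng(0)
residuals = rng.normal(size=(5, 8))              # 5 observations, 8 series

sample_cov = np.cov(residuals, rowvar=False)     # 8 x 8, rank deficient
shrunk = shrink_to_diagonal(residuals, lam=0.5)  # 8 x 8, positive definite

sample_rank = np.linalg.matrix_rank(sample_cov)
smallest_eigenvalue = np.linalg.eigvalsh(shrunk).min()
```

With only five observations the sample covariance matrix of eight series cannot have full rank, so it cannot be inverted, whereas the shrunk matrix is positive definite for any `lam` greater than zero.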

I evaluate the performance of these competing estimators using a series of simulation designs and several empirical applications using Australian domestic tourism data. The results indicate that most of the covariance estimators contribute to substantially improving the forecast accuracy of MinT. In addition, I observe that prior knowledge of the correlation structure does not play a key role in improving the forecast accuracy. One of the estimators that belongs to the low rank + sparse category tends to behave unexpectedly in both simulation and empirical evaluations. Through a case study I identify the causes and propose remedies for obtaining an estimator that produces consistently better performance.

MinT and its variants can result in negative values for the reconciled forecasts even when the original set of forecasts is non-negative. This should be avoided in applications that are inherently non-negative in nature, especially when the forecasts are used for decisions or policy implementation processes. Hence, MinT is reformulated as a least squares minimization problem with non-negativity constraints. Considering the dimension and sparsity of the matrices involved and the alternative representation of MinT, this problem is solved using three algorithms: block principal pivoting, projected conjugate gradient and scaled gradient projection. I compare the performance of these algorithms using a series of simulations and an empirical application. The results reveal that the block principal pivoting algorithm outperforms the rest: a structure with nearly six hundred thousand series can be reconciled in less than one minute on a standard desktop computer. In addition, it is observed that the gains or losses in forecast accuracy due to non-negativity constraints are negligible.
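
The sketch below illustrates the non-negativity problem on a hypothetical hierarchy Total = A + B. It assumes the constrained problem takes the form of minimizing (ŷ - S b)' W^-1 (ŷ - S b) over bottom-level forecasts b >= 0. It uses plain projected gradient descent with a fixed step size, which is a simpler stand-in and not one of the three algorithms compared in the thesis. The function name and numbers are illustrative.

```python
import numpy as np

def reconcile_nonnegative(S, W, base_forecasts, n_iter=2000):
    """Projected gradient descent for min (y - S b)' W^-1 (y - S b) subject to b >= 0."""
    y_hat = np.asarray(base_forecasts, dtype=float)
    W_inv_S = np.linalg.solve(W, S)
    Q = S.T @ W_inv_S                          # S' W^-1 S
    c = W_inv_S.T @ y_hat                      # S' W^-1 y_hat
    step = 1.0 / np.linalg.eigvalsh(Q).max()   # 1 / Lipschitz constant of the gradient
    b = np.zeros(S.shape[1])
    for _ in range(n_iter):
        # Gradient step on 0.5 b'Qb - c'b, then projection onto b >= 0.
        b = np.maximum(0.0, b - step * (Q @ b - c))
    return S @ b

S = np.array([[1, 1],
              [1, 0],
              [0, 1]], dtype=float)
W = np.eye(3)

# Non-negative base forecasts, ordered Total, A, B.
y_hat = np.array([10.0, 0.0, 13.0])

# The unconstrained solution gives a negative forecast for series A.
unconstrained_bottom = np.linalg.solve(S.T @ S, S.T @ y_hat)

# The constrained solution sets A to zero and adjusts B and Total.
rec_nonneg = reconcile_nonnegative(S, W, y_hat)
```

In this small case the unconstrained bottom-level forecasts are -1 and 12, while the constrained forecasts for Total, A and B are 11.5, 0 and 11.5.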


Campus location


Principal supervisor

Rob John Hyndman

Additional supervisor 1

George Athanasopoulos

Year of Award


Department, School or Centre

Econometrics and Business Statistics


Doctor of Philosophy

Degree Type



Faculty of Business and Economics