# Identifying the unknown lag order in an autoregressive model in the possible presence of a single structural break

thesis

posted on 02.03.2017, 00:05 by Halder, Souveek

Most economic data are time series, and a popular way to model them is to fit an autoregressive (AR) model. In most cases, however, the true lag order, denoted by p, is unknown. Determining it is important because further analysis of the time series depends on the lag order of the AR model. When a time series spans a long period, external factors can alter its structure or generating process, which may result in structural breaks. Ensuring that the correct model is selected therefore involves dealing with (a) the possible presence of structural breaks, and (b) the precise mathematical form of the data generating process in each regime, where a regime is the period between two consecutive structural breaks.
The presence of structural breaks is usually confirmed using a statistical test, such as the Cumulative Sum (CUSUM) test. By contrast, the lag length of an AR process is selected using an information criterion, such as the Akaike Information Criterion (AIC). In empirical studies, these two methods are applied sequentially, so a natural question is how well such a sequential procedure performs. The performance of the CUSUM test alone, under the assumption that the lag length is correctly specified, and that of the AIC alone, under the assumption that there are no structural breaks, have been studied extensively. The performance of the sequential procedure, however, has been examined in only a very small number of studies. This is an important gap in the literature, and the objective of this thesis is to help fill it by evaluating the sequential procedure in a simulation study.
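
To make the testing step concrete, the following is a minimal sketch of one common variant, the OLS-residual CUSUM statistic, computed after fitting an AR(p) model with an intercept. The abstract does not say which CUSUM variant the thesis uses, so the choice of the OLS-based form, the function name `ols_cusum_statistic`, and the constant `CRIT_5PCT` are assumptions made for illustration only.

```python
import numpy as np

# Asymptotic 5% critical value for the supremum of the absolute value of a
# Brownian bridge, the usual reference distribution for the OLS-based CUSUM.
CRIT_5PCT = 1.358


def ols_cusum_statistic(y, p):
    """Return sup_k |sum_{t<=k} e_t| / (sigma * sqrt(n)) for AR(p) OLS residuals.

    Hypothetical helper: p is treated as known here, which is exactly the
    assumption the sequential procedure has to work around in practice.
    """
    y = np.asarray(y, dtype=float)
    n_eff = len(y) - p                      # observations left after lagging
    target = y[p:]
    # Design matrix: intercept plus lags 1..p aligned with the target.
    cols = [np.ones(n_eff)] + [y[p - k: len(y) - k] for k in range(1, p + 1)]
    X = np.column_stack(cols)
    beta, *_ = np.linalg.lstsq(X, target, rcond=None)
    resid = target - X @ beta
    sigma = np.sqrt(resid @ resid / (n_eff - X.shape[1]))
    path = np.cumsum(resid) / (sigma * np.sqrt(n_eff))
    return float(np.max(np.abs(path)))
```

A value above `CRIT_5PCT` would be read as evidence of a break at the 5% level. Note that this variant is mainly sensitive to shifts in the level of the series; other CUSUM forms (for example, those based on recursive residuals) behave differently.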
To keep the research task manageable, we restrict attention to the case in which there is at most one structural break and the data generating process is autoregressive (AR) both before and after the break. Without loss of generality, we assume that the lag length of the AR process is the same before and after the break, because we may take it to be the maximum of the two lag lengths; the regime with the shorter lag length is then an AR process of the higher order whose trailing coefficients are zero.
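
The data generating process described here can be sketched as a small simulator. This is an illustrative assumption about the setup, not the thesis's actual simulation design: the function name `simulate_ar_with_break`, the Gaussian innovations, the burn-in length, and the coefficient values in the usage note are all hypothetical.

```python
import numpy as np


def simulate_ar_with_break(n, phi_before, phi_after, break_at=None,
                           sigma=1.0, burn_in=200, seed=0):
    """Simulate n observations of a zero-mean AR(p) series with at most one break.

    phi_before and phi_after must have the same length p; a regime with a
    shorter lag length is written with trailing zero coefficients.
    break_at is the index (in the returned series) of the first post-break
    observation, or None for no break.
    """
    rng = np.random.default_rng(seed)
    phi_before = np.asarray(phi_before, dtype=float)
    phi_after = np.asarray(phi_after, dtype=float)
    p = len(phi_before)
    if len(phi_after) != p:
        raise ValueError("pad the shorter coefficient vector with zeros")
    if burn_in < p:
        raise ValueError("burn_in must be at least the lag length")

    total = n + burn_in
    eps = rng.normal(0.0, sigma, size=total)
    y = np.zeros(total)
    switch = total if break_at is None else burn_in + break_at
    for t in range(p, total):
        phi = phi_before if t < switch else phi_after
        # y[t-p:t][::-1] is (y[t-1], ..., y[t-p]), matching (phi_1, ..., phi_p).
        y[t] = phi @ y[t - p:t][::-1] + eps[t]
    return y[burn_in:]
```

For example, `simulate_ar_with_break(300, [0.5, 0.0], [0.2, 0.3], break_at=150)` draws an AR(1) regime, written as an AR(2) with a zero second coefficient, followed by an AR(2) regime.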
Apart from the use of the AIC and the related Bayesian Information Criterion (BIC) for choosing the lag length, another approach studied in the recent literature is based on the so-called description length: the model chosen is the one with the minimum description length (MDL). Another significant gap in the current literature is that the relative performance of AIC, BIC and MDL has been studied in only a few published papers.
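
As a rough illustration of lag selection by information criterion, the sketch below fits AR(p) models by least squares for p = 0, ..., `max_lag` on a common sample and picks the order that minimises each criterion. It assumes the standard Gaussian forms AIC = n log(σ̂²) + 2k and BIC = n log(σ̂²) + k log(n), with k counting the intercept and lag coefficients. MDL is left out because the abstract does not state the form used in the thesis. The function name `select_lag` is hypothetical.

```python
import numpy as np


def select_lag(y, max_lag):
    """Return ({'aic': p_aic, 'bic': p_bic}, scores) for AR(0..max_lag) fits.

    All candidate models are estimated on the same observations
    y[max_lag:], so that the criteria are comparable across p.
    """
    y = np.asarray(y, dtype=float)
    n_eff = len(y) - max_lag
    target = y[max_lag:]
    scores = {"aic": {}, "bic": {}}
    for p in range(max_lag + 1):
        # Intercept plus lags 1..p, each aligned with the target vector.
        cols = [np.ones(n_eff)]
        cols += [y[max_lag - k: len(y) - k] for k in range(1, p + 1)]
        X = np.column_stack(cols)
        beta, *_ = np.linalg.lstsq(X, target, rcond=None)
        resid = target - X @ beta
        sigma2 = resid @ resid / n_eff      # ML estimate of innovation variance
        k_params = p + 1
        scores["aic"][p] = n_eff * np.log(sigma2) + 2 * k_params
        scores["bic"][p] = n_eff * np.log(sigma2) + np.log(n_eff) * k_params
    best = {name: min(vals, key=vals.get) for name, vals in scores.items()}
    return best, scores
```

Because the BIC penalty per parameter, log(n), exceeds the AIC penalty of 2 once n is larger than about 7, the order chosen by BIC on a given sample is never larger than the order chosen by AIC.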
This thesis evaluates, using a simulation study, how the model selection criteria based on AIC, BIC and MDL perform when they are used in conjunction with CUSUM tests for structural breaks. Thus, this thesis addresses a research question that is important for applied econometrics. Our simulation study makes some new qualitative observations that will have implications for the way that the foregoing model selection criteria and statistical tests are used in combination.