Model selection for a class of conditional heteroscedastic processes

10.4225/03/5a04f615b5fbf Kian Teng Kwek Kian Teng Kwek Model selection for a class of conditional heteroscedastic processes Monash University 2017 Mathematical optimization Monte Carlo method Heteroscedasticity -- Mathematical models Autoregression (Statistics) 2017-11-10 00:42:59 Thesis https://bridges.monash.edu/articles/thesis/Model_selection_for_a_class_of_conditional_heteroscedastic_processes/5588875

This thesis investigates which information criterion (IC) is best for choosing a conditional heteroscedastic (CH) model from among rival autoregressive conditional heteroscedastic (ARCH) related models in finite samples. It also considers the central question: "Can we construct an IC procedure for selection between CH models that is optimal in small samples?" Using Monte Carlo methods, we construct small sample optimal procedures by introducing an optimization principle based on maximizing the average probabilities of correct model selection (APCMS). In Chapter 3, we consider seven IC procedures based on penalized maximized conditional log-likelihood functions, namely Akaike's IC (AIC), Schwarz's Bayesian IC (BIC), Hannan and Quinn's criterion (HQIC), Theil's adjusted R2 criterion or residual variance criterion (RVC), Amemiya's prediction criterion (PC), Hocking's criterion (Sp) and the generalized cross validation criterion (GCV), in order to determine which is the best IC in finite samples. The simulation study reveals some interesting small sample properties of these criteria: RVC is the best IC for small sample selection problems, but as the sample size increases RVC loses its advantage to AIC, which becomes the comparatively better IC. BIC is clearly the worst IC procedure. Chapter 4 looks at how to construct penalties in small samples such that the resulting optimal information criteria have good statistical properties. A useful property is consistency, i.e., always choosing the correct model in large samples.
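As a minimal sketch of what selection by a penalized maximized log-likelihood looks like, the Python below scores each candidate model as its maximized log-likelihood minus a penalty and picks the highest score. Only the textbook AIC, BIC and Hannan-Quinn penalties are shown; the exact forms used in the thesis for these and for RVC, PC, Sp and GCV are not given in this abstract. The function names and the example log-likelihood values are hypothetical.

```python
import math


def penalty(criterion, k, n):
    """Penalty subtracted from the maximized log-likelihood.

    k is the number of estimated parameters, n the sample size.
    These are the textbook forms, assumed here for illustration.
    """
    if criterion == "AIC":
        return k
    if criterion == "BIC":
        return 0.5 * k * math.log(n)
    if criterion == "HQ":
        return k * math.log(math.log(n))
    raise ValueError(f"unknown criterion: {criterion}")


def select_model(logliks, n_params, n, criterion):
    """Return the index of the model with the largest penalized log-likelihood."""
    scores = [ll - penalty(criterion, k, n) for ll, k in zip(logliks, n_params)]
    return max(range(len(scores)), key=scores.__getitem__)


# Made-up maximized log-likelihoods for three nested candidates
# (for example ARCH(0), ARCH(1), ARCH(2)); not results from the thesis.
example_logliks = [-150.0, -148.2, -147.0]
example_n_params = [1, 2, 3]

if __name__ == "__main__":
    for name in ("AIC", "HQ", "BIC"):
        chosen = select_model(example_logliks, example_n_params, 100, name)
        print(name, "selects candidate", chosen)
```

With the same fitted log-likelihoods, a heavier penalty moves the choice towards the smaller model, which is the mechanism behind the differences in small sample performance discussed here.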
Using grid search, the construction of these optimal penalties is based on a large Monte Carlo simulation study that involves pre-fixed parameter values for a portfolio of models for four sets of standard ARCH(q) and GARCH(p,q) models with normality assumptions for the underlying innovation process. A response surface model is then fitted to estimate an algebraic expression for the optimal penalty function for the ARCH(q) and GARCH(p,q) models respectively, based on Granger, King and White's (1995) definition of a general IC. These estimated prototype optimal information criteria are generically called Conditional Heteroscedastic IC (CHIC). In particular, three hybrids of CHIC are estimated, namely CHIC-A, CHIC-G and CHIC-P, for ARCH models, GARCH models and pooled ARCH and GARCH models respectively. We generalize Nishii's (1988) consistency results to allow us to demonstrate that the CHIC procedures are consistent. Hitherto, the construction of CHIC procedures has been conditional on chosen non-random parameter values for the data generating processes (DGPs). An improvement over this method, which avoids conditioning on these parameter values, is to draw the parameter values for the models randomly from a prior distribution. In Chapter 5, we introduce a new class of optimal small sample procedures, namely SOP and POP, based on unconditional small sample penalties. SOP is derived from a summation optimization principle, and POP is derived from a product maximizing method. A comparison of the relative performance of these optimal procedures with existing IC procedures indicates, in particular, that SOP is the best procedure. The optimal small sample procedure also allows us to test whether the probabilities of correct selection employing CHIC procedures are biased towards the pre-fixed parameter values chosen through a multi-point method.
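The grid search idea can be sketched as follows, under strong simplifying assumptions: the maximized log-likelihoods from each Monte Carlo replication are taken as given, the penalty is a single multiplier c times the number of parameters, and the objective is either the average of the probabilities of correct selection across DGPs (in the spirit of the summation principle) or their product (in the spirit of the product method). The function names, the penalty form and the numbers are hypothetical and are not taken from the thesis.

```python
import math


def prob_correct(logliks, n_params, true_idx, c):
    """Share of replications in which L_j - c * k_j is largest for the true model."""
    hits = 0
    for rep in logliks:
        scores = [ll - c * k for ll, k in zip(rep, n_params)]
        best = max(range(len(scores)), key=scores.__getitem__)
        hits += best == true_idx
    return hits / len(logliks)


def best_penalty(sims, n_params, grid, objective="sum"):
    """Grid search over the penalty multiplier c.

    sims is a list of (true_idx, logliks) pairs, one per DGP, where
    logliks holds one tuple of maximized log-likelihoods per replication.
    Returns (c, objective value) for the first grid point with the largest value.
    """
    best = None
    for c in grid:
        probs = [prob_correct(ll, n_params, t, c) for t, ll in sims]
        if objective == "sum":
            value = sum(probs) / len(probs)   # average probability of correct selection
        else:
            value = math.prod(probs)          # product of the probabilities
        if best is None or value > best[1]:
            best = (c, value)
    return best


# Made-up log-likelihoods for two candidate models with 1 and 2 parameters.
example_sims = [
    (0, [(-100.0, -99.6), (-100.0, -98.8), (-100.0, -99.9), (-100.0, -99.2)]),
    (1, [(-100.0, -98.0), (-100.0, -98.6), (-100.0, -97.0), (-100.0, -98.4)]),
]
example_grid = [0.25, 0.5, 0.75, 1.0, 1.25, 1.5, 1.75]

if __name__ == "__main__":
    print(best_penalty(example_sims, [1, 2], example_grid, "sum"))
    print(best_penalty(example_sims, [1, 2], example_grid, "product"))
```

In a full study the log-likelihoods would come from fitting each candidate ARCH or GARCH model to every simulated series, and the penalty would be allowed to depend on the sample size and model order rather than on a single multiplier.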
CHIC-A and SOP are also checked for the property of robust selection by changing the normality assumptions for the ARCH models in Chapter 6. In the first part of this study, we assumed that the error innovations come from a scaled Student's t distribution with four degrees of freedom and constructed the conditional log-likelihood function using a Student's t distribution. In the second part of the study, we examined the effect of making the wrong assumptions about the error innovations by misspecifying the conditional log-likelihood functions. That is, when the error was normal, we estimated a conditional log-likelihood function for the Student's t distribution, and when the error was non-normal, we specified a normal conditional log-likelihood function. We found that both our optimal procedures, CHIC-A and SOP, remain effective under misspecified log-likelihood assumptions. Grose and King (1993) found that a selection criterion for autoregressive moving-average (ARMA) models can be biased towards selecting a particular model because of the shape of the likelihood function. In Chapter 7, we examined whether our optimal procedure favours a GARCH(1,1) model when ARCH(q) models are the true models and compared its performance with IC procedures. This was done because many empirical studies favour the GARCH(1,1) model. Our optimal procedure suggests that GARCH(1,1) models are not unduly favoured when ARCH models are the true models. In general, the CHIC procedures and the optimal small sample procedures outperform all other existing IC in different model selection problems for choosing small sample ARCH and GARCH models. By contrast, BIC is the worst performing IC, as it has a penalty function that penalizes larger models very harshly. This suggests that BIC, which has been popularly applied in many areas of model selection problems, should not be used for problems involving CH models.
One reason that BIC did not perform as well as our optimal IC procedures is that it was built without considering the one-sided information in the estimation of these CH models.
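To make the Chapter 6 robustness comparison concrete, the following is a minimal sketch of an ARCH(q) conditional log-likelihood under either normal innovations or Student's t innovations. It assumes that "scaled" means the t distribution is standardized to unit variance (which requires more than two degrees of freedom); the function name and parameterization are illustrative and are not taken from the thesis.

```python
import math


def arch_loglik(y, omega, alphas, dist="normal", nu=4.0):
    """Conditional log-likelihood of an ARCH(q) model, given the first q observations.

    h_t = omega + sum_i alphas[i] * y[t-1-i]**2 is the conditional variance.
    dist="t" uses a Student's t density rescaled to unit variance (assumption).
    """
    q = len(alphas)
    total = 0.0
    for t in range(q, len(y)):
        h = omega + sum(a * y[t - 1 - i] ** 2 for i, a in enumerate(alphas))
        z2 = y[t] ** 2 / h  # squared standardized residual
        if dist == "normal":
            total += -0.5 * (math.log(2 * math.pi) + math.log(h) + z2)
        elif dist == "t":
            total += (
                math.lgamma((nu + 1) / 2)
                - math.lgamma(nu / 2)
                - 0.5 * math.log(math.pi * (nu - 2))
                - 0.5 * math.log(h)
                - 0.5 * (nu + 1) * math.log1p(z2 / (nu - 2))
            )
        else:
            raise ValueError(f"unknown distribution: {dist}")
    return total


if __name__ == "__main__":
    y = [0.3, -1.1, 0.4, 2.0, -0.2]  # made-up series, not thesis data
    print(arch_loglik(y, 0.5, [0.3], "normal"))
    print(arch_loglik(y, 0.5, [0.3], "t", nu=4.0))
```

Evaluating both functions on the same series, whichever distribution generated it, mirrors the correct and misspecified cases described above; maximizing either function over omega and alphas would require a numerical optimizer, which is omitted here.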