New approaches for statistical inference in duration models

2017-01-13T03:26:23Z (GMT) by Ranasinghe Arachchige, Kulan Arunajith
The availability of tick-by-tick data has opened a new era in modeling financial transactions. Until the early 1990s, time series data were aggregated, and modeling and estimation were performed on the aggregated data sets. Aggregation hides much of the information about individual trades. For example, variation in the price during the day of a particular trade may not be captured if the total volume per day is used for modeling, and exogenous variables affecting trading patterns may be missed as well. If the aggregation interval is one day, most of the intraday information may be lost.
One of the main features of high-frequency data is that they are irregularly spaced in time. This feature has challenged researchers because standard econometric techniques for time series, especially those that assume fixed time intervals, cannot be applied directly. At the same time, high-frequency data allow researchers to model time series while taking most of the intraday information into account. Autoregressive conditional duration (ACD) models provide a flexible way of modeling such irregularly spaced data, and in recent years the ACD approach has become a very attractive way of modeling high-frequency data. A number of ACD models are available in the literature for univariate data, but only very few for multivariate data. Further, a few ACD models impose constraints on the model parameters. Several estimation methods have been proposed for ACD models, including Maximum Likelihood (ML) and Quasi-Maximum Likelihood (QML), which are fully parametric, and semiparametric methods.
The fully parametric methods assume that the distributions are correctly specified, while the semiparametric method of Drost & Werker (2004) was proposed for estimating ACD models when the errors are either independent and identically distributed (iid) or non-iid. Given that the specification of the ACD model is correct, it is crucial to identify the correct distributional form for the innovations. Because distributional assumptions are so important in the estimation of ACD models, any form of misspecification would lead to inconsistent estimators and hence to inaccurate statistical inference and forecasts. The performance of density forecasts in the context of ACD models remains largely unexplored. Multivariate generalizations of ACD models are fundamental to our understanding of the relationships between financial market variables such as prices, quotes and volumes. However, it is very difficult to model more than one series of durations because the series may have been recorded on very different time scales. The existing generalizations of ACD models to multivariate settings are not straightforward, because they often involve strong assumptions about exogeneity or are very complex.
The objective of this thesis is threefold. First, we compare the performance of existing estimation methods for ACD models with iid and non-iid innovations in terms of mean square error (MSE), and show that the semiparametric method of Drost & Werker (2004) performs well compared to the ML and QML methods, particularly when the innovations are non-iid. We then improve the existing semiparametric method by accommodating the constraints imposed on the ACD model parameters. An extensive simulation study conducted in this thesis shows that our proposed estimator performs better overall than the semiparametric estimator of Drost & Werker (2004). Mathematical proofs of the results for the new constrained semiparametric estimation method are provided.
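As a rough illustration of the kind of model and objective function involved in such comparisons, the sketch below simulates a standard ACD(1,1) process and evaluates the exponential quasi-log-likelihood that QML maximizes. It is not taken from the thesis: the recursion is the textbook ACD(1,1) form (duration = conditional mean times a unit-mean positive innovation), and the function names `simulate_acd` and `exp_quasi_loglik`, the exponential innovations, and the parameter values are all assumptions chosen for illustration.

```python
import math
import random


def simulate_acd(n, omega, alpha, beta, seed=0):
    """Simulate n durations from an ACD(1,1) with unit-mean exponential innovations.

    Assumed (textbook) form: x_i = psi_i * eps_i,
    psi_i = omega + alpha * x_{i-1} + beta * psi_{i-1}, with alpha + beta < 1.
    """
    rng = random.Random(seed)
    psi = omega / (1.0 - alpha - beta)  # start at the unconditional mean
    durations = []
    for _ in range(n):
        x = psi * rng.expovariate(1.0)  # innovation has mean 1
        durations.append(x)
        psi = omega + alpha * x + beta * psi  # conditional mean of the next duration
    return durations


def exp_quasi_loglik(durations, omega, alpha, beta):
    """Exponential quasi-log-likelihood: -sum(log psi_i + x_i / psi_i)."""
    psi = omega / (1.0 - alpha - beta)
    total = 0.0
    for x in durations:
        total -= math.log(psi) + x / psi
        psi = omega + alpha * x + beta * psi
    return total


# Illustrative parameter values (assumptions); the unconditional mean is 0.1 / 0.1 = 1.
x = simulate_acd(5000, omega=0.1, alpha=0.1, beta=0.8, seed=42)
print("sample mean duration:", sum(x) / len(x))
print("quasi-log-likelihood at the simulation parameters:",
      exp_quasi_loglik(x, 0.1, 0.1, 0.8))
```

In a simulation study of the kind described, one would maximize such an objective numerically over many replications and compare the resulting estimates with the simulation parameters through their MSE; the optimizer and the semiparametric estimator are beyond this sketch.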
Secondly, we assess the performance of density forecasts in standard ACD models. We conduct a comprehensive simulation study to compare parametric and nonparametric density forecasts. The results show that both the nonparametric gamma kernel of Chen (2000b) and the fully parametric gamma distributions perform well in all the cases we considered. Finally, we propose a new methodology for modeling multivariate duration data in a flexible way while imposing fewer assumptions on exogeneity. Using an empirical application, we illustrate the usefulness of the new method for modeling multivariate duration data.
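To give a sense of what a gamma kernel density estimate looks like for positive data such as durations, the sketch below implements the basic gamma kernel estimator commonly attributed to Chen (2000): the estimate at x averages gamma densities with shape x/b + 1 and scale b evaluated at the data points. Whether the thesis uses this basic form or a modified variant is not stated in the abstract, so treat the function `gamma_kernel_density`, the smoothing parameter `b = 0.05`, and the exponential test data as assumptions for illustration only.

```python
import math
import random


def gamma_kernel_density(x, sample, b):
    """Basic gamma kernel density estimate at x >= 0 with smoothing parameter b > 0.

    Assumes every value in `sample` is strictly positive.
    """
    shape = x / b + 1.0
    # log of the gamma normalizing constant b**shape * Gamma(shape)
    log_norm = shape * math.log(b) + math.lgamma(shape)
    total = 0.0
    for t in sample:
        # gamma(shape, scale=b) density evaluated at the data point t
        total += math.exp((shape - 1.0) * math.log(t) - t / b - log_norm)
    return total / len(sample)


# Illustrative data (an assumption): unit-mean exponential draws,
# whose true density at x = 1 is exp(-1).
rng = random.Random(0)
sample = [rng.expovariate(1.0) for _ in range(2000)]
print("estimate at x = 1:", gamma_kernel_density(1.0, sample, b=0.05))
print("true density at x = 1:", math.exp(-1.0))
```

Because the kernel itself has support on the positive half-line, the estimate places no mass on negative values, which is the usual motivation for preferring it to symmetric kernels for duration data.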