Between Data Cleaning and Inference: Pre-Averaging and Robust Estimators of the Efficient Price
51 Pages · Posted: 26 Mar 2016 · Last revised: 19 Sep 2016
Date Written: February 19, 2016
Abstract
Pre-averaging is a popular strategy for mitigating microstructure noise in high frequency financial data. As the term suggests, transaction or quote data are averaged over short time periods ranging from 30 seconds to five minutes, and the resulting averages approximate the efficient price process much better than the raw data. Apart from reducing the size of the microstructure noise, the methodology also helps synchronise data from different securities. The procedure is robust to short-term dependence in the noise.
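The basic operation can be sketched in a few lines. The following is a minimal illustration, not the paper's estimator: observations at irregular times are grouped into fixed calendar-time windows and replaced by their window averages (`pre_average` and its arguments are illustrative names).

```python
import numpy as np

def pre_average(times, prices, window):
    """Average observed prices over consecutive windows of `window` seconds.

    Returns the mean observation time and mean price per non-empty window,
    a crude proxy for the efficient price at that window.
    """
    times = np.asarray(times, dtype=float)
    prices = np.asarray(prices, dtype=float)
    # Assign each observation to a window, measured from the first timestamp.
    bins = np.floor((times - times[0]) / window).astype(int)
    out_t, out_p = [], []
    for b in np.unique(bins):
        mask = bins == b
        out_t.append(times[mask].mean())
        out_p.append(prices[mask].mean())
    return np.array(out_t), np.array(out_p)
```

Averaging n observations per window shrinks the standard deviation of i.i.d. noise by a factor of roughly the square root of n, which is why the averaged series tracks the efficient price more closely than the raw ticks.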
Since averages can be subject to outliers, and since they can pulverise jumps, we have developed a broader theory which also applies to cases where M-estimation is used to pin down the efficient price in local neighbourhoods. M-estimation serves the same function as averaging, but we shall see that it is safer. Good choices of the M-estimating function greatly enhance the identification of jumps. The methodology applies off-the-shelf to any high frequency econometric problem.
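To see why an M-estimate is safer than a plain window mean, consider a standard Huber location estimate, which downweights large residuals instead of averaging them in. This is a generic textbook sketch, not the estimator developed in the paper; `huber_location` and its tuning constant `c` are illustrative.

```python
import numpy as np

def huber_location(x, c=1.345, tol=1e-8, max_iter=100):
    """Huber M-estimate of location via iteratively reweighted means."""
    x = np.asarray(x, dtype=float)
    mu = np.median(x)
    # MAD-based scale; fall back to 1.0 if the sample is (near-)constant.
    scale = np.median(np.abs(x - mu)) / 0.6745 or 1.0
    for _ in range(max_iter):
        r = (x - mu) / scale
        # Huber weights: 1 inside [-c, c], c/|r| outside.
        w = np.clip(c / np.maximum(np.abs(r), 1e-12), None, 1.0)
        mu_new = np.sum(w * x) / np.sum(w)
        if abs(mu_new - mu) < tol:
            break
        mu = mu_new
    return mu
```

On a window containing one aberrant tick, the mean is dragged toward the outlier while the Huber estimate stays near the bulk of the data; choosing the estimating function (mean, median, Huber, and so on) trades efficiency against this robustness.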
In this paper, we develop a general theory for pre-averaging and M-estimation based inference. We show that, up to a contiguity adjustment, the pre-averaged process behaves as if one sampled from a semimartingale (with unchanged volatility) plus an independent error.
Estimating the efficient price is a form of pre-processing of the data, and hence the methods in this paper also serve the purpose of data cleaning.
Keywords: consistency, cumulants, contiguity, continuity, discrete observation, efficiency, equivalent martingale measure, high frequency data, jumps, leverage effect, M-estimation, medianisation, microstructure, pre-averaging, realised beta, realised volatility, robust estimation, semi-martingale
JEL Classification: C01, C02, C13, C14, C22, G11