Big Data Approach to Realised Volatility Forecasting Using HAR Model Augmented With Limit Order Book and News
40 Pages. Posted: 12 Sep 2020. Last revised: 21 Jun 2021.
Date Written: September 1, 2020
Abstract
This study examines whether information extracted from a big data set comprising limit order book (LOB) data and Dow Jones corporate news can improve realised volatility (RV) forecasting for 23 NASDAQ tickers over the sample from 27 July 2007 to 18 November 2016. The out-of-sample forecasting results indicate that the CHAR model outperforms all other models in the HAR family, and there is strong evidence that news and LOB data provide statistically significant improvements in RV forecasts. Specifically, the slope of the bid side of the LOB has better predictive power than the slope of the ask side. On normal-volatility days, the ‘negative’ sentiment derived from the news has a clear impact, while ‘news count’ and, to a lesser extent, ‘weak modal’ and ‘uncertainty’ help to forecast volatility jumps. The depth of the LOB also helps to forecast volatility jumps. The findings further suggest that normal volatility and volatility jumps should be analysed separately, because variables that improve forecasting performance on normal days degrade forecasting performance for volatility jumps, and vice versa. In addition, increasing the estimation sample size causes a statistically significant degradation in the forecasting performance of volatility on normal days, especially if the sample includes an extreme volatility period such as the 2008 financial crisis, whereas a longer sample improves the forecasts of volatility jumps.
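For readers unfamiliar with the HAR family, the following is a minimal sketch of a baseline HAR regression with optional extra regressors, not the paper's actual specification. It assumes the conventional daily, 5-day and 22-day RV averages as predictors of next-day RV; the function names `har_design` and `fit_har` and the `exog` argument (a stand-in for LOB or news variables such as a bid-side slope or a negative-sentiment score) are hypothetical and chosen only for illustration.

```python
import numpy as np

def har_design(rv, exog=None):
    """Build HAR regressors for next-day RV.

    Columns: intercept, daily RV, 5-day mean RV, 22-day mean RV,
    followed by any extra regressors in `exog` (hypothetical stand-ins
    for LOB or news variables, aligned with `rv` by day).
    """
    rv = np.asarray(rv, dtype=float)
    n = len(rv)
    rows, targets = [], []
    # Start at t = 21 so that a full 22-day window is available.
    for t in range(21, n - 1):
        daily = rv[t]
        weekly = rv[t - 4:t + 1].mean()     # mean of the last 5 days
        monthly = rv[t - 21:t + 1].mean()   # mean of the last 22 days
        row = [1.0, daily, weekly, monthly]
        if exog is not None:
            row.extend(np.atleast_1d(exog[t]))
        rows.append(row)
        targets.append(rv[t + 1])           # one-day-ahead target
    return np.array(rows), np.array(targets)

def fit_har(rv, exog=None):
    """Estimate the coefficients by ordinary least squares."""
    X, y = har_design(rv, exog)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta
```

In this sketch, comparing out-of-sample errors of `fit_har(rv)` against `fit_har(rv, exog)` would be one simple way to check whether the extra variables add forecasting value; the paper's own evaluation procedure is not described in the abstract.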
Keywords: Realised Volatility Forecasting, Heterogeneous AutoRegressive models, Limit Order Book Data, Dow Jones Corporate News, Big Data
JEL Classification: C22, C51, C53, C55, C58