Realised Volatility Forecasting: Machine Learning via Financial Word Embedding
44 Pages
Posted: 29 Jul 2021
Date Written: July 28, 2021
We develop FinText, a novel, state-of-the-art financial word embedding trained on the Dow Jones Newswires Text News Feed Database. Incorporating this word embedding in a machine learning model produces a substantial increase in volatility forecasting performance on days with volatility jumps for 23 NASDAQ stocks from 27 July 2007 to 18 November 2016. A simple ensemble model, combining our word embedding model with another machine learning model that uses limit order book data, provides the best forecasting performance on both normal and jump volatility days. Finally, we use Integrated Gradients and SHAP (SHapley Additive exPlanations) to make the results more 'explainable' and the model comparisons more transparent.
Keywords: Realised Volatility Forecasting; Machine Learning; Natural Language Processing; Word Embedding; Explainable AI; Dow Jones Newswires; Big Data
JEL Classification: C22; C45; C51; C53; C55; C58
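The abstract does not state how the simple ensemble combines its two component models. As an illustrative sketch only, the Python below assumes a weighted average of the two forecast series (equal weights by default) and scores the result with mean squared error. The function names `ensemble_forecast` and `mse`, the `weight` parameter, and all numbers are hypothetical and are not taken from the paper.

```python
from statistics import fmean


def ensemble_forecast(news_forecasts, lob_forecasts, weight=0.5):
    """Weighted average of two realised-volatility forecast series.

    Assumption: the combination rule is a plain weighted average;
    `weight` is the share given to the news-based (word embedding) model.
    """
    if len(news_forecasts) != len(lob_forecasts):
        raise ValueError("forecast series must have the same length")
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return [
        weight * n + (1.0 - weight) * b
        for n, b in zip(news_forecasts, lob_forecasts)
    ]


def mse(forecasts, realised):
    """Mean squared error between forecasts and realised volatility."""
    if len(forecasts) != len(realised):
        raise ValueError("series must have the same length")
    return fmean([(f - r) ** 2 for f, r in zip(forecasts, realised)])


# Made-up numbers, purely to show the mechanics.
news_model = [0.02, 0.04, 0.03]   # hypothetical news-based forecasts
lob_model = [0.04, 0.02, 0.05]    # hypothetical limit-order-book forecasts
realised = [0.03, 0.03, 0.04]     # hypothetical realised volatility

combined = ensemble_forecast(news_model, lob_model)
print("combined forecasts:", combined)
print("MSE news model:", mse(news_model, realised))
print("MSE LOB model: ", mse(lob_model, realised))
print("MSE ensemble:  ", mse(combined, realised))
```

In this toy example the two models err in opposite directions, so averaging cancels most of the error. Whether a real ensemble benefits in the same way depends on how correlated the component models' errors are.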