Using Natural Language Processing Techniques for Stock Return Predictions
54 Pages. Posted: 15 Feb 2020
Date Written: March 7, 2017
Abstract
Our Applied Finance Project aims to develop a framework for determining whether financial news headlines have a meaningful impact on stock prices. The framework is a novel structure that primarily leverages existing Natural Language Processing techniques, including Named Entity Recognition and the Global Vectors for Word Representation (GloVe) model, and combines them with techniques such as k-means clustering and portfolio optimization. The subsequent study of events with predictive ability could be of interest to institutional investors.
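As a rough illustration of how word vectors and k-means clustering can group headlines into event types, the sketch below averages word vectors into a headline vector and clusters the results. It is not the paper's implementation: the two-dimensional vectors are made-up stand-ins for pretrained GloVe embeddings, the company names and headlines are hypothetical, and the functions `embed` and `kmeans` are names introduced here for illustration only.

```python
import math

# Made-up 2-d vectors standing in for pretrained GloVe embeddings (assumption).
TOY_VECTORS = {
    "signs": (0.9, 0.1),
    "deal": (1.0, 0.0),
    "agreement": (0.95, 0.05),
    "hires": (0.1, 0.9),
    "advisor": (0.0, 1.0),
    "adviser": (0.05, 0.95),
}


def embed(headline):
    """Average the vectors of known words; unknown words (e.g. firm names) are skipped."""
    vecs = [TOY_VECTORS[w] for w in headline.lower().split() if w in TOY_VECTORS]
    if not vecs:
        return None
    dim = len(vecs[0])
    return tuple(sum(v[i] for v in vecs) / len(vecs) for i in range(dim))


def kmeans(points, k, iters=20):
    """Plain Lloyd's algorithm, initialised deterministically from the first k points."""
    centroids = list(points[:k])
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels


# Hypothetical headlines; the firm names are placeholders.
headlines = [
    "Acme signs deal",
    "Beta hires advisor",
    "Gamma signs agreement",
    "Delta hires adviser",
]
points = [embed(h) for h in headlines]
labels = kmeans(points, k=2)

for headline, label in zip(headlines, labels):
    print(label, headline)
```

In this toy setting the "deal or agreement" headlines and the "advisor hired" headlines fall into separate clusters. A real pipeline would load pretrained GloVe vectors and would likely use a library clustering routine with a more careful initialisation.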
Starting with 1.8 million financial news headlines obtained from the Internet Archive's Wayback Machine, we identified several events with meaningful post-event drifts. These events include situations where a firm's equity is oversold, approval is given to a firm, a deal or agreement is signed, and an advisor is hired. The out-of-sample information ratios for these events range from -1.02 to 0.76. The events we identified are by no means exhaustive, which signals the potential of our model.
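For readers unfamiliar with the metric, an information ratio is commonly computed as the mean active return (portfolio minus benchmark) divided by the standard deviation of that active return, optionally annualized. The abstract does not state the exact definition, benchmark, or return frequency used, so the sketch below is one common convention rather than the authors' calculation. The function name `information_ratio`, the `periods_per_year` parameter, and the sample returns are all illustrative assumptions.

```python
import statistics


def information_ratio(portfolio_returns, benchmark_returns, periods_per_year=252):
    """Annualized information ratio under one common convention (assumption).

    IR = mean(active) / stdev(active) * sqrt(periods_per_year),
    where active = portfolio return - benchmark return for each period.
    """
    if len(portfolio_returns) != len(benchmark_returns):
        raise ValueError("return series must have the same length")
    active = [p - b for p, b in zip(portfolio_returns, benchmark_returns)]
    tracking_error = statistics.stdev(active)  # sample standard deviation
    return statistics.mean(active) / tracking_error * periods_per_year ** 0.5


# Made-up per-period returns, purely for illustration.
portfolio = [0.02, 0.00, 0.03, 0.01]
benchmark = [0.01, 0.01, 0.01, 0.01]

# periods_per_year=1 leaves the ratio unannualized.
ir = information_ratio(portfolio, benchmark, periods_per_year=1)
print(round(ir, 4))
```

A negative value, such as the lower end of the range reported above, means the event portfolio underperformed its benchmark on average over the out-of-sample period.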
Keywords: NLP, Portfolio Optimization, GloVe, K-Means Clustering, S&P 500, Financial News Headlines
JEL Classification: O33, G14