Predicting Returns With Text Data

54 Pages

Zheng Tracy Ke

Harvard University

Bryan T. Kelly

Yale SOM; AQR Capital Management, LLC; National Bureau of Economic Research (NBER)

Dacheng Xiu

University of Chicago - Booth School of Business

Date Written: May 14, 2019

Abstract

We introduce a new text-mining methodology that extracts sentiment information from news articles to predict asset returns. Unlike more common sentiment scores used for stock return prediction (e.g., those sold by commercial vendors or built with dictionary-based methods), our supervised learning framework constructs a sentiment score that is specifically adapted to the problem of return prediction.

Our method proceeds in three steps:

1) isolating a list of sentiment terms via predictive screening,

2) assigning sentiment weights to these words via topic modeling, and

3) aggregating terms into an article-level sentiment score via penalized likelihood.
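
The three steps above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not specify the estimators, so the function names (`screen_terms`, `fit_term_weights`, `score_article`), the thresholds `alpha` and `min_docs`, the penalty weight `lam`, the two-topic (positive/negative) structure fitted by least squares on return ranks, and the grid search over the score are all hypothetical choices made for the example.

```python
import numpy as np

def screen_terms(counts, returns, alpha=0.1, min_docs=2):
    """Step 1 (assumed form): keep words that co-occur unusually often
    with positive or with negative returns."""
    present = counts > 0                          # (n_docs, n_words) indicator
    doc_freq = present.sum(axis=0)                # articles containing each word
    pos_freq = present[returns > 0].sum(axis=0)   # ...of which had a positive return
    frac_pos = np.divide(pos_freq, doc_freq,
                         out=np.full(counts.shape[1], 0.5),
                         where=doc_freq > 0)
    keep = (np.abs(frac_pos - 0.5) > alpha) & (doc_freq >= min_docs)
    return np.flatnonzero(keep)

def fit_term_weights(counts, returns):
    """Step 2 (assumed form): two-topic word weights, fitted by regressing
    word frequencies on the normalized rank of each article's return."""
    n_docs = len(returns)
    ranks = np.argsort(np.argsort(returns)) + 1   # 1 = lowest return
    p = ranks / n_docs
    design = np.column_stack([p, 1.0 - p])        # (n_docs, 2)
    totals = counts.sum(axis=1, keepdims=True)
    freqs = counts / np.maximum(totals, 1)        # within-article word frequencies
    weights, *_ = np.linalg.lstsq(design, freqs, rcond=None)  # (2, n_words)
    weights = np.clip(weights, 0.0, None)         # weights must be non-negative
    weights /= np.maximum(weights.sum(axis=1, keepdims=True), 1e-12)
    return weights                                # row 0: positive, row 1: negative

def score_article(word_counts, weights, lam=1.0, grid_size=999):
    """Step 3 (assumed form): penalized maximum likelihood over a score
    p in (0, 1), solved by grid search; the penalty pulls p toward 0.5."""
    grid = np.linspace(0.001, 0.999, grid_size)
    mix = np.outer(grid, weights[0]) + np.outer(1.0 - grid, weights[1])
    total = max(word_counts.sum(), 1)
    loglik = (word_counts * np.log(mix + 1e-12)).sum(axis=1) / total
    penalty = lam * np.log(grid * (1.0 - grid))
    return grid[np.argmax(loglik + penalty)]

# Toy data (made up): columns are the words "gain", "loss", "the".
counts = np.array([[3, 0, 5], [2, 0, 4], [4, 0, 6],
                   [0, 3, 5], [0, 2, 4], [0, 4, 6]])
returns = np.array([0.02, 0.01, 0.03, -0.02, -0.01, -0.03])

selected = screen_terms(counts, returns)
weights = fit_term_weights(counts[:, selected], returns)
new_article = np.array([5, 0, 7])
print(score_article(new_article[selected], weights))
```

In this toy corpus the neutral word "the" is screened out, and an unseen article that uses only "gain" receives a score above 0.5, while one with no sentiment words is pulled to 0.5 by the penalty. The grid search is used here only to keep the sketch dependency-free; a one-dimensional numerical optimizer would serve equally well.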

We derive theoretical guarantees on the accuracy of estimates from our model with minimal assumptions. In our empirical analysis, we text-mine one of the most actively monitored streams of news articles in the financial system—the Dow Jones Newswires—and show that our supervised sentiment model excels at extracting return-predictive signals in this context.

Keywords: Text Mining, Machine Learning, Return Predictability, Sentiment Analysis, Screening, Topic Modeling, Penalized Likelihood

Suggested Citation

Ke, Zheng and Kelly, Bryan T. and Xiu, Dacheng, Predicting Returns With Text Data (May 14, 2019). Available at SSRN: https://ssrn.com/abstract=3388293

Zheng Ke

Harvard University

1875 Cambridge Street
Cambridge, MA 02138
United States

Bryan T. Kelly (Contact Author)

Yale SOM

135 Prospect Street
P.O. Box 208200
New Haven, CT 06520-8200
United States

AQR Capital Management, LLC

Greenwich, CT
United States

National Bureau of Economic Research (NBER)

1050 Massachusetts Avenue
Cambridge, MA 02138
United States

Dacheng Xiu

University of Chicago - Booth School of Business

5807 S. Woodlawn Avenue
Chicago, IL 60637
United States
