Measuring information in analyst reports: A machine learning approach
15 pages. Posted: 20 Sep 2021
Date Written: September 16, 2021
How can the informational content of analyst reports be quantified? In this short methodological paper, we propose a measure of information contribution (IC), defined in the spirit of Shapley values. We use natural language processing to identify topics in over 90,000 analyst reports on S&P 500 stocks issued between January 2018 and May 2020. We then construct the IC measure as the average cosine distance between the topic distribution of a given report and that of any subset of competitor reports. A first preliminary finding is that the informational content of reports on "crowded stocks" is 41% lower than that of reports on low-coverage stocks. Second, team-authored reports are 36% more informative than individually authored reports, and women-authored reports are 12% more informative than men-authored reports.
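To make the IC construction concrete, the following is a minimal Python sketch and not the paper's implementation. The abstract does not say how a subset of competitor reports is summarized, so the sketch assumes each subset is represented by the mean of its members' topic distributions and that all non-empty subsets are weighted equally. The function names `cosine_distance` and `information_contribution` are hypothetical.

```python
from itertools import combinations

import numpy as np


def cosine_distance(u, v):
    """Return 1 minus the cosine similarity of two non-zero vectors."""
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    return 1.0 - float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))


def information_contribution(report, competitors):
    """Average cosine distance between one report and every non-empty
    subset of competitor reports.

    Assumption (not stated in the abstract): a subset is summarized by
    the mean of its members' topic distributions, and every subset
    receives equal weight in the average.
    """
    competitors = np.asarray(competitors, dtype=float)
    n = len(competitors)
    distances = []
    for size in range(1, n + 1):
        for subset in combinations(range(n), size):
            aggregate = competitors[list(subset)].mean(axis=0)
            distances.append(cosine_distance(report, aggregate))
    return float(np.mean(distances))


# Toy topic distributions over three topics (illustrative values only).
novel_report = [1.0, 0.0, 0.0]
redundant_report = [0.0, 1.0, 0.0]
peers = [[0.0, 1.0, 0.0], [0.0, 1.0, 0.0]]

ic_novel = information_contribution(novel_report, peers)
ic_redundant = information_contribution(redundant_report, peers)
```

In this toy case the report that covers a topic no competitor discusses receives a higher IC than the report that duplicates its peers. Enumerating every subset grows exponentially with the number of competitors, so for heavily covered stocks a sampled set of subsets would be a practical substitute under the same assumptions.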
Keywords: analyst reports, natural language processing, Shapley value, information
JEL Classification: G11, G24, G40, D83, M41