Risk Factors That Matter: Textual Analysis of Risk Disclosures for the Cross-Section of Returns
84 Pages. Posted: 20 Jan 2019. Last revised: 24 Sep 2020
Date Written: September 22, 2020
I use unsupervised machine learning and natural language processing techniques to elicit the risk factors that firms themselves identify in their annual reports. I quantify each firm's exposure to every identified risk and construct factor-mimicking portfolios that proxy for each undiversifiable source of risk. The portfolios are priced in the cross-section and contain information beyond that captured by commonly used multi-factor models. A model that uses only firm-identified risk factors (FIRFs) performs at least as well as traditional factor models, despite not using any information from past prices or returns.
Keywords: Cross-Section of Returns, Factor Models, Machine Learning, Big Data, LDA, Text Analysis, NLP
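The abstract does not spell out how the factor-mimicking portfolios are built, so the following is only a minimal sketch of one generic, sort-based construction, not the paper's method. It assumes each firm already has a numeric exposure to a single risk topic (for example, the share of its risk disclosure text assigned to that topic) and forms an equal-weighted portfolio that is long the most exposed firms and short the least exposed. The function name `long_short_return`, the parameter `n_groups`, and all firm labels and numbers are hypothetical.

```python
from statistics import mean


def long_short_return(exposures, returns, n_groups=3):
    """Equal-weighted long-short return from a sort on exposure.

    exposures: dict firm -> exposure to one risk topic (assumed given)
    returns:   dict firm -> return over the following period
    n_groups:  number of sort groups; 3 means top vs. bottom tercile
    """
    # Keep only firms with both an exposure and a return, sorted by exposure.
    firms = sorted((f for f in exposures if f in returns),
                   key=lambda f: exposures[f])
    n = max(1, len(firms) // n_groups)  # firms per leg
    if len(firms) < 2 * n:
        raise ValueError("not enough firms to form both legs")
    low, high = firms[:n], firms[-n:]
    # Long the high-exposure leg, short the low-exposure leg.
    return mean(returns[f] for f in high) - mean(returns[f] for f in low)


# Illustrative, made-up inputs for six firms.
exposures = {"A": 0.05, "B": 0.10, "C": 0.20, "D": 0.40, "E": 0.30, "F": 0.02}
returns = {"A": 0.01, "B": 0.00, "C": 0.02, "D": 0.05, "E": 0.03, "F": -0.01}
spread = long_short_return(exposures, returns)
```

Repeating this calculation each period for a given topic would yield a return series for that topic's portfolio, which could then be tested against standard factor models. Value weighting, breakpoints, and rebalancing frequency are design choices left open here.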