Categorical Factors for Asset Pricing
25 Pages, Posted: 30 Aug 2024
Abstract
Linear models have traditionally been favored for their interpretability in low-dimensional asset pricing problems. In today's high-dimensional settings, however, non-linear models, especially machine learning algorithms, often surpass linear regression in predictive power, at the cost of interpretability because of their black-box nature. This paper introduces and tests a new kind of factor for asset pricing, the categorical factor. A large number of firm characteristics are divided into groups according to their financial nature, and a categorical factor is then extracted for each group using feature selection techniques. Using data from the US stock market, we demonstrate that these categorical factors outperform raw firm characteristics as predictors for asset pricing, significantly improving the accuracy of linear models while maintaining the performance of machine learning algorithms. Crucially, the best pricing performance is achieved when categorical factors are extracted using non-linear feature selection techniques and then applied in linear models to predict stock returns. This underscores the potential of categorical factors to integrate the strengths of linear and non-linear models, thereby enhancing both the interpretability and the predictive accuracy of asset pricing models. Moreover, our findings indicate that the categorical factors are highly significant from both pricing and investment perspectives and exhibit low correlations among themselves, indicating that they capture a broad spectrum of firm characteristics. Finally, we observe that the predictive power of these categorical factors declines over time, suggesting an increase in the efficiency of the US stock market.
Keywords: Asset pricing, Machine learning, Feature selection, Categorical factor, Interpretability
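
The pipeline described in the abstract (group the characteristics, extract one factor per group, then fit a linear model) can be illustrated with a minimal sketch. The abstract does not say which feature selection techniques are used, so the correlation filter below is only a stand-in and not the paper's method. The group names, the synthetic data, and the helper functions `extract_categorical_factors` and `fit_linear` are all hypothetical, and selection is done in-sample purely for brevity.

```python
import numpy as np

def extract_categorical_factors(X, y, groups):
    """Pick one column per group: the one with the largest |corr| with y.

    Assumption: a simple correlation filter stands in for the unspecified
    feature selection techniques mentioned in the abstract.
    """
    selected = {}
    for name, cols in groups.items():
        scores = [abs(np.corrcoef(X[:, c], y)[0, 1]) for c in cols]
        selected[name] = cols[int(np.argmax(scores))]
    return selected

def fit_linear(F, y):
    """Ordinary least squares with an intercept; returns [intercept, slopes...]."""
    A = np.column_stack([np.ones(len(F)), F])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return beta

# Synthetic data (hypothetical): 500 observations, 6 firm characteristics.
rng = np.random.default_rng(0)
n = 500
X = rng.standard_normal((n, 6))
# Returns are driven by column 1 and column 4 plus noise.
y = 0.5 * X[:, 1] - 0.3 * X[:, 4] + 0.1 * rng.standard_normal(n)

# Hypothetical grouping of characteristics by financial nature.
groups = {"value": [0, 1, 2], "momentum": [3, 4, 5]}

selected = extract_categorical_factors(X, y, groups)
F = X[:, list(selected.values())]   # one factor per group
beta = fit_linear(F, y)             # interpretable linear pricing model
print(selected, beta)
```

The point of the design is that the selection step can be arbitrarily flexible (a non-linear selector could replace the correlation filter here) while the final model stays linear, so each coefficient still reads as the loading on one named group.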