Practical Machine Learning Approach to Capture the Scholar Data Driven Alpha in AI Industry
10 Pages. Posted: 13 Jan 2020. Last revised: 14 Jan 2020
Date Written: December 9, 2019
AI technologies are helping more and more companies leverage their resources to expand their business, achieve stronger financial performance and become more valuable to investors. However, it is difficult to capture and predict the impact of AI technologies on companies’ stock prices through traditional financial factors. Moreover, common information sources such as companies’ earnings calls and news are not enough to quantify and predict the actual AI premium for a given company. In this paper, we utilize scholar data as alternative data for trading strategy development and propose a practical machine learning approach to quantify the AI premium of a company and capture the scholar data driven alpha in the AI industry. First, we collect scholar data from the Microsoft Academic Graph database and conduct feature engineering based on AI publication and patent data, such as conference/journal publication counts, patent counts, fields of study and paper citations. Second, we apply machine learning algorithms to weight and rebalance stocks every month using the scholar data and traditional financial factors, and construct portfolios using a “buy-and-hold, long-only” strategy. Finally, we evaluate our factor and portfolio in terms of factor performance and the portfolio’s cumulative return. The proposed scholar data driven approach achieves a cumulative return of 1029.1% during our backtesting period, which significantly outperforms the Nasdaq 100 index’s 529.5% and the S&P 500’s 222.6%. An approach based only on traditional financial factors achieves 776.7%, which indicates that our scholar data driven approach is better at capturing investment alpha in the AI industry than traditional financial factors alone.
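As a rough illustration of the monthly rebalancing and cumulative return evaluation described in the abstract, the following is a minimal Python sketch. It is not the authors' implementation: the function names, the rule for turning scores into weights (clip negatives, then normalize) and the ticker labels are all assumptions made for this example, and the scores stand in for the output of whatever model is used.

```python
from math import prod


def long_only_weights(scores):
    """Convert per-stock scores into long-only weights that sum to 1.

    Assumption for this sketch: negative scores are clipped to zero
    (no short positions); if every score is non-positive, fall back
    to equal weights.
    """
    clipped = {ticker: max(score, 0.0) for ticker, score in scores.items()}
    total = sum(clipped.values())
    if total == 0:
        return {ticker: 1.0 / len(clipped) for ticker in clipped}
    return {ticker: score / total for ticker, score in clipped.items()}


def portfolio_return(weights, stock_returns):
    """Weighted average of the individual stock returns for one month."""
    return sum(weight * stock_returns[ticker] for ticker, weight in weights.items())


def cumulative_return(monthly_returns):
    """Compound a sequence of simple monthly returns into one cumulative return."""
    return prod(1.0 + r for r in monthly_returns) - 1.0


def backtest(monthly_scores, monthly_stock_returns):
    """Rebalance at the start of each month and hold until the next rebalance."""
    monthly = []
    for scores, returns in zip(monthly_scores, monthly_stock_returns):
        weights = long_only_weights(scores)
        monthly.append(portfolio_return(weights, returns))
    return cumulative_return(monthly)


# Hypothetical two-month example with made-up tickers "A" and "B".
example_scores = [{"A": 1.0, "B": 1.0}, {"A": 3.0, "B": -1.0}]
example_returns = [{"A": 0.10, "B": 0.00}, {"A": 0.10, "B": 0.50}]
result = backtest(example_scores, example_returns)
```

Under this definition, a cumulative return of 1029.1% means the portfolio's final value is 11.291 times its starting value.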
Keywords: AI technology, scholar data, alternative data, AI in finance, quantitative investment, alpha research