Practical Machine Learning Approach to Capture the Scholar Data Driven Alpha in AI Industry

10 Pages. Posted: 13 Jan 2020. Last revised: 14 Jan 2020.


Yunzhe Fang

Columbia University - Department of Industrial Engineering and Operations Research (IEOR)

Xiao-Yang Liu

Columbia University - Fu Foundation School of Engineering and Applied Science

Hongyang Yang

Columbia University - Department of Statistics

Date Written: December 9, 2019

Abstract

AI technologies are helping more and more companies leverage their resources to expand their business, achieve stronger financial performance, and become more valuable to investors. However, the impact of AI technologies on companies’ stock prices is difficult to capture and predict with traditional financial factors. Moreover, common information sources such as companies’ earnings calls and news are not enough to quantify and predict the actual AI premium of a given company. In this paper, we use scholar data as alternative data for trading strategy development and propose a practical machine learning approach to quantify the AI premium of a company and capture the scholar data driven alpha in the AI industry. First, we collect scholar data from the Microsoft Academic Graph database and conduct feature engineering based on AI publication and patent data, such as conference/journal publication counts, patent counts, fields of study, and paper citations. Second, we apply machine learning algorithms to weight and rebalance stocks every month using the scholar data and traditional financial factors, and construct portfolios using a “buy-and-hold, long-only” strategy. Finally, we evaluate our factor and portfolio in terms of factor performance and the portfolio’s cumulative return. The proposed scholar data driven approach achieves a cumulative return of 1029.1% over our backtesting period, which significantly outperforms the Nasdaq 100 index’s 529.5% and the S&P 500’s 222.6%. The traditional financial factors approach yields only 776.7%, which indicates that our scholar data driven approach is better at capturing investment alpha in the AI industry than traditional financial factors.
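
The monthly rebalancing and cumulative-return evaluation described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the weighting scheme or the model, so the following are assumptions made purely for illustration: the per-stock `scores` stand in for the output of some machine learning model, the portfolio equal-weights the `top_k` highest-scoring stocks each month, and the function names `long_only_weights` and `cumulative_return` are hypothetical rather than taken from the paper.

```python
from typing import Dict, List


def long_only_weights(scores: Dict[str, float], top_k: int) -> Dict[str, float]:
    """Equal-weight the top_k tickers by score (assumed scheme, long-only)."""
    # Rank tickers from highest to lowest score and keep the first top_k.
    ranked = sorted(scores, key=scores.get, reverse=True)[:top_k]
    # Every selected ticker gets the same non-negative weight; weights sum to 1.
    return {ticker: 1.0 / len(ranked) for ticker in ranked}


def cumulative_return(
    monthly_scores: List[Dict[str, float]],
    monthly_returns: List[Dict[str, float]],
    top_k: int,
) -> float:
    """Compound monthly portfolio returns, rebalancing at each month start."""
    wealth = 1.0
    for scores, returns in zip(monthly_scores, monthly_returns):
        weights = long_only_weights(scores, top_k)
        # Portfolio return for the month is the weighted sum of stock returns.
        month_return = sum(w * returns[t] for t, w in weights.items())
        wealth *= 1.0 + month_return
    # Cumulative return is final wealth relative to the initial unit of wealth.
    return wealth - 1.0


# Toy data: two months, three hypothetical tickers (not from the paper).
toy_scores = [
    {"A": 0.9, "B": 0.5, "C": 0.1},
    {"A": 0.2, "B": 0.8, "C": 0.6},
]
toy_returns = [
    {"A": 0.10, "B": 0.00, "C": -0.05},
    {"A": -0.10, "B": 0.20, "C": 0.00},
]
result = cumulative_return(toy_scores, toy_returns, top_k=2)
print(f"{result:.3%}")
```

Under this definition of cumulative return (final wealth divided by initial wealth, minus one), the reported 1029.1% would correspond to a final wealth of roughly 11.29 times the initial investment.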

Keywords: AI technology, scholar data, alternative data, AI in finance, quantitative investment, alpha research

Suggested Citation

Fang, Yunzhe and Liu, Xiao-Yang and Yang, Hongyang, Practical Machine Learning Approach to Capture the Scholar Data Driven Alpha in AI Industry (December 9, 2019). Available at SSRN: https://ssrn.com/abstract=3501239 or http://dx.doi.org/10.2139/ssrn.3501239

Yunzhe Fang

Columbia University - Department of Industrial Engineering and Operations Research (IEOR)

331 S.W. Mudd Building
500 West 120th Street
New York, NY 10027
United States

Xiao-Yang Liu

Columbia University - Fu Foundation School of Engineering and Applied Science

New York, NY
United States

Hongyang Yang (Contact Author)

Columbia University - Department of Statistics

Mail Code 4403
New York, NY 10027
United States

