Sector Categorization Using Gradient Boosted Trees Trained on Fundamental Firm Data

8 Pages Posted: 20 Jun 2019 Last revised: 3 Dec 2019

Ming Fang

Martin Tuchman School of Management, New Jersey Institute of Technology

Lilian Kuo

New Jersey Institute of Technology

Frank Shi

New Jersey Institute of Technology

Stephen Michael Taylor

Stevens Institute of Technology

Date Written: June 13, 2019

Abstract

We examine the extent to which the GICS sector categorization of equity securities can be systematically reconstructed from historical quarterly firm fundamental data using gradient boosted tree classification. Tradeoffs between model complexity and performance are examined, and relative feature importance is described. Potential extensions are outlined, including improving feature engineering, validating internal consistency, and integrating additional data sources to further improve classification accuracy.
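
To make the approach in the abstract concrete, the following is a minimal sketch of multiclass gradient boosting with a softmax loss, using depth-1 trees (stumps) and a gain-based relative feature importance. It is not the authors' model: the data are synthetic, the three integer labels merely stand in for sectors, the feature names (leverage, margin, noise) are made up for illustration, and every function name here (`fit_stump`, `fit_gbt`, `predict`) is hypothetical. A practical study would more likely rely on an established library and deeper trees.

```python
import math
import random

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

def fit_stump(X, r):
    """Fit a depth-1 regression tree to targets r by least squares.
    Returns (gain, feature, threshold, left_value, right_value)."""
    n, total = len(X), sum(r)
    best = (0.0, 0, float("-inf"), 0.0, total / n)  # fallback: one constant leaf
    for j in range(len(X[0])):
        order = sorted(range(n), key=lambda i: X[i][j])
        left_sum = 0.0
        for pos in range(1, n):
            left_sum += r[order[pos - 1]]
            lo, hi = X[order[pos - 1]][j], X[order[pos]][j]
            if lo == hi:
                continue  # cannot split between identical values
            right_sum = total - left_sum
            # Squared-error reduction relative to a single constant leaf
            gain = left_sum ** 2 / pos + right_sum ** 2 / (n - pos) - total ** 2 / n
            if gain > best[0]:
                best = (gain, j, (lo + hi) / 2, left_sum / pos, right_sum / (n - pos))
    return best

def fit_gbt(X, y, n_classes, n_rounds=30, lr=0.3):
    """Boost one stump per class per round on the softmax-loss residuals."""
    n = len(X)
    F = [[0.0] * n_classes for _ in range(n)]  # current class scores per sample
    importance = [0.0] * len(X[0])
    model = []
    for _ in range(n_rounds):
        probs = [softmax(row) for row in F]
        round_stumps = []
        for k in range(n_classes):
            # Negative gradient of the softmax loss: indicator minus probability
            resid = [(1.0 if y[i] == k else 0.0) - probs[i][k] for i in range(n)]
            gain, j, thr, left, right = fit_stump(X, resid)
            importance[j] += gain
            left, right = lr * left, lr * right  # shrink the step
            for i in range(n):
                F[i][k] += left if X[i][j] <= thr else right
            round_stumps.append((j, thr, left, right))
        model.append(round_stumps)
    total = sum(importance) or 1.0
    return model, [g / total for g in importance]

def predict(model, x):
    scores = [0.0] * len(model[0])
    for round_stumps in model:
        for k, (j, thr, left, right) in enumerate(round_stumps):
            scores[k] += left if x[j] <= thr else right
    return max(range(len(scores)), key=scores.__getitem__)

# Synthetic stand-in data: made-up (leverage, margin) centers for three sectors,
# plus a pure-noise third feature. None of this comes from the paper.
rng = random.Random(0)
centers = {0: (0.5, 0.6), 1: (2.0, 0.3), 2: (8.0, 0.2)}
rows = []
for sector, (lev, margin) in centers.items():
    for _ in range(60):
        features = [rng.gauss(lev, 0.3), rng.gauss(margin, 0.05), rng.gauss(0.0, 1.0)]
        rows.append((features, sector))
rng.shuffle(rows)
train, test = rows[:135], rows[135:]

model, importance = fit_gbt([x for x, _ in train], [s for _, s in train], n_classes=3)
accuracy = sum(predict(model, x) == s for x, s in test) / len(test)
print("held-out accuracy:", round(accuracy, 3))
print("relative importance (leverage, margin, noise):", [round(v, 3) for v in importance])
```

Raising `n_rounds` or replacing the stumps with deeper trees increases model complexity, which is the kind of complexity versus performance tradeoff the abstract refers to. The importance scores here are normalized sums of split gains, one simple proxy among several in common use.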

Keywords: GICS Sector, Gradient Boosted Trees, Fundamental Data, Financial Ratios

JEL Classification: D40, C80

Suggested Citation

Fang, Ming and Kuo, Lilian and Shi, Frank and Taylor, Stephen Michael, Sector Categorization Using Gradient Boosted Trees Trained on Fundamental Firm Data (June 13, 2019). Available at SSRN: https://ssrn.com/abstract=3403818 or http://dx.doi.org/10.2139/ssrn.3403818

Ming Fang

Martin Tuchman School of Management, New Jersey Institute of Technology

United States

Lilian Kuo

New Jersey Institute of Technology

University Heights
Newark, NJ 07102
United States

Frank Shi

New Jersey Institute of Technology

University Heights
Newark, NJ 07102
United States

Stephen Michael Taylor (Contact Author)

Stevens Institute of Technology

1 Castle Point Terrace
Hoboken, NJ 07030
United States
