Clustered Feature Importance (Presentation Slides)

34 Pages Posted: 6 Mar 2020

Marcos López de Prado

Cornell University - Operations Research & Industrial Engineering; True Positive Technologies

Date Written: January 29, 2020

Abstract

A substitution effect takes place when two or more explanatory variables share a substantial amount of information (predictive power).

In the presence of substitution effects, feature importance methods may fail to determine robustly which variables are significant.

This presentation discusses the Clustered Feature Importance (CFI) method, which is robust to linear as well as non-linear substitution effects.
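
The abstract does not spell out the procedure, so the following is only a minimal sketch of the general idea, under stated assumptions: features that share information are grouped into a cluster, and a permutation-style importance (in the spirit of mean decrease accuracy) is computed by shuffling the whole cluster at once instead of one feature at a time. The synthetic data, the hand-written `predict` function, the helper names, and the choice of one shared row permutation per cluster are all hypothetical and are not taken from the slides.

```python
import random


def r2(y, pred):
    """Coefficient of determination of predictions `pred` against targets `y`."""
    mean_y = sum(y) / len(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, pred))
    ss_tot = sum((a - mean_y) ** 2 for a in y)
    return 1.0 - ss_res / ss_tot


def permuted_score_drop(X, y, predict, columns, rng):
    """Drop in R^2 after shuffling `columns` together.

    Assumption: all columns of a cluster share one row permutation.
    """
    base = r2(y, [predict(row) for row in X])
    perm = list(range(len(X)))
    rng.shuffle(perm)
    X_perm = [list(row) for row in X]  # copy so the original data is untouched
    for i, src in enumerate(perm):
        for c in columns:
            X_perm[i][c] = X[src][c]
    return base - r2(y, [predict(row) for row in X_perm])


rng = random.Random(0)
n = 2000
signal = [rng.gauss(0, 1) for _ in range(n)]

# Columns 0 and 1 are near-duplicates (substitutes); column 2 is pure noise.
X = [[s, s + rng.gauss(0, 0.05), rng.gauss(0, 1)] for s in signal]
y = [s + rng.gauss(0, 0.1) for s in signal]


def predict(row):
    # Hypothetical stand-in for a fitted model that splits its weight
    # evenly across the two substitute features and ignores column 2.
    return 0.5 * row[0] + 0.5 * row[1]


single_0 = permuted_score_drop(X, y, predict, [0], rng)
single_1 = permuted_score_drop(X, y, predict, [1], rng)
cluster = permuted_score_drop(X, y, predict, [0, 1], rng)
noise = permuted_score_drop(X, y, predict, [2], rng)

print("feature 0 alone:", round(single_0, 3))
print("feature 1 alone:", round(single_1, 3))
print("cluster {0, 1}: ", round(cluster, 3))
print("noise feature:  ", round(noise, 3))
```

In this toy setup, shuffling either substitute alone leaves the other one in place to carry part of the signal, so each looks less important than the information they jointly hold. Shuffling the cluster as a unit removes that fallback. The noise column shows no drop because the assumed model never reads it.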

Keywords: machine learning, feature importance, permutation importance, mean decrease accuracy

JEL Classification: G0, G1, G2, G15, G24, E44

Suggested Citation

López de Prado, Marcos, Clustered Feature Importance (Presentation Slides) (January 29, 2020). Available at SSRN: https://ssrn.com/abstract=3517595 or http://dx.doi.org/10.2139/ssrn.3517595

Marcos López de Prado (Contact Author)

Cornell University - Operations Research & Industrial Engineering

237 Rhodes Hall
Ithaca, NY 14853
United States

HOME PAGE: http://www.orie.cornell.edu

True Positive Technologies

NY
United States

HOME PAGE: http://www.truepositive.com
