Clustered Feature Importance (Presentation Slides)

35 Pages. Posted: 6 Mar 2020. Last revised: 28 May 2020.

Marcos Lopez de Prado

Cornell University - Operations Research & Industrial Engineering; Abu Dhabi Investment Authority; True Positive Technologies

Date Written: January 29, 2020

Abstract

A substitution effect takes place when two or more explanatory variables share a substantial amount of information (predictive power).

In the presence of substitution effects, feature importance methods may fail to determine robustly which variables are significant.

This presentation discusses the Clustered Feature Importance (CFI) method, which is robust to linear as well as non-linear substitution effects.
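The substitution effect described above can be illustrated with a minimal sketch. This is not the CFI procedure from the slides: it assumes a plain least-squares model, a hypothetical correlation-threshold grouping (`correlation_clusters`) in place of a proper clustering step, and a permutation-style importance measured as the drop in out-of-sample R². The synthetic data, feature layout, and all function names are assumptions made for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4000

# Synthetic data (assumption): features 0 and 1 are noisy copies of the same
# latent signal z, feature 2 is an independent signal, feature 3 is pure noise.
z = rng.normal(size=n)
X = np.column_stack([
    z + 0.1 * rng.normal(size=n),
    z + 0.1 * rng.normal(size=n),
    rng.normal(size=n),
    rng.normal(size=n),
])
y = z + X[:, 2] + 0.1 * rng.normal(size=n)

# Fit on the first half, evaluate importance on the second half.
half = n // 2
X_train, y_train = X[:half], y[:half]
X_test, y_test = X[half:], y[half:]
beta = np.linalg.lstsq(X_train, y_train, rcond=None)[0]


def r2(y_true, y_pred):
    """Coefficient of determination."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


def shuffled_importance(cols):
    """Drop in test R^2 when the given columns are shuffled together."""
    perm = rng.permutation(len(X_test))
    X_perm = X_test.copy()
    X_perm[:, cols] = X_test[perm][:, cols]  # same row shuffle for all cols
    return r2(y_test, X_test @ beta) - r2(y_test, X_perm @ beta)


def correlation_clusters(data, threshold=0.7):
    """Hypothetical stand-in for clustering: group features whose absolute
    correlation exceeds the threshold (connected components)."""
    corr = np.abs(np.corrcoef(data, rowvar=False))
    labels = list(range(corr.shape[0]))
    for i in range(len(labels)):
        for j in range(i + 1, len(labels)):
            if corr[i, j] > threshold:
                old, new = labels[j], labels[i]
                labels = [new if lab == old else lab for lab in labels]
    groups = {}
    for idx, lab in enumerate(labels):
        groups.setdefault(lab, []).append(idx)
    return list(groups.values())


clusters = correlation_clusters(X_train)
single_imp = {j: shuffled_importance([j]) for j in range(X.shape[1])}
cluster_imp = {tuple(c): shuffled_importance(c) for c in clusters}

print("clusters:", clusters)
print("per-feature importance:", single_imp)
print("per-cluster importance:", cluster_imp)
```

In this setup, shuffling either redundant feature on its own leaves its near-duplicate intact, so each looks less important than the information they jointly carry. Shuffling the whole cluster at once removes that information entirely, which is the intuition behind measuring importance at the cluster level.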

Keywords: machine learning, feature importance, permutation importance, mean decrease accuracy

JEL Classification: G0, G1, G2, G15, G24, E44

Suggested Citation

López de Prado, Marcos, Clustered Feature Importance (Presentation Slides) (January 29, 2020). Available at SSRN: https://ssrn.com/abstract=3517595 or http://dx.doi.org/10.2139/ssrn.3517595

Marcos López de Prado (Contact Author)

Cornell University - Operations Research & Industrial Engineering

237 Rhodes Hall
Ithaca, NY 14853
United States

HOME PAGE: http://www.orie.cornell.edu

Abu Dhabi Investment Authority

211 Corniche Road
Abu Dhabi, Abu Dhabi PO Box 3600
United Arab Emirates

HOME PAGE: http://www.adia.ae

True Positive Technologies

NY
United States

HOME PAGE: http://www.truepositive.com
