Evidence in Favor of Weight of Evidence and Binning Transformations for Predictive Modeling
39 Pages Posted: 12 Sep 2011
Date Written: September 10, 2011
Weight of Evidence (WOE) transformation of categorical variables is a technique that credit scoring professionals have used for decades. This paper investigates whether the transformation improves predictive performance. For models without interaction terms, using WOE together with binning (discretization) of numeric variables improves predictive accuracy. Without binning, the improvement from adding WOE transformations to such models is marginal. This is consistent with the excellent results achieved on the Paralyzed Veteran Admin KDD 98 data, where the best performance was achieved using both WOE and binning.
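As a minimal illustration of the two transformations discussed above, the following Python sketch bins a numeric variable into equal-frequency bins and then replaces each bin with its WOE value. It is a sketch under stated assumptions, not the paper's implementation: the sign convention (log of the share of events over the share of non-events), the additive smoothing constant, the helper names, and the toy data are all hypothetical.

```python
import bisect
import math
from collections import defaultdict


def quantile_edges(values, n_bins):
    """Return n_bins - 1 cut points that split the values into
    roughly equal-frequency bins (assumed binning scheme)."""
    ordered = sorted(values)
    return [ordered[len(ordered) * k // n_bins] for k in range(1, n_bins)]


def assign_bin(value, edges):
    """Return the index of the bin that the value falls into."""
    return bisect.bisect_right(edges, value)


def woe_table(levels, targets, smoothing=0.5):
    """Map each level to ln(share of events / share of non-events).

    The smoothing constant is an assumption added here so that levels
    with zero events or zero non-events still get a finite value.
    """
    events = defaultdict(float)
    non_events = defaultdict(float)
    for level, y in zip(levels, targets):
        if y == 1:
            events[level] += 1
        else:
            non_events[level] += 1

    distinct = set(events) | set(non_events)
    total_events = sum(events.values()) + smoothing * len(distinct)
    total_non_events = sum(non_events.values()) + smoothing * len(distinct)

    table = {}
    for level in distinct:
        p_event = (events[level] + smoothing) / total_events
        p_non_event = (non_events[level] + smoothing) / total_non_events
        table[level] = math.log(p_event / p_non_event)
    return table


# Toy data (hypothetical): a numeric predictor and a binary target.
ages = [22, 25, 31, 38, 45, 52, 60, 67]
targets = [1, 1, 1, 0, 1, 0, 0, 0]

edges = quantile_edges(ages, n_bins=4)
bins = [assign_bin(a, edges) for a in ages]

# Binning followed by WOE: each observation is replaced by the WOE of its bin.
table = woe_table(bins, targets)
encoded = [table[b] for b in bins]
```

The same `woe_table` helper applies directly to a categorical variable by passing its raw levels instead of bin indices, which corresponds to using WOE without binning.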
For models with interaction terms, the WOE transform improves model performance on 2 out of 3 data sets; on the third data set, performance is the same with or without WOE. WOE tends to improve the performance of logistic regression and I* tuned logistic regression while degrading random forest performance slightly. Models built with WOE and the I* algorithm thus reach peak predictive performance, achieving area under the curve competitive with winning KDD benchmarks.
The combination of WOE and binning reduces performance for models with interaction terms. This makes sense in retrospect, as binning variables results in loss of information about interactions among continuous variables.
WOE and binning thus improve model performance when used together in models without interactions, as practitioners have long held. However, when interaction effects exist in the data, they are more predictive than WOE and binning, and WOE should be used alone, because binning can erode the predictive power of the interaction effects. Interactions exist in a data set when random forest outperforms logistic regression out of the box (Sharma, 2011b).