Improving Naïve Bayes Classifiers Using Interaction and Transformations Of Variables
18 Pages Posted: 3 Oct 2011
Date Written: October 2, 2011
An algorithm is proposed that generates functionally dependent variable transformations and interactions and combines random forest variable selection with naïve Bayes. The algorithm substantively improves naïve Bayes performance and offers a way to factor in explicit variables that represent interactions between variables, recovering information lost to the conditional independence assumption. Adding interaction effects and transformations improved out-of-sample performance on 3 credit data sets by 5%, but still did not match the performance of I* or of logistic regression with tuned interactions. This is expected, as credit data is notorious for correlated variables, which violate the independence assumption.
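The pipeline outlined in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the specific transformations (log and square), the use of pairwise products as interactions, the Gaussian naïve Bayes variant, the synthetic data, and the cutoff of 12 selected variables are all assumptions chosen for illustration. The helper names `expand_features` and `select_with_forest` are hypothetical.

```python
from itertools import combinations

import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB


def expand_features(X):
    """Append illustrative transformations and pairwise interactions.

    Assumed choices: log(1 + |x|), x squared, and products x_i * x_j.
    """
    logs = np.log1p(np.abs(X))
    squares = X ** 2
    pairs = [X[:, i] * X[:, j] for i, j in combinations(range(X.shape[1]), 2)]
    return np.column_stack([X, logs, squares] + pairs)


def select_with_forest(X, y, k, seed=0):
    """Return the indices of the k columns ranked highest by a random forest."""
    forest = RandomForestClassifier(n_estimators=100, random_state=seed)
    forest.fit(X, y)
    return np.argsort(forest.feature_importances_)[::-1][:k]


# Synthetic stand-in for a credit data set (assumption: not the paper's data).
X, y = make_classification(
    n_samples=1000, n_features=8, n_informative=4, n_redundant=2, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# Generate candidate variables, then select on the training split only.
Z_train, Z_test = expand_features(X_train), expand_features(X_test)
keep = select_with_forest(Z_train, y_train, k=12)

# Compare naive Bayes on the raw variables with naive Bayes on the selection.
baseline = GaussianNB().fit(X_train, y_train)
augmented = GaussianNB().fit(Z_train[:, keep], y_train)

baseline_acc = baseline.score(X_test, y_test)
augmented_acc = augmented.score(Z_test[:, keep], y_test)
print(f"baseline accuracy:  {baseline_acc:.3f}")
print(f"augmented accuracy: {augmented_acc:.3f}")
```

Selection is fitted on the training split only so that the out-of-sample comparison is not contaminated by the test data. Whether the augmented model beats the baseline depends on the data; the sketch makes no claim about reproducing the 5% figure reported above.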
Keywords: naive bayes, random forest, interactions, transformations, classification, I*, logistic regression