A Simple Experiment to Test the Interaction Effects Hypothesis for Random Forests and Logistic Regression
7 Pages. Posted: 22 Dec 2011. Last revised: 2 Jan 2012.
Date Written: December 22, 2011
This article discusses how to build logistic regression models that take interaction effects into account. It examines the interaction effects hypothesis for ensemble methods, which states that random forests and other ensemble methods dominate logistic regression because they optimally detect and use predictive interaction effects in the data. The hypothesis is supported by a simple, intuitive experiment on synthetic data. Its practical importance is this: when a random forest outperforms logistic regression, the variables contain predictive interaction effects that can be added to the logistic regression; when it does not, the data contain no predictive interaction effects. This is extremely useful for predictive modeling and statistical analysis.
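The idea of adding an interaction effect to a logistic regression can be illustrated with a minimal sketch. This is not the paper's experiment: the data-generating rule (the label depends only on whether `x1` and `x2` share a sign), the sample sizes, the learning rate, and every function name below are assumptions made for illustration. The random forest side of the comparison is omitted; the sketch only contrasts a main-effects logistic regression with one that also receives the product term `x1 * x2`, using the Python standard library.

```python
import math
import random


def make_data(n, rng):
    """Hypothetical synthetic data: y = 1 when x1 and x2 share a sign.

    The label depends only on the interaction of x1 and x2, so neither
    variable is predictive on its own.
    """
    rows = []
    for _ in range(n):
        x1, x2 = rng.uniform(-1, 1), rng.uniform(-1, 1)
        rows.append((x1, x2, 1 if x1 * x2 > 0 else 0))
    return rows


def features(x1, x2, with_interaction):
    # Intercept and main effects, optionally followed by the product term.
    f = [1.0, x1, x2]
    if with_interaction:
        f.append(x1 * x2)
    return f


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(X, y, lr=1.0, epochs=500):
    """Fit weights by plain batch gradient descent on the mean log-loss."""
    w = [0.0] * len(X[0])
    n = len(X)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            err = sigmoid(sum(a * b for a, b in zip(w, xi))) - yi
            for j, a in enumerate(xi):
                grad[j] += err * a
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w


def accuracy(w, X, y):
    hits = 0
    for xi, yi in zip(X, y):
        pred = 1 if sum(a * b for a, b in zip(w, xi)) > 0 else 0
        hits += pred == yi
    return hits / len(y)


def run_experiment(seed=0, n_train=400, n_test=400):
    """Return test accuracy keyed by whether the interaction term was used."""
    rng = random.Random(seed)
    train, test = make_data(n_train, rng), make_data(n_test, rng)
    y_train = [label for _, _, label in train]
    y_test = [label for _, _, label in test]
    results = {}
    for with_interaction in (False, True):
        X_train = [features(a, b, with_interaction) for a, b, _ in train]
        X_test = [features(a, b, with_interaction) for a, b, _ in test]
        w = fit_logistic(X_train, y_train)
        results[with_interaction] = accuracy(w, X_test, y_test)
    return results


if __name__ == "__main__":
    for used, acc in run_experiment().items():
        print(f"interaction term included={used}: test accuracy={acc:.3f}")
```

Under these assumptions the main-effects model should score near chance, while the model with the product term should separate the classes well. In the workflow the abstract describes, a random forest fitted to the same data would serve as the signal that such a term is worth looking for.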
Keywords: random forest, interaction effects, logistic regression