Influencing Elections with Statistics: Targeting Voters with Logistic Regression Trees

27 Pages Posted: 6 Mar 2012

Thomas Rusch

Vienna University of Economics and Business

Ilro Lee

UNSW Business School

Kurt Hornik

Vienna University of Economics and Business

Wolfgang Jank

University of Maryland - Decision and Information Technologies Department

Achim Zeileis

University of Innsbruck

Date Written: March 6, 2012

Abstract

Political campaigning has become a multi-million dollar business. A substantial proportion of a campaign's budget is spent on voter targeting, i.e., identifying as many potential voters as possible and persuading them to vote. Campaigns apply statistical tools to the available data to decide whom to target. While the available data are usually rich, campaigns have relied on a rather limited selection, often including only previous voting behavior and one or two demographic variables. Statistical procedures currently used in voter targeting include logistic regression and simple tree methods like CHAID, but there is growing interest in modern data mining approaches. In line with this development, we propose a novel framework for voter targeting, "Logistic Regression Trees" (LORET). LORET are trees (which may consist of just a single root node) containing a logistic regression (which may have just an intercept) in every leaf. Thus, they contain logistic regression and classification trees as special cases but also allow for a synthesis of both techniques. We explore various flavors of LORET that (a) employ either a reduced or the full set of available variables and (b) structure these variables into regressors in the logistic model components and/or partitioning variables in the tree components. To assess and illustrate the approach, we apply these LORET versions to a data set of 19,634 possible voters from the 2004 US presidential election. We find that employing more predictor variables clearly improves predictive accuracy, with the best results for methods that employ tree induction. Intelligible models are built by LORET with the reduced set of variables as regressors in each leaf. Furthermore, the synthesis of logistic regression and trees leads to models that have low overall cost for a high benefit of convincing non-voters to turn out.
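
To make the structure described in the abstract concrete, the following is a minimal sketch of a one-split logistic regression tree: a partitioning variable z selects a leaf, and each leaf holds its own logistic regression on the regressors x. Everything in it is an assumption made for illustration and is not taken from the paper: the function names, the synthetic data, the gradient-descent fit, and the exhaustive search over candidate thresholds, which stands in for whatever tree induction procedure the authors use.

```python
import math
import random


def sigmoid(t):
    # Numerically stable logistic function.
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def linear(w, x):
    # w[0] is the intercept, w[1:] are the regressor coefficients.
    return w[0] + sum(wj * xj for wj, xj in zip(w[1:], x))


def fit_logistic(X, y, lr=0.5, epochs=200):
    # Batch gradient descent on the mean log-loss (illustrative, not tuned).
    w = [0.0] * (len(X[0]) + 1)
    n = len(X)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            err = sigmoid(linear(w, xi)) - yi
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w


def log_loss(w, X, y):
    # Summed negative log-likelihood, with probabilities clipped away from 0 and 1.
    total = 0.0
    for xi, yi in zip(X, y):
        p = min(max(sigmoid(linear(w, xi)), 1e-12), 1.0 - 1e-12)
        total -= math.log(p) if yi == 1 else math.log(1.0 - p)
    return total


def fit_loret_stump(X, y, z, min_leaf=30, n_candidates=9):
    # Hypothetical induction rule: try quantile thresholds on z and keep the
    # split whose two leaf models give the lowest in-sample log-loss.
    # In-sample loss nearly always favors splitting, so a real procedure
    # would need a proper stopping rule; none is implemented here.
    root = fit_logistic(X, y)
    best = {"split": None, "root": root, "loss": log_loss(root, X, y)}
    zs = sorted(z)
    for k in range(1, n_candidates + 1):
        thr = zs[k * len(zs) // (n_candidates + 1)]
        left = [i for i, zi in enumerate(z) if zi <= thr]
        right = [i for i, zi in enumerate(z) if zi > thr]
        if len(left) < min_leaf or len(right) < min_leaf:
            continue
        XL, yL = [X[i] for i in left], [y[i] for i in left]
        XR, yR = [X[i] for i in right], [y[i] for i in right]
        wl, wr = fit_logistic(XL, yL), fit_logistic(XR, yR)
        loss = log_loss(wl, XL, yL) + log_loss(wr, XR, yR)
        if loss < best["loss"]:
            best = {"split": thr, "root": root, "left": wl, "right": wr, "loss": loss}
    return best


def predict(model, x, z):
    # Route the observation to a leaf by z, then apply that leaf's regression.
    if model["split"] is None:
        w = model["root"]
    elif z <= model["split"]:
        w = model["left"]
    else:
        w = model["right"]
    return sigmoid(linear(w, x))


def make_demo_data(n=300, seed=0):
    # Synthetic data (not the paper's): the effect of x flips sign with z.
    rng = random.Random(seed)
    X, y, z = [], [], []
    for _ in range(n):
        zi, xi = rng.uniform(-1, 1), rng.uniform(-2, 2)
        slope = 3.0 if zi < 0 else -3.0
        X.append([xi])
        z.append(zi)
        y.append(1 if rng.random() < sigmoid(slope * xi) else 0)
    return X, y, z


if __name__ == "__main__":
    X, y, z = make_demo_data()
    model = fit_loret_stump(X, y, z)
    print("split on z at", model["split"])
    print("left leaf coefficients", model.get("left"))
    print("right leaf coefficients", model.get("right"))
```

On these synthetic data a single global logistic regression sees almost no effect of x, because the two regimes cancel, while the leaf models recover slopes of opposite sign. The two special cases named in the abstract fall out of the same structure: with no split the model is a plain logistic regression, and with intercept-only leaves it is a classification tree.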

Suggested Citation

Rusch, Thomas and Lee, Ilro and Hornik, Kurt and Jank, Wolfgang and Zeileis, Achim, Influencing Elections with Statistics: Targeting Voters with Logistic Regression Trees (March 6, 2012). Available at SSRN: https://ssrn.com/abstract=2016956 or http://dx.doi.org/10.2139/ssrn.2016956

Thomas Rusch

Vienna University of Economics and Business

Welthandelsplatz 1
1020

Ilro Lee

UNSW Business School

High St
Sydney, NSW 2052
Australia

Kurt Hornik

Vienna University of Economics and Business

Welthandelsplatz 1
1020

Wolfgang Jank (Contact Author)

University of Maryland - Decision and Information Technologies Department

Robert H. Smith School of Business
4300 Van Munching Hall
College Park, MD 20742
United States
301-405-1118 (Phone)

HOME PAGE: http://www.smith.umd.edu/faculty/wjank/

Achim Zeileis

University of Innsbruck
