
A Default Prior Distribution for Logistic and Other Regression Models

Andrew Gelman
Columbia University - Department of Statistics and Department of Political Science

Aleks Jakulin
Columbia University - Department of Statistics; Institute for Social and Economic Research and Policy

Yu-Sung Su
Applied Statistics Center, Columbia University

Maria Grazia Pittau
Sapienza University of Rome - Department of SPSA


August 3, 2007


Abstract:
We propose a new prior distribution for classical (non-hierarchical) logistic regression models, constructed by first scaling all nonbinary variables to have mean 0 and standard deviation 0.5, and then placing independent Student-t prior distributions on the coefficients. As a default choice, we recommend the Cauchy distribution with center 0 and scale 2.5, which in the simplest setting is a longer-tailed version of the distribution attained by assuming one-half additional success and one-half additional failure in a logistic regression. We implement a procedure to fit generalized linear models in R with this prior distribution by incorporating an approximate EM algorithm into the usual iteratively weighted least squares. We illustrate with several examples, including a series of logistic regressions predicting voting preferences, an imputation model for a public health data set, and a hierarchical logistic regression in epidemiology.
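The two steps described above, rescaling and then placing a Cauchy prior, can be sketched in a few lines. This is only an illustration of the recipe as stated in the abstract, not the authors' R implementation. The function names (`rescale`, `cauchy_log_density`, `log_prior`) are hypothetical, and the use of the population standard deviation is an assumption, since the abstract does not say which estimator is used. Binary inputs are not handled here because the abstract does not describe their treatment.

```python
import math
import statistics


def rescale(values):
    """Shift and scale a nonbinary input to mean 0 and standard deviation 0.5.

    Assumption: the population standard deviation is used.
    """
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    if sd == 0:
        raise ValueError("a constant input cannot be rescaled")
    return [0.5 * (v - mean) / sd for v in values]


def cauchy_log_density(beta, center=0.0, scale=2.5):
    """Log density of a Cauchy distribution; defaults follow the abstract."""
    z = (beta - center) / scale
    return -math.log(math.pi * scale) - math.log1p(z * z)


def log_prior(coefs):
    """Independent Cauchy(0, 2.5) priors: the log densities simply add up."""
    return sum(cauchy_log_density(b) for b in coefs)


# Hypothetical input values, for illustration only.
ages = [23, 35, 41, 52, 67]
scaled = rescale(ages)
```

Because every nonbinary input ends up on the same scale, a single prior scale of 2.5 can be applied to all coefficients without tuning it per variable.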

We recommend this default prior distribution for routine applied use. It has the advantage of always giving answers, even when there is complete separation in logistic regression (a common problem, even when the sample size is large and the number of predictors is small) and also automatically applying more shrinkage to higher-order interactions. This can be useful in routine data analysis as well as in automated procedures such as chained equations for missing-data imputation.
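The claim about complete separation can be illustrated with a toy example. In the sketch below, the four data points are invented, the model has a single coefficient and no intercept, and the posterior mode is found by bisection on the gradient of the log posterior. Bisection is a simplification chosen for brevity; it is not the approximate EM algorithm within iteratively weighted least squares that the paper describes.

```python
import math

# Invented, perfectly separated data: y = 1 exactly when x > 0.
X = [-1.0, -0.5, 0.5, 1.0]
Y = [0, 0, 1, 1]
SCALE = 2.5  # Cauchy prior scale recommended in the abstract


def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))


def log_likelihood(beta):
    """Logistic log-likelihood for a single coefficient, no intercept."""
    total = 0.0
    for x, y in zip(X, Y):
        t = beta * x if y == 1 else -beta * x
        # log(sigmoid(t)), written to stay stable for large |t|
        if t >= 0:
            total += -math.log1p(math.exp(-t))
        else:
            total += t - math.log1p(math.exp(t))
    return total


def posterior_gradient(beta):
    """Derivative of log-likelihood plus Cauchy(0, SCALE) log prior."""
    lik = sum(x * (y - sigmoid(beta * x)) for x, y in zip(X, Y))
    prior = -2.0 * beta / (SCALE ** 2 + beta ** 2)
    return lik + prior


def posterior_mode(lo=0.0, hi=50.0, tol=1e-10):
    """Bisection for a zero of the gradient; assumes a sign change on [lo, hi]."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if posterior_gradient(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


mode = posterior_mode()
print("posterior mode:", mode)
print("log-likelihood at beta = 1, 10, 50:",
      log_likelihood(1.0), log_likelihood(10.0), log_likelihood(50.0))
```

On this data the log-likelihood keeps increasing as the coefficient grows, so the maximum likelihood estimate does not exist, while the prior pulls the posterior mode back to a finite value.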

Keywords: Bayesian inference, generalized linear model, least squares, hierarchical model, linear regression, logistic regression, multilevel model, noninformative prior distribution

Working Paper Series

Date posted: September 11, 2007; Last revised: September 11, 2007

Suggested Citation

Gelman, Andrew, Jakulin, Aleks, Su, Yu-Sung and Pittau, Maria Grazia, A Default Prior Distribution for Logistic and Other Regression Models (August 3, 2007). Available at SSRN: http://ssrn.com/abstract=1010421



Contact Information

Andrew Gelman (Contact Author)
Columbia University - Department of Statistics and Department of Political Science
New York, NY 10027
United States
212-854-4883 (Phone)
212-663-2454 (Fax)

Aleks Jakulin
Columbia University - Department of Statistics
Mail Code 4403
New York, NY 10027
United States
Institute for Social and Economic Research and Policy
Columbia University in the City of New York
420 West 118th Street, 8th Floor, Mail Code 3355
New York City, NY 10027
United States

Maria Grazia Pittau
Sapienza University of Rome - Department of SPSA
P.le Aldo Moro 5
Rome, RM 00185
Italy

Yu-Sung Su
Applied Statistics Center, Columbia University
New York, NY 10025
United States
HOME PAGE: http://www.stat.columbia.edu/~yusung


