Estimating Log Models: To Transform or Not to Transform?

44 Pages. Posted: 9 Mar 2000. Last revised: 4 Sep 2024

Willard G. Manning

University of Chicago - Harris School of Public Policy

John Mullahy

University of Wisconsin - Madison - Department of Population Health Sciences; National Bureau of Economic Research (NBER)

Date Written: November 1999

Abstract

Data on health care expenditures, length of stay, utilization of health services, consumption of unhealthy commodities, etc. are typically characterized by: (a) nonnegative outcomes; (b) nontrivial fractions of zero outcomes in the population (and sample); and (c) positively skewed distributions of the nonzero realizations. Similar data structures are encountered in labor economics as well. This paper provides simulation-based evidence on the finite-sample behavior of two sets of estimators designed to look at the effect of a set of covariates x on the expected outcome, E(y|x), under a range of data problems encountered in everyday practice: generalized linear models (GLM), a subset of which can simply be viewed as differentially weighted nonlinear least-squares estimators, and those derived from least-squares estimators for ln(y). We consider the first- and second-order behavior of these candidate estimators under alternative assumptions on the data generating processes. Our results indicate that the choice of estimator for models of ln(E(y|x)) can have major implications for empirical results if the estimator is not designed to deal with the specific data generating mechanism. Garden-variety statistical problems (skewness, kurtosis, and heteroscedasticity) can lead to an appreciable bias for some estimators or appreciable losses in precision for others.
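
The contrast between the two estimator families can be illustrated with a small simulation. The sketch below is not the paper's simulation design; the data generating process, the parameter values, and the function names (`simulate`, `ols_log`, `glm_log_link`) are all assumptions chosen for illustration. It assumes strictly positive outcomes y = exp(1 + x + e) with e normal and log-scale variance 0.5 + hetero * x, so that ln(E(y|x)) = 1.25 + (1 + hetero/2) x. With hetero = 1 the slope of ln(E(y|x)) is 1.5, while least squares on ln(y) targets the slope of E(ln(y)|x), which is 1.0. The GLM here is a Poisson-type pseudo-likelihood estimator with a log link, one member of the GLM family, fitted by Newton steps.

```python
import numpy as np


def simulate(n, rng, hetero=0.0):
    """Draw (x, y) with y = exp(1 + x + e), e ~ N(0, 0.5 + hetero * x).

    Assumed illustrative design: the log-scale error variance depends on x
    when hetero > 0, which makes ln(E(y|x)) differ in slope from E(ln(y)|x).
    """
    x = rng.uniform(0.0, 1.0, n)
    sigma = np.sqrt(0.5 + hetero * x)
    e = rng.normal(0.0, sigma)
    y = np.exp(1.0 + 1.0 * x + e)
    return x, y


def ols_log(x, y):
    """Least squares of ln(y) on a constant and x; returns (intercept, slope)."""
    X = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(X, np.log(y), rcond=None)
    return beta


def glm_log_link(x, y, iters=25):
    """Poisson-type pseudo-likelihood GLM with log link, fitted by Newton steps.

    Solves the estimating equations X'(y - exp(X b)) = 0, which target
    ln(E(y|x)) directly and need no retransformation.
    """
    X = np.column_stack([np.ones_like(x), x])
    beta = np.array([np.log(y.mean()), 0.0])  # start at the constant-mean fit
    for _ in range(iters):
        mu = np.exp(X @ beta)
        score = X.T @ (y - mu)
        info = X.T @ (mu[:, None] * X)
        beta = beta + np.linalg.solve(info, score)
    return beta


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x, y = simulate(200_000, rng, hetero=1.0)
    print("OLS on ln(y), slope:", ols_log(x, y)[1])
    print("GLM log link, slope:", glm_log_link(x, y)[1])
```

Under this assumed design, the log-scale least-squares slope is a consistent estimate of the effect on E(ln(y)|x) but not of the effect on ln(E(y|x)) unless the heteroscedasticity is modeled in the retransformation. The sketch says nothing about relative precision, which is the other half of the trade-off the abstract describes.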

Suggested Citation

Manning, Willard G. and Mullahy, John, Estimating Log Models: To Transform or Not to Transform? (November 1999). NBER Working Paper No. t0246, Available at SSRN: https://ssrn.com/abstract=196329

Willard G. Manning

University of Chicago - Harris School of Public Policy

1155 East 60th Street
Chicago, IL 60637
United States
(773) 834-1971 (Phone)
(773) 702-1979 (Fax)

John Mullahy (Contact Author)

University of Wisconsin - Madison - Department of Population Health Sciences

610 Walnut St
Madison, WI 53726
United States
608-265-5410 (Phone)
608-263-4885 (Fax)

National Bureau of Economic Research (NBER)

1050 Massachusetts Avenue
Cambridge, MA 02138
United States
608-265-5410 (Phone)
608-263-4885 (Fax)