Optimal Data Collection for Randomized Control Trials

56 pages. Posted: 9 May 2016

Pedro Manuel Carneiro

University College London - Department of Economics; IZA Institute of Labor Economics

Sokbae Lee

University College London

Daniel Wilhelm

University College London

Abstract

In a randomized control trial, the precision of an average treatment effect estimator can be improved either by collecting data on additional individuals, or by collecting additional covariates that predict the outcome variable. We propose the use of pre-experimental data such as a census, or a household survey, to inform the choice of both the sample size and the covariates to be collected. Our procedure seeks to minimize the resulting average treatment effect estimator's mean squared error, subject to the researcher's budget constraint. We rely on a modification of an orthogonal greedy algorithm that is conceptually simple and easy to implement in the presence of a large number of potential covariates, and does not require any tuning parameters. In two empirical applications, we show that our procedure can lead to substantial gains of up to 58%, measured either in terms of reductions in data collection costs or in terms of improvements in the precision of the treatment effect estimator.
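The trade-off described in the abstract can be illustrated with a minimal sketch. It is not the authors' procedure: it runs a plain orthogonal greedy (forward) selection on pre-experimental data and, at each step, evaluates residual variance divided by the affordable sample size as a stand-in for the estimator's mean squared error. The function name `greedy_design`, the parameters `costs`, `budget`, and `base_cost`, and the cost model (each surveyed individual costs `base_cost` plus the cost of every selected covariate) are all assumptions made for illustration.

```python
import numpy as np


def greedy_design(X, y, costs, budget, base_cost=1.0):
    """Choose covariates and a sample size from pre-experimental data.

    Hypothetical cost model (an assumption, not taken from the paper):
    surveying one individual costs base_cost plus the cost of every
    selected covariate, and costs are non-negative.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    costs = np.asarray(costs, dtype=float)
    p = X.shape[1]

    # Center the data so that no intercept column is needed.
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    norms = np.linalg.norm(Xc, axis=0)
    norms[norms == 0] = np.inf  # constant columns get a score of zero

    available = np.ones(p, dtype=bool)
    selected = []
    residual = yc.copy()

    # Baseline design: no covariates, largest affordable sample.
    n = int(budget // base_cost)
    if n < 1:
        raise ValueError("budget does not cover a single individual")
    best = {"covariates": [], "n": n, "objective": residual.var() / n}

    for _ in range(p):
        # Greedy step: pick the unused covariate most correlated
        # with the current residual.
        score = np.abs(Xc.T @ residual) / norms
        score[~available] = -np.inf
        j = int(np.argmax(score))
        selected.append(j)
        available[j] = False

        # Orthogonal step: refit least squares on all selected covariates.
        beta, *_ = np.linalg.lstsq(Xc[:, selected], yc, rcond=None)
        residual = yc - Xc[:, selected] @ beta

        # More covariates raise the per-individual cost and shrink the sample.
        unit_cost = base_cost + costs[selected].sum()
        n = int(budget // unit_cost)
        if n < 1:
            break  # later steps only cost more, so stop here

        objective = residual.var() / n
        if objective < best["objective"]:
            best = {"covariates": list(selected), "n": n, "objective": objective}

    return best
```

Calling `greedy_design(X, y, costs, budget)` on a pilot data set returns the covariate indices, sample size, and objective value of the best design found along the greedy path; because the best step is tracked over the whole path, this sketch needs no stopping threshold.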

Keywords: randomized control trials, big data, data collection, optimal survey design, orthogonal greedy algorithm, survey costs

JEL Classification: C55, C81

Suggested Citation

Carneiro, Pedro Manuel and Lee, Sokbae and Wilhelm, Daniel, Optimal Data Collection for Randomized Control Trials. IZA Discussion Paper No. 9908. Available at SSRN: https://ssrn.com/abstract=2776913

Pedro Manuel Carneiro (Contact Author)

University College London - Department of Economics

Gower Street
London WC1E 6BT
United Kingdom

IZA Institute of Labor Economics

P.O. Box 7240
Bonn, D-53072
Germany

Sokbae Lee

University College London

Gower Street
London
United Kingdom

Daniel Wilhelm

University College London

UCL Economics
30 Gordon Street
London, WC1H 0AX
United Kingdom
