Inference with Arbitrary Clustering

34 Pages Posted: 9 Sep 2019

Fabrizio Colella

University of Lausanne - Department of Economics (DEEP); Fondazione Rodolfo Debenedetti

Rafael Lalive

University of Lausanne - Department of Economics (DEEP); IZA Institute of Labor Economics; CESifo (Center for Economic Studies and Ifo Institute for Economic Research)

Seyhun Orcan Sakalli

King's College London - King's Business School

Mathias Thoenig

University of Lausanne; Centre for Economic Policy Research (CEPR)

Abstract

Analyses of spatial or network data are now very common. Yet statistical inference is challenging, since unobserved heterogeneity can be correlated across neighboring observational units. We develop an estimator for the variance-covariance matrix (VCV) of OLS and 2SLS that allows for arbitrary dependence of the errors across observations in space or in a network structure, and across time periods. As a proof of concept, we conduct Monte Carlo simulations in a geospatial setting based on US metropolitan areas; tests based on our estimator of the VCV reject the null hypothesis at the correct rate asymptotically, whereas conventional inference methods, e.g., those without clusters or with clusters based on administrative units, reject the null hypothesis too often. We also provide simulations in a network setting based on the IDEAS structure of co-authorship and real-life data on scientific performance; the Monte Carlo results again show that our estimator yields inference at the right significance level already in moderately sized samples, and that it dominates other commonly used approaches to inference in networks. We provide guidance to the applied researcher with respect to (i) whether to include potentially correlated regressors and (ii) the choice of cluster bandwidth. Finally, we provide a companion statistical package (acreg) that enables users to adjust the standard errors of OLS and 2SLS coefficients to account for arbitrary dependence.
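
To make the idea of "arbitrary dependence" concrete, the following is a minimal sketch of a sandwich-type VCV for OLS in which a user-supplied 0/1 matrix S marks which pairs of observations may have correlated errors. It is an illustration under stated assumptions, not the authors' estimator or the acreg implementation: the function name ols_arbitrary_cluster_vcv, the simulated data, and the distance cutoff of 1.0 are all hypothetical, and no small-sample correction or 2SLS step is included.

```python
import numpy as np

def ols_arbitrary_cluster_vcv(X, y, S):
    """OLS coefficients and a sandwich VCV with pairwise dependence given by S.

    S[i, j] = 1 if the errors of observations i and j are allowed to be
    correlated, 0 otherwise. S is assumed symmetric with ones on the diagonal.
    (Hypothetical helper for illustration only.)
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    S = np.asarray(S, dtype=float)
    XtX_inv = np.linalg.inv(X.T @ X)      # "bread" of the sandwich
    beta = XtX_inv @ X.T @ y              # OLS coefficients
    resid = y - X @ beta
    Xe = X * resid[:, None]               # row i is x_i * e_i
    meat = Xe.T @ S @ Xe                  # sum over linked pairs of x_i e_i e_j x_j'
    vcv = XtX_inv @ meat @ XtX_inv
    return beta, vcv

# Simulated geospatial example (all values are assumptions for illustration).
rng = np.random.default_rng(0)
n = 200
coords = rng.uniform(0, 10, size=(n, 2))
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = X @ np.array([1.0, 0.5]) + rng.normal(size=n)

# Link two observations when they lie within a distance cutoff of 1.0.
dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
S = (dist <= 1.0).astype(float)

beta, vcv = ols_arbitrary_cluster_vcv(X, y, S)
```

Two special cases may help in reading the sketch. With S equal to the identity matrix, the result reduces to the heteroskedasticity-robust (HC0) VCV; with S equal to one for pairs in the same group and zero otherwise, it matches the usual cluster-robust form without a degrees-of-freedom adjustment. With a hard 0/1 cutoff the matrix is not guaranteed to be positive semi-definite in finite samples, so the diagonal should be checked before taking square roots.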

Keywords: clustering, arbitrary, geospatial data, network data

JEL Classification: C13, C23, C26

Suggested Citation

Colella, Fabrizio and Lalive, Rafael and Sakalli, Seyhun Orcan and Thoenig, Mathias, Inference with Arbitrary Clustering. Available at SSRN: https://ssrn.com/abstract=3449578 or http://dx.doi.org/10.2139/ssrn.3449578

Fabrizio Colella (Contact Author)

University of Lausanne - Department of Economics (DEEP)

BFSH1
Lausanne, 1015
Switzerland

Fondazione Rodolfo Debenedetti

Via Roentgen 1,
Room 5.C1-11
Milan, Milano 20136
Italy

Rafael Lalive

University of Lausanne - Department of Economics (DEEP)

BFSH1
Lausanne, 1015
Switzerland

IZA Institute of Labor Economics

P.O. Box 7240
Bonn, D-53072
Germany

CESifo (Center for Economic Studies and Ifo Institute for Economic Research)

Poschinger Str. 5
Munich, DE-81679
Germany

Seyhun Orcan Sakalli

King's College London - King's Business School

150 Stamford Street
London, SE1 9NH
United Kingdom

Mathias Thoenig

University of Lausanne

Centre for Economic Policy Research (CEPR)

London
United Kingdom
