The More the Merrier? A Machine Learning Algorithm for Optimal Pooling of Panel Data

22 Pages Posted: 24 Apr 2020

Marijn Bolhuis

University of Toronto, Department of Economics, Students

Brett Rayner

George Washington University

Date Written: February 2020

Abstract

We leverage insights from machine learning to optimize the trade-off between bias and variance when estimating economic models using pooled datasets. Specifically, we develop a simple algorithm that estimates the similarity of economic structures across countries and selects the optimal pool of countries to maximize out-of-sample prediction accuracy of a model. We apply the new algorithm by nowcasting output growth with a panel of 102 countries and are able to significantly improve forecast accuracy relative to alternative pools. The algorithm improves nowcast performance for advanced economies, as well as emerging market and developing economies, suggesting that machine learning techniques using pooled data could be an important macro tool for many countries.
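
To make the pooling idea concrete, here is a minimal Python sketch under stated assumptions. It is not the algorithm from the paper: it assumes similarity can be proxied by the distance between country-specific OLS coefficients, and it adds countries to the pool in order of similarity, keeping the pool with the lowest out-of-sample RMSE for the target country. The names `ols`, `select_pool`, and `simulate`, and the simulated countries "A", "B", and "C", are hypothetical and exist only for illustration.

```python
import numpy as np


def ols(X, y):
    # Least-squares coefficients for y = X @ beta + error.
    return np.linalg.lstsq(X, y, rcond=None)[0]


def select_pool(data, target, n_holdout):
    """Pick the pool of countries that minimizes holdout RMSE for `target`.

    data: dict mapping country name -> (X, y) arrays (hypothetical layout).
    The last `n_holdout` observations of the target are held out.
    """
    X_t, y_t = data[target]
    X_train, y_train = X_t[:-n_holdout], y_t[:-n_holdout]
    X_test, y_test = X_t[-n_holdout:], y_t[-n_holdout:]

    # Assumed similarity measure: distance between OLS coefficient vectors.
    beta_t = ols(X_train, y_train)
    others = [c for c in data if c != target]
    dist = {c: np.linalg.norm(ols(*data[c]) - beta_t) for c in others}
    ranked = sorted(others, key=dist.get)

    best_pool, best_rmse = [target], np.inf
    # Grow the pool one country at a time, most similar first.
    for k in range(len(ranked) + 1):
        pool = ranked[:k]
        X_pool = np.vstack([X_train] + [data[c][0] for c in pool])
        y_pool = np.concatenate([y_train] + [data[c][1] for c in pool])
        beta = ols(X_pool, y_pool)
        rmse = np.sqrt(np.mean((X_test @ beta - y_test) ** 2))
        if rmse < best_rmse:
            best_pool, best_rmse = [target] + pool, rmse
    return best_pool, best_rmse


# Simulated example: "B" shares the structure of "A", "C" does not.
rng = np.random.default_rng(0)


def simulate(beta, n):
    X = rng.normal(size=(n, 2))
    y = X @ np.array(beta) + 0.5 * rng.normal(size=n)
    return X, y


data = {
    "A": simulate([1.0, 2.0], 30),
    "B": simulate([1.0, 2.0], 100),
    "C": simulate([-5.0, 5.0], 100),
}
pool, rmse = select_pool(data, "A", n_holdout=10)
print(pool, rmse)
```

In this sketch, pooling with a structurally different country biases the estimate, while pooling with a similar one mainly reduces variance. Because the same holdout is used both to choose the pool and to report the error, the reported RMSE is optimistic; a separate validation window would be needed for an honest accuracy estimate.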

Keywords: Economic models, Production growth, Developing countries, Emerging markets, Data analysis, Machine learning, GDP growth, forecasts, panel data, pooling, WP, forecast error, DGP, forecast, economic structure, output growth

JEL Classification: C53, C45, E01, C5, L31, F16, C33

Suggested Citation

Bolhuis, Marijn and Rayner, Brett, The More the Merrier? A Machine Learning Algorithm for Optimal Pooling of Panel Data (February 2020). IMF Working Paper No. 20/44, Available at SSRN: https://ssrn.com/abstract=3583406

Marijn Bolhuis (Contact Author)

University of Toronto, Department of Economics, Students

150 St. George Street
Toronto, Ontario
Canada

Brett Rayner

George Washington University

2121 I Street NW
Washington, DC 20052
United States
