Opportunities and Challenges: Lessons from Analyzing Terabytes of Scanner Data

36 pages. Posted: 17 Aug 2017. Last revised: 9 Mar 2023.

Serena Ng

Columbia University - Columbia Business School, Economics

There are 2 versions of this paper.

Date Written: August 2017

Abstract

This paper seeks to better understand what makes big data analysis different, what we can and cannot do with existing econometric tools, and what issues need to be dealt with in order to work with the data efficiently. As a case study, I set out to extract any business cycle information that might exist in four terabytes of weekly scanner data. The main challenge is to handle the volume, variety, and characteristics of the data within the constraints of our computing environment. Scalable and efficient algorithms are available to ease the computational burden, but they often have unknown statistical properties and are not designed for efficient estimation or optimal inference. In addition, economic data have unique characteristics that generic algorithms may not accommodate. Because big data is likely here to stay, there is a need for computationally efficient econometric methods.

Suggested Citation

Ng, Serena, Opportunities and Challenges: Lessons from Analyzing Terabytes of Scanner Data (August 2017). NBER Working Paper No. w23673, Available at SSRN: https://ssrn.com/abstract=3018332

Serena Ng (Contact Author)

Columbia University - Columbia Business School, Economics

420 West 118th Street
New York, NY 10027
United States
