Defining Geographic Markets from Probabilistic Clusters: A Machine Learning Algorithm Applied to Supermarket Scanner Data

55 Pages. Posted: 1 Oct 2019

Stephen Bruestle

Federal Maritime Commission

Luca Pappalardo

Institute of Information Science and Technologies (ISTI), Consiglio Nazionale delle Ricerche (CNR)

Riccardo Guidotti

Institute of Information Science and Technologies (ISTI), Consiglio Nazionale delle Ricerche (CNR)

Date Written: September 19, 2019

Abstract

We propose estimating geographic markets in two steps. First, estimate clusters of transactions that are interchangeable in use. Second, estimate markets from these clusters. Drawing on both antitrust cases and economic intuition, we argue that these clusters are subsets of markets. We model and estimate these clusters using techniques from machine learning and data science. We model the clusters with Blei et al.’s (2003) Latent Dirichlet Allocation (LDA) model, and we estimate this model with Griffiths and Steyvers’s (2004) Gibbs Sampling algorithm (Gibbs LDA). We apply these ideas to a real-world example: transaction-level scanner data from the largest supermarket franchise in Italy. We find fourteen clusters. We present strong evidence that LDA fits the data, which shows that these interchangeability clusters exist in the marketplace. We then compare the Gibbs LDA clusters with clusters from the Elzinga-Hogarty (E-H) test and find that they are similar. LDA has a few identifiable parameters, whereas the E-H test has too many parameters for identification. Gibbs LDA also avoids the silent majority fallacy of the E-H test. Finally, we estimate markets from the Gibbs LDA clusters by applying consumption overlap and price stationarity tests to the clusters. We find four grocery markets in Tuscany.
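To make the estimation step concrete, the following is a minimal sketch of a collapsed Gibbs sampler for LDA in the spirit of Griffiths and Steyvers (2004). It is not the authors' implementation. The toy data, the function name `gibbs_lda`, the hyperparameters `alpha` and `beta`, and the choice of two clusters are all assumptions made for illustration; in particular, how the scanner data are turned into "bags" is not specified in the abstract, so each bag here is simply a hypothetical list of integer item ids.

```python
import random

def gibbs_lda(bags, n_topics, n_items, alpha=0.1, beta=0.01, n_iter=200, seed=0):
    """Collapsed Gibbs sampling for LDA on bags of integer item ids (illustrative)."""
    rng = random.Random(seed)
    bag_topic = [[0] * n_topics for _ in bags]             # n_{d,k}
    topic_item = [[0] * n_items for _ in range(n_topics)]  # n_{k,w}
    topic_total = [0] * n_topics                           # n_k
    assign = []                                            # cluster label of every token

    # Random initial assignment of each token to a cluster.
    for d, bag in enumerate(bags):
        labels = []
        for w in bag:
            k = rng.randrange(n_topics)
            labels.append(k)
            bag_topic[d][k] += 1
            topic_item[k][w] += 1
            topic_total[k] += 1
        assign.append(labels)

    for _ in range(n_iter):
        for d, bag in enumerate(bags):
            for i, w in enumerate(bag):
                # Remove the token's current assignment from the counts.
                k = assign[d][i]
                bag_topic[d][k] -= 1
                topic_item[k][w] -= 1
                topic_total[k] -= 1
                # Conditional probability of each cluster, up to a constant.
                weights = [
                    (bag_topic[d][j] + alpha)
                    * (topic_item[j][w] + beta)
                    / (topic_total[j] + n_items * beta)
                    for j in range(n_topics)
                ]
                # Resample the assignment and restore the counts.
                k = rng.choices(range(n_topics), weights=weights)[0]
                assign[d][i] = k
                bag_topic[d][k] += 1
                topic_item[k][w] += 1
                topic_total[k] += 1

    # Posterior mean estimates from the final state of the chain.
    theta = [
        [(c + alpha) / (len(bag) + n_topics * alpha) for c in row]
        for row, bag in zip(bag_topic, bags)
    ]
    phi = [
        [(c + beta) / (topic_total[k] + n_items * beta) for c in topic_item[k]]
        for k in range(n_topics)
    ]
    return theta, phi

# Hypothetical toy data: four bags over six item ids (0..5).
bags = [[0, 1, 2, 0, 1], [1, 2, 0, 2], [3, 4, 5, 3], [4, 5, 3, 5, 4]]
theta, phi = gibbs_lda(bags, n_topics=2, n_items=6)
for d, row in enumerate(theta):
    print(d, [round(p, 2) for p in row])
```

Each row of `theta` gives one bag's estimated mixture over clusters, and each row of `phi` gives one cluster's distribution over items. A single final sample is used here for brevity; averaging over several samples taken after burn-in would be a more careful choice.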

Keywords: defining markets, clustering, interchangeability of use, machine learning, Latent Dirichlet Allocation (LDA), Gibbs Sampling (Gibbs LDA), bags of products, Elzinga-Hogarty test, elbow method, sampling methods, consumption overlap, antitrust markets, economic markets, Markov Chain Monte Carlo (MCMC)

JEL Classification: L100, D400, C380, L400, C150

Suggested Citation

Bruestle, Stephen and Pappalardo, Luca and Guidotti, Riccardo, Defining Geographic Markets from Probabilistic Clusters: A Machine Learning Algorithm Applied to Supermarket Scanner Data (September 19, 2019). Available at SSRN: https://ssrn.com/abstract=3452058 or http://dx.doi.org/10.2139/ssrn.3452058

Stephen Bruestle (Contact Author)

Federal Maritime Commission

800 North Capitol Street, N.W.
Washington, DC 20573
United States

HOME PAGE: https://www.fmc.gov

Luca Pappalardo

Institute of Information Science and Technologies (ISTI), Consiglio Nazionale delle Ricerche (CNR)

Via Giuseppe Moruzzi, 1
Pisa
Italy

Riccardo Guidotti

Institute of Information Science and Technologies (ISTI), Consiglio Nazionale delle Ricerche (CNR)

Via Giuseppe Moruzzi, 1
Pisa
Italy
