"Big Data" on the Big Screen: Uncovering latent coherence among movies and its effect on box office performance
34 Pages. Posted: 4 Oct 2016. Last revised: 18 Sep 2017.
Date Written: September 15, 2017
This paper uses "big data" in the movie industry setting to uncover a novel factor, stemming from consumer preferences, and shows that it matters for movie box office performance. Recent technological advances, which have led to widespread digitization and an unprecedented amount of consumer data, provide an opportunity to observe consumer behavior in a way that was not previously possible. Using publicly available data from Amazon Video on individuals' movie rental behavior, I construct a measure, Latent Coherence, to capture similarity between movies based on the extent to which they are frequently rented together by the same individuals. I show that movies that are similar to others (i.e., frequently rented together with a coherent set of other movies), and therefore have a high measure of Latent Coherence, outperform those with a low measure at the box office. I then seek to disentangle the causal structure of the relationship between the measure and performance, in order to understand whether Latent Coherence is an inherent movie attribute that affects performance. Although the measure is constructed from data generated post-launch and post-box office (since consumer movie viewing behavior is observable on Amazon Video), I claim that Latent Coherence is a feature that is relevant before launch. My empirical strategy, which combines instrumental variable and control function techniques, provides evidence against two of the most plausible alternative mechanisms that could explain the relationship between Latent Coherence and performance: reverse causality, and movie quality as an unobservable common cause. These findings therefore support my interpretation of this measure as an inherent movie attribute that is relevant before launch and that matters for movie performance.
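The abstract does not give the formula for Latent Coherence, so the following is only an illustrative sketch of the underlying idea of measuring movie similarity from co-rental behavior. It assumes a hypothetical input (a mapping from user to the set of movies that user rented) and uses Jaccard similarity over renter sets, which is a stand-in chosen for illustration and not the paper's definition. The function name `co_rental_similarity` and the toy data are likewise assumptions.

```python
from collections import defaultdict
from itertools import combinations


def co_rental_similarity(rentals):
    """Pairwise movie similarity from co-rentals (illustrative only).

    rentals: dict mapping a user id to the set of movie ids that user rented.
    Returns a dict mapping (movie_a, movie_b), with movie_a < movie_b, to the
    Jaccard similarity of the two movies' renter sets. Pairs with no shared
    renter are omitted.
    """
    # Invert the input: for each movie, collect the users who rented it.
    renters = defaultdict(set)
    for user, movies in rentals.items():
        for movie in movies:
            renters[movie].add(user)

    similarity = {}
    for a, b in combinations(sorted(renters), 2):
        shared = len(renters[a] & renters[b])
        if shared:
            union = len(renters[a] | renters[b])
            similarity[(a, b)] = shared / union
    return similarity


# Hypothetical toy data: three users, three movies.
toy = {
    "u1": {"A", "B"},
    "u2": {"A", "B", "C"},
    "u3": {"C"},
}
sim = co_rental_similarity(toy)
for pair, score in sorted(sim.items()):
    print(pair, round(score, 3))
```

In this toy data, movies A and B are rented by exactly the same users and so receive the highest pairwise similarity. Any movie-level coherence score would need a further aggregation step over such pairwise values, which the abstract does not specify.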
Keywords: Movies, Big Data, Consumer Preferences