A Poisson Factorization Topic Model for the Study of Creative Documents (and Their Summaries)
62 Pages. Posted: 5 Mar 2019
Date Written: February 13, 2019
We propose a topic model tailored to the study of creative documents (e.g., academic papers, movie scripts). We extend Poisson Factorization in two ways. First, the creativity literature emphasizes the importance of novelty in creative industries. Accordingly, we introduce a set of residual topics that capture the portion of each document that is not explained by a combination of common topics. Second, creative documents are typically accompanied by summaries (e.g., abstracts, synopses). Accordingly, we jointly model the content of creative documents and their summaries, and capture systematic variations in topic intensities between the documents and their summaries. We validate and illustrate the model in three domains: marketing academic papers, movie scripts, and TV show closed captions. We illustrate how the joint modeling of documents and summaries provides some insight into how humans summarize creative documents, and enhances our understanding of the significance of each topic. We show that our model produces new measures of novelty which can inform the perennial debate on the relation between novelty and success in creative industries. Finally, we show how the proposed model may form the basis for decision support tools that assist humans in writing summaries of creative documents.
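To make the two extensions concrete, the following is a minimal generative sketch, not the paper's actual specification. It assumes gamma-distributed topic weights, one residual topic per document, and multiplicative topic-specific offsets linking a document's topic intensities to those of its summary. All names (`simulate_corpus`, `sample_poisson`, `offset`, `summary_scale`) and all hyperparameter values are hypothetical and chosen only for illustration.

```python
import math
import random


def sample_poisson(rate, rng):
    """Knuth's multiplication sampler; adequate for the small rates used here."""
    if rate <= 0.0:
        return 0
    threshold = math.exp(-rate)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def simulate_corpus(n_docs=5, n_words=30, n_topics=3, summary_scale=0.2, seed=0):
    """Draw word counts for documents and their summaries (illustrative only)."""
    rng = random.Random(seed)
    # Common topics: nonnegative word weights shared by all documents.
    beta = [[rng.gammavariate(0.3, 1.0) for _ in range(n_words)]
            for _ in range(n_topics)]
    # Intensity of each common topic in each document.
    theta = [[rng.gammavariate(0.3, 1.0) for _ in range(n_topics)]
             for _ in range(n_docs)]
    # Residual topic (assumed: one per document): content the common topics miss.
    residual = [[rng.gammavariate(0.1, 1.0) for _ in range(n_words)]
                for _ in range(n_docs)]
    # Assumed multiplicative offsets: how strongly each topic carries over
    # from a document into its summary.
    offset = [rng.gammavariate(2.0, 0.5) for _ in range(n_topics)]
    residual_offset = rng.gammavariate(2.0, 0.5)

    docs, summaries, novelty = [], [], []
    for d in range(n_docs):
        doc_rate, summary_rate = [], []
        for v in range(n_words):
            common = sum(theta[d][k] * beta[k][v] for k in range(n_topics))
            common_in_summary = sum(theta[d][k] * offset[k] * beta[k][v]
                                    for k in range(n_topics))
            doc_rate.append(common + residual[d][v])
            # Summaries are shorter, so their rates are scaled down.
            summary_rate.append(
                summary_scale * (common_in_summary + residual_offset * residual[d][v]))
        docs.append([sample_poisson(r, rng) for r in doc_rate])
        summaries.append([sample_poisson(r, rng) for r in summary_rate])
        # One possible novelty measure: share of the expected word count
        # attributed to the residual topic.
        novelty.append(sum(residual[d]) / sum(doc_rate))
    return docs, summaries, novelty
```

Under these assumptions, an `offset` value above 1 marks a topic that is over-represented in summaries relative to the full documents, and the per-document `novelty` share is one simple way a residual topic could be turned into a novelty measure. In practice the latent quantities would be inferred from observed counts rather than simulated.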
Keywords: Topic Models, Natural Language Processing, Creativity
JEL Classification: M31, C02