A Semantic Approach for Estimating Consumer Content Preferences from Online Search Queries
59 Pages. Posted: 19 Dec 2015. Last revised: 17 Apr 2019.
Date Written: August 7, 2018
We extend Latent Dirichlet Allocation (LDA) by introducing a topic model, Hierarchically Dual Latent Dirichlet Allocation (HDLDA), for contexts in which one type of document (e.g., search queries) is semantically related to another type of document (e.g., search results). In the context of online search engines, HDLDA identifies not only the topics in short search queries and webpages, but also how the topics in search queries relate to the topics in the corresponding top search results. The output of HDLDA provides a basis for estimating consumers’ content preferences on the fly from their search queries, given a set of assumptions about how consumers translate their content preferences into search queries. We apply HDLDA and explore its use in estimating content preferences in two studies. The first is a lab experiment in which we manipulate participants’ content preferences and observe the queries they formulate and their browsing behavior across different product categories. The second is a field study, which allows us to explore whether the content preferences estimated with HDLDA can be used to explain and predict click-through rates in online search advertising.
Keywords: search engine optimization, search engine marketing, search queries, content preferences, semantic relationships, topic modeling
JEL Classification: M31