Gremlins in the Data: Identifying the Information Content of Research Subjects
50 Pages. Posted: 16 Aug 2017. Last revised: 16 Nov 2017
Date Written: August 8, 2017
Empirical demand functions (based on experimental studies, such as choice-based conjoint) are critical to many aspects of marketing, such as targeting and segmentation, setting prices, and evaluating the potential of new products. While considerable work has been done on developing approaches for ensuring that research subjects are both honest and engaged, the reduced cost of collecting data online has led many studies to be conducted under conditions that leave researchers unsure of the value of the information provided by each subject. Objective measures of how the subject completes the study, such as latency (how quickly answers are given), can only be tied to other objective measures (such as the fit of the model or the consistency of the answers) and ultimately have a questionable relationship to the subject's utility function.
In response to this problem, we introduce a mixture modeling framework that clusters subjects based on variances in a choice-based setting (multinomial logit models). This model naturally groups subjects by the internal consistency of their answers. We argue that a higher level of internal consistency (hence lower variance) reflects more engaged consumers who have sufficient experience with the product category and choice task to have well-formed utilities. This approach provides an automated way of determining which consumers are relevant. We discuss the modeling framework and illustrate the methods using data from several commercial conjoint studies.
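The abstract does not specify the model, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes that "variance" means the logit error variance, which is inversely related to the scale parameter, and that subjects share known part-worths `beta`. It also assumes two classes with fixed scales and equal prior weights. All names and values (`choice_log_likelihood`, `class_posteriors`, `scales`, `weights`, the simulated design `X`) are hypothetical; in practice the part-worths, scales, and class weights would be estimated, for example by EM or MCMC.

```python
import numpy as np

def choice_log_likelihood(X, choices, beta, scale):
    """Log-likelihood of one subject's choices under an MNL with a given scale.

    X: (tasks, alternatives, attributes) design array.
    choices: (tasks,) index of the chosen alternative in each task.
    A larger scale corresponds to a smaller error variance.
    """
    utilities = scale * (X @ beta)                      # (tasks, alternatives)
    utilities -= utilities.max(axis=1, keepdims=True)   # numerical stability
    log_probs = utilities - np.log(np.exp(utilities).sum(axis=1, keepdims=True))
    return log_probs[np.arange(len(choices)), choices].sum()

def class_posteriors(X, choices, beta, scales, weights):
    """Posterior probability that a subject belongs to each scale class."""
    log_post = np.array([
        np.log(w) + choice_log_likelihood(X, choices, beta, s)
        for s, w in zip(scales, weights)
    ])
    log_post -= log_post.max()                          # avoid underflow
    post = np.exp(log_post)
    return post / post.sum()

# Illustrative (assumed) setup: 12 tasks, 3 alternatives, 4 attributes.
rng = np.random.default_rng(0)
X = rng.normal(size=(12, 3, 4))
beta = np.array([1.0, -1.0, 0.5, 2.0])   # assumed shared part-worths
scales = (0.1, 3.0)                      # class 0: high variance, class 1: low variance
weights = (0.5, 0.5)                     # assumed equal prior class weights

# A fully consistent subject always picks the highest-utility alternative.
engaged_choices = (X @ beta).argmax(axis=1)
# A disengaged subject is simulated here as answering uniformly at random.
random_choices = rng.integers(0, 3, size=12)

engaged_post = class_posteriors(X, engaged_choices, beta, scales, weights)
random_post = class_posteriors(X, random_choices, beta, scales, weights)
print("engaged subject, P(class):", engaged_post)
print("random subject,  P(class):", random_post)
```

Under these assumptions, the posterior weight on the low-variance class acts as a per-subject consistency score: subjects whose choices track the shared utilities load on the high-scale class, while subjects whose answers look like noise load on the low-scale class.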
Keywords: Multinomial Logit, Conjoint Analysis, Data Quality, Finite Mixture Models