Gremlins in the Data: Identifying the Information Content of Research Subjects

51 Pages. Posted: 16 Aug 2017. Last revised: 6 Aug 2019

John R. Howell

Brigham Young University - Marriott School of Business

Peter Ebbes

HEC Paris - Marketing

John Liechty

Pennsylvania State University, University Park

Porter Jenkins

Pennsylvania State University, Smeal College of Business, Students

Date Written: August 2019

Abstract

Empirical demand functions estimated from experimental studies, such as choice-based conjoint, are critical to many aspects of marketing, including targeting and segmentation, price setting, and evaluating the potential of new products. While considerable work has been done on developing approaches for ensuring that research subjects provide honest and thoughtful responses, the reduced cost of collecting data online has led many studies to be conducted under conditions that leave researchers unsure of the value of the information provided by each subject. Objective measures of how a subject completes the study, such as latency (how quickly answers are given), can only be tied to other objective measures (such as model fit or the consistency of answers), and ultimately have a questionable relationship to the subject's utility function.

In response to this problem, we introduce a mixture modeling framework that clusters subjects based on variances in a choice-based setting. Our proposed model naturally groups subjects by the internal consistency of their answers; we argue that a higher level of internal consistency (hence lower variance) reflects more engaged consumers who have sufficient experience with the product category and choice task. Gremlins, on the other hand, occur when a cluster of respondents behaves such that the noise in their responses overwhelms any signal, leading to a lack of predictive power for these respondents. Our approach provides an automated way of determining which respondents are relevant. We discuss the conceptual and modeling framework and illustrate the method using both simulated data and data from several commercial conjoint studies.
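The idea of clustering respondents by response variance can be illustrated with a minimal sketch. It is not the authors' model: it assumes the part-worths `beta` are known and shared by all respondents, fixes two hypothetical scale values (a high scale for engaged respondents, a near-zero scale for gremlins), and uses equal prior weights, whereas the paper estimates a full mixture model. All function names and parameter values below are assumptions made for illustration. In a multinomial logit, the scale multiplies utility and is inversely related to the error variance, so a scale near zero yields almost uniform choice probabilities.

```python
import numpy as np

def mnl_probs(X, beta, scale):
    """Multinomial logit choice probabilities with a scale factor.

    X has shape (tasks, alternatives, attributes). A scale near 0
    corresponds to high error variance, i.e. near-random choices.
    """
    v = scale * (X @ beta)                # utilities, shape (tasks, alternatives)
    v = v - v.max(axis=1, keepdims=True)  # subtract row max for numerical stability
    e = np.exp(v)
    return e / e.sum(axis=1, keepdims=True)

def simulate_choices(rng, X, beta, scale):
    """Draw one chosen alternative per task from the logit probabilities."""
    p = mnl_probs(X, beta, scale)
    return np.array([rng.choice(p.shape[1], p=row) for row in p])

def log_lik(X, y, beta, scale):
    """Log-likelihood of one respondent's choices y at a given scale."""
    p = mnl_probs(X, beta, scale)
    return np.log(p[np.arange(len(y)), y]).sum()

def posterior_engaged(X, y, beta, scales=(1.0, 0.05), weights=(0.5, 0.5)):
    """Posterior probability that a respondent belongs to the high-scale
    (low-variance) component of a two-component mixture.

    scales and weights are illustrative assumptions, not estimates.
    """
    ll = np.array([log_lik(X, y, beta, s) for s in scales])
    log_post = ll + np.log(weights)
    log_post = log_post - log_post.max()  # stabilise before exponentiating
    post = np.exp(log_post)
    return post[0] / post.sum()

# Hypothetical design: 30 choice tasks, 3 alternatives, 4 attributes.
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 3, 4))
beta = np.array([1.5, -1.0, 0.5, 2.0])

y_engaged = simulate_choices(rng, X, beta, scale=1.0)
y_gremlin = simulate_choices(rng, X, beta, scale=0.05)

print("P(engaged | consistent respondent):", posterior_engaged(X, y_engaged, beta))
print("P(engaged | noisy respondent):     ", posterior_engaged(X, y_gremlin, beta))
```

In this sketch a respondent whose choices track the utilities is assigned to the low-variance component, while a respondent whose choices are unrelated to the utilities is assigned to the high-variance one. Fixing `beta` keeps the example short; in practice the part-worths, scales, and mixture weights would have to be estimated jointly.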

Keywords: Multinomial Logit, Conjoint Analysis, Data Quality, Finite Mixture Models

Suggested Citation

Howell, John R. and Ebbes, Peter and Liechty, John and Jenkins, Porter, Gremlins in the Data: Identifying the Information Content of Research Subjects (August 2019). HEC Paris Research Paper No. MKG-2017-1223, Available at SSRN: https://ssrn.com/abstract=3018425 or http://dx.doi.org/10.2139/ssrn.3018425

John R. Howell

Brigham Young University - Marriott School of Business

Provo, UT 84602
United States

Peter Ebbes (Contact Author)

HEC Paris - Marketing

Paris
France

John Liechty

Pennsylvania State University, University Park

University Park
State College, PA 16802
United States

Porter Jenkins

Pennsylvania State University, Smeal College of Business, Students

University Park, PA
United States
