Guided Diverse Concept Miner (GDCM): Uncovering Relevant Constructs for Managerial Insights From Text

Information Systems Research

57 Pages. Posted: 21 Dec 2018. Last revised: 3 Apr 2024

Dokyun Lee

Boston University - Questrom School of Business

Zhaoqi Cheng

Boston University - Questrom School of Business

Chengfeng Mao

Massachusetts Institute of Technology (MIT)

Emaad Manzoor

Cornell University, Ithaca, New York

Date Written: May 20, 2018

Abstract

Guided Diverse Concept Miner (GDCM) is an interpretable deep learning algorithm that (1) automatically extracts corpus-level concepts from text data, (2) guides concept discovery so that only concepts highly correlated with a user-specified managerial outcome are retained, and (3) quantifies each concept’s correlational importance to that outcome. GDCM is used to explore and potentially extract previously unknown concepts and insights from text that may explain the managerial outcome, without requiring human-predefined guidance or labeled data on concepts. GDCM embeds words, documents, and concepts in the same vector space, so a discovered concept can be interpreted through the words nearest to its concept vector. GDCM is explicitly configured to increase the diversity and coherence of recovered concepts and their relevance to managerial outcomes.
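
To illustrate how a shared vector space supports interpretation, the following is a minimal sketch, not the authors' implementation. It assumes that words and concepts are already embedded as vectors and that "words local to the concept vector" can be approximated by cosine similarity. The embeddings, the names `word_vecs` and `concept_vecs`, and the helper `nearest_words` are all hypothetical and made up for the example.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nearest_words(concept_vec, word_vecs, k=2):
    """Return the k words whose vectors are most similar to concept_vec."""
    ranked = sorted(
        word_vecs,
        key=lambda w: cosine(concept_vec, word_vecs[w]),
        reverse=True,
    )
    return ranked[:k]


# Toy 3-dimensional embeddings (assumed values, not from the paper).
word_vecs = {
    "durable": [0.9, 0.1, 0.0],
    "sturdy": [0.8, 0.2, 0.1],
    "cheap": [0.1, 0.9, 0.0],
    "bargain": [0.2, 0.8, 0.1],
}

# Hypothetical concept vectors living in the same space as the words.
concept_vecs = {
    "concept_0": [1.0, 0.0, 0.0],
    "concept_1": [0.0, 1.0, 0.0],
}

# Describe each concept by its nearest words.
labels = {name: nearest_words(vec, word_vecs) for name, vec in concept_vecs.items()}
for name, words in labels.items():
    print(name, words)
```

In this toy setup, each concept is described by the words closest to it, which is the sense in which a shared embedding space makes concepts readable. How GDCM actually learns the vectors and guides them toward the outcome is described in the paper itself and is not reproduced here.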

We demonstrate GDCM as a “guided exploratory” tool in a hypothetical managerial case involving online purchase journey data linked to consumed reviews. GDCM scalably extracts concepts hidden in customer reviews that are highly correlated with conversion and reports each concept’s importance relative to product ratings. The concepts produced turn out to be product qualities that the literature has previously theorized to affect conversion, and the correlational importance gauged by GDCM closely matches estimates from a previous causal study run on a similar dataset; both serve as external validation of GDCM as a “guided exploratory” tool. Additional experiments with other data show that the extracted insights are sensitive to the guiding managerial variables, and sensibly so, further demonstrating the flexibility of GDCM as a managerial tool.

Keywords: Text, Managerial Insight Extraction, Guided Exploration, Concept Extraction, Deep Learning, Interpretable Machine Learning

JEL Classification: C38, C39, M31, M39

Suggested Citation

Lee, Dokyun and Cheng, Zhaoqi and Mao, Chengfeng and Manzoor, Emaad, Guided Diverse Concept Miner (GDCM): Uncovering Relevant Constructs for Managerial Insights From Text (May 20, 2018). Information Systems Research, Available at SSRN: https://ssrn.com/abstract=3304756 or http://dx.doi.org/10.2139/ssrn.3304756

Dokyun Lee (Contact Author)

Boston University - Questrom School of Business

595 Commonwealth Avenue
Boston, MA 02215
United States

Zhaoqi Cheng

Boston University - Questrom School of Business

595 Commonwealth Avenue
Boston, MA 02215
United States

Chengfeng Mao

Massachusetts Institute of Technology (MIT)

77 Massachusetts Avenue
50 Memorial Drive
Cambridge, MA 02139-4307
United States

Emaad Manzoor

Cornell University, Ithaca, New York

New York
United States
