Focused Concept Miner (FCM): Interpretable Deep Learning for Text Exploration

52 Pages. Posted: 21 Dec 2018. Last revised: 29 Jun 2022

Dokyun Lee

Boston University - Questrom School of Business

Emaad Manzoor

Carnegie Mellon University, Students

Zhaoqi Cheng

Boston University - Questrom School of Business

Date Written: May 20, 2018

Abstract

Focused Concept Miner (FCM) is an interpretable deep learning algorithm for text mining that (1) automatically extracts coherent corpus-level concepts from text data, (2) focuses concept discovery so that only concepts highly correlated with a user-specified outcome are retained, and (3) quantifies the correlational importance of each concept to that outcome. FCM is used to explore text and potentially extract a priori unknown concepts that may explain a business outcome, without requiring any human-predefined labels on concepts. FCM is explicitly configured to increase the diversity, relevance, and sparsity of the recovered concepts, which enhances concept coherence and corpus-level insights. FCM embeds words, documents, and concepts in the same vector space, so a discovered concept can be interpreted by examining the words closest to its concept vector.
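
The interpretation step described in the abstract (reading a concept through the words nearest its vector in the shared embedding space) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `cosine` and `nearest_words`, the use of cosine similarity as the closeness measure, and the toy 3-dimensional embeddings are all assumptions made for illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def nearest_words(concept_vec, word_vecs, k=3):
    """Return the k words whose embeddings are most similar to concept_vec."""
    ranked = sorted(
        word_vecs,
        key=lambda w: cosine(concept_vec, word_vecs[w]),
        reverse=True,
    )
    return ranked[:k]

# Toy embeddings invented for illustration only (hypothetical, not from the paper).
word_vecs = {
    "durable": [0.9, 0.1, 0.0],
    "sturdy": [0.8, 0.2, 0.1],
    "cheap": [0.1, 0.9, 0.0],
    "price": [0.0, 1.0, 0.1],
    "shipping": [0.1, 0.1, 0.9],
}

# Hypothetical concept vector lying near the "quality"-like words.
concept_vec = [1.0, 0.1, 0.0]

top_words = nearest_words(concept_vec, word_vecs, k=2)
print(top_words)
```

In this toy setup the concept would be read through the words "durable" and "sturdy"; in a trained model the word and concept vectors would be learned jointly rather than written by hand.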

We evaluate FCM on a dataset of online purchase journeys that records the reviews read by each consumer. Compared with 6 interpretable baselines, FCM attains higher interpretability, as quantified by 2 human-judged metrics and 1 automated metric, and higher recall of unique concepts, as supported by several experiments. In addition, when focused by an outcome variable (conversion), FCM extracted constructs relating to product quality that the literature theorizes to affect conversion, without being explicitly trained to do so. FCM also achieves superior predictive performance compared with 6 interpretable benchmarks while maintaining superior or competitive predictive performance compared with prediction-focused black-box classifiers. In further experiments, we evaluate FCM on text data from online newsgroups and a crowdfunding platform, investigate the impact of focusing on concept discovery, and study the interpretability-accuracy trade-off. We present FCM as a complementary technique for exploring and understanding text data before applying standard causal inference techniques. We conclude by discussing managerial implications, potential business applications, limitations, and ideas for future development.

Keywords: Interpretable Machine Learning, Deep Learning, Text Mining, Automatic Concept Extraction, Coherence, Transparent Algorithm, Augmented Hypothesis Development, XAI

JEL Classification: C38, C39, M31, M39

Suggested Citation

Lee, Dokyun and Manzoor, Emaad and Cheng, Zhaoqi, Focused Concept Miner (FCM): Interpretable Deep Learning for Text Exploration (May 20, 2018). Available at SSRN: https://ssrn.com/abstract=3304756 or http://dx.doi.org/10.2139/ssrn.3304756

Dokyun Lee (Contact Author)

Boston University - Questrom School of Business

595 Commonwealth Avenue
Boston, MA 02215
United States

Emaad Manzoor

Carnegie Mellon University, Students

Pittsburgh, PA
United States

Zhaoqi Cheng

Boston University - Questrom School of Business

595 Commonwealth Avenue
Boston, MA 02215
United States
