Focused Concept Miner (FCM): Interpretable Deep Learning for Text Exploration

45 Pages. Posted: 21 Dec 2018. Last revised: 23 Sep 2020.

Dokyun Lee

Carnegie Mellon University - David A. Tepper School of Business

Emaad Manzoor

Carnegie Mellon University, Students

Zhaoqi Cheng

Carnegie Mellon University - David A. Tepper School of Business

Date Written: May 20, 2018

Abstract

We introduce the Focused Concept Miner (FCM), an interpretable deep learning text mining algorithm that (1) automatically extracts coherent corpus-level concepts from text data, (2) focuses the discovery of concepts so that they are highly correlated with a user-specified outcome, and (3) quantifies each concept's correlational importance to that outcome. FCM is used to explore text and potentially extract a priori unknown concepts that may explain business outcomes. FCM is a custom neural network model explicitly configured to increase corpus-level insights and recovered-concept diversity without the need to provide any training data.

We evaluate FCM using a dataset of online purchases containing the reviews read by each consumer. Compared to 4 interpretable baselines, FCM attains higher interpretability as quantified by 2 human-judged metrics and 1 automated metric, and higher recall of unique concepts as supported by several experiments. In addition, FCM extracted constructs relating to product quality that the literature theorizes to impact conversion, without being explicitly trained to do so. FCM also achieves superior predictive performance compared to 4 interpretable benchmarks while maintaining superior or competitive predictive performance compared to 8 black-box classifiers. In further experiments, we evaluate FCM on text data from online newsgroups and a crowdfunding platform, investigate the impact of focusing on concept discovery, and study the interpretability-accuracy trade-off. We present FCM as a complementary technique to explore and understand text data before applying standard causal inference techniques. We conclude by discussing managerial implications, potential business applications, limitations, and ideas for future development.

Keywords: Interpretable Machine Learning, Deep Learning, Text Mining, Automatic Concept Extraction, Coherence, Transparent Algorithm, Augmented Hypothesis Development, XAI

JEL Classification: C38, C39, M31, M39

Suggested Citation

Lee, Dokyun and Manzoor, Emaad and Cheng, Zhaoqi, Focused Concept Miner (FCM): Interpretable Deep Learning for Text Exploration (May 20, 2018). Available at SSRN: https://ssrn.com/abstract=3304756 or http://dx.doi.org/10.2139/ssrn.3304756

Dokyun Lee (Contact Author)

Carnegie Mellon University - David A. Tepper School of Business

5000 Forbes Avenue
Pittsburgh, PA 15213-3890
United States

Emaad Manzoor

Carnegie Mellon University, Students

Pittsburgh, PA
United States

Zhaoqi Cheng

Carnegie Mellon University - David A. Tepper School of Business

Pittsburgh, PA
United States

Paper statistics

Downloads: 4,219
Abstract Views: 16,399
Rank: 2,382