Visualizing The Implicit Model Selection Tradeoff
49 Pages · Posted: 25 Oct 2021 · Last revised: 9 Mar 2022
Date Written: October 21, 2021
Abstract
The recent rise of Machine Learning (ML) has been leveraged by operations management researchers to provide new solutions to operational problems. As with other ML applications, these solutions rely on model selection, which is typically done by evaluating each candidate model separately on certain metrics (accuracy-related loss and/or interpretability measures) and selecting the model whose evaluations are optimal. However, empirical evidence suggests that, in practice, multiple models often attain competitive results. Thus, while the models' overall performance may be similar, they may operate quite differently. This creates an implicit tradeoff in the models' performance across the feature space, and resolving it requires new model selection tools.
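The phenomenon described above can be observed with a few lines of off-the-shelf code. The sketch below (purely illustrative, not taken from the paper) fits two standard classifiers on a synthetic dataset; they typically reach comparable test accuracy yet disagree on a nontrivial fraction of individual predictions.

```python
# Illustrative sketch (not the paper's code): two models with similar
# overall accuracy can still disagree on many individual predictions.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=20, n_informative=5,
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.5, random_state=0)

model_a = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
model_b = GradientBoostingClassifier(random_state=0).fit(X_tr, y_tr)

pred_a, pred_b = model_a.predict(X_te), model_b.predict(X_te)
print("accuracy A:", accuracy_score(y_te, pred_a))
print("accuracy B:", accuracy_score(y_te, pred_b))
# Fraction of test points on which the two "competitive" models differ:
print("disagreement rate:", (pred_a != pred_b).mean())
```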
This paper studies methods for comparing predictive models in an interpretable manner in order to expose this tradeoff, which characterizes model selection problems. To this end, we propose various methods that synthesize ideas from supervised learning, unsupervised learning, dimensionality reduction, and visualization, and demonstrate how they can be used to inform the model selection process. Using various datasets and a simple Python interface that we developed, we show how practitioners and researchers could benefit from applying these approaches to better understand the broader impact of their model selection choices.
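As a rough illustration of how dimensionality reduction and visualization can expose such a tradeoff, the self-contained sketch below embeds a test set in two dimensions with PCA and colors points by whether two similarly accurate models disagree. This is an assumption-laden toy example, not the authors' Python interface or proposed methods.

```python
# Illustrative sketch only; the authors' Python interface is not reproduced here.
# General idea: embed the data in 2D (here via PCA) and color points by whether
# two similarly accurate models disagree, exposing *where* the tradeoff lives.
import matplotlib.pyplot as plt
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=20, n_informative=5,
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.5, random_state=0)

pred_a = LogisticRegression(max_iter=1000).fit(X_tr, y_tr).predict(X_te)
pred_b = GradientBoostingClassifier(random_state=0).fit(X_tr, y_tr).predict(X_te)
disagree = pred_a != pred_b  # points the two models resolve differently

Z = PCA(n_components=2).fit_transform(X_te)  # 2D embedding of the test features
plt.scatter(Z[~disagree, 0], Z[~disagree, 1], c="lightgray", s=10, label="models agree")
plt.scatter(Z[disagree, 0], Z[disagree, 1], c="crimson", s=10, label="models disagree")
plt.legend()
plt.title("Regions where two competitive models diverge (PCA projection)")
plt.show()
```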
Keywords: Model selection, Interpretability, Machine learning, Visualization