A Conditional Gradient Approach for Nonparametric Estimation of Mixing Distributions
65 Pages. Posted: 1 Sep 2017. Last revised: 15 Dec 2020.
Date Written: October 20, 2018
Mixture models are versatile tools that are used extensively in many fields, including operations, marketing, and econometrics. The main challenge in estimating mixture models is that the mixing distribution is often unknown, and imposing a priori parametric assumptions can lead to model misspecification. In this paper, we propose a new methodology for nonparametric estimation of the mixing distribution of a mixture of logit models. We formulate the likelihood-based estimation problem as a constrained convex program and solve it with the conditional gradient (a.k.a. Frank-Wolfe) algorithm. We show that our method iteratively generates both the support of the mixing distribution and the mixing proportions. Theoretically, we establish a sublinear convergence rate for our estimator and characterize the structure of the recovered mixing distribution. Empirically, we test our approach on real-world datasets. We show that it outperforms the standard expectation-maximization (EM) benchmark on speed (16x faster), in-sample fit (up to 24% reduction in the log-likelihood loss), predictive accuracy (average 27% reduction in standard error metrics), and decision accuracy (extracts around 23% more revenue). On synthetic data, we show that our estimator is robust to different ground-truth mixing distributions and can also account for endogeneity.
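To make the idea concrete, the following is a minimal illustrative sketch of a conditional gradient (Frank-Wolfe) loop for a mixture of logits. It is not the paper's implementation. It assumes a single offer set with product features `X` and aggregate choice counts `y`, restricts the search for each new support point to a finite grid of candidate taste vectors (a simplification of the subproblem), and picks the step size by a coarse grid line search. All names (`frank_wolfe_mixture`, `candidates`, `step_grid`) are hypothetical.

```python
import numpy as np

def softmax(u):
    # Numerically stable logit choice probabilities for utilities u.
    e = np.exp(u - u.max())
    return e / e.sum()

def nll(g, y):
    # Average negative log-likelihood of counts y under choice probabilities g.
    return -np.sum(y * np.log(g)) / y.sum()

def frank_wolfe_mixture(X, y, candidates, n_iters=30):
    # Assumption: the support search is limited to the rows of `candidates`.
    F = np.array([softmax(X @ b) for b in candidates])  # one logit type per row
    weights = np.zeros(len(candidates))
    weights[0] = 1.0                 # start from a single-type model
    g = F[0].copy()                  # current mixture choice probabilities
    losses = [nll(g, y)]
    step_grid = np.linspace(0.0, 1.0, 101)
    for _ in range(n_iters):
        grad = -(y / y.sum()) / g    # gradient of the loss with respect to g
        # Linear subproblem: the logit type best aligned with the descent direction.
        j = int(np.argmin(F @ grad))
        # Grid line search; step 0 is included, so the loss never increases.
        trial = [nll((1.0 - t) * g + t * F[j], y) for t in step_grid]
        gamma = step_grid[int(np.argmin(trial))]
        # Shrink existing mixing proportions and give mass gamma to type j.
        weights *= 1.0 - gamma
        weights[j] += gamma
        g = (1.0 - gamma) * g + gamma * F[j]
        losses.append(nll(g, y))
    return weights, g, losses

# Synthetic example: two latent types, five products, two features.
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 2))
g_true = (0.6 * softmax(X @ np.array([1.0, -1.0]))
          + 0.4 * softmax(X @ np.array([-1.5, 0.5])))
y = rng.multinomial(1000, g_true)
grid = np.linspace(-2.0, 2.0, 9)
candidates = np.array([[a, b] for a in grid for b in grid])
weights, g, losses = frank_wolfe_mixture(X, y, candidates)
```

Each iteration either adds one candidate type to the support or reweights an existing one, which mirrors the way the method is described as building the support and the mixing proportions together. The nonzero entries of `weights` give the recovered mixing proportions over the candidate grid.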
Keywords: nonparametric estimation, mixtures, conditional gradient, consideration sets