Confidence Set for Group Membership
102 Pages. Posted: 4 Mar 2018. Last revised: 13 Dec 2018
Date Written: December 12, 2018
We develop new procedures to quantify the statistical uncertainty of data-driven clustering algorithms. In our panel setting, each unit belongs to one of a finite number of latent groups with group-specific regression curves. We propose methods for computing unit-wise and joint confidence sets for group membership. The unit-wise sets give the possible group memberships for a given unit, and the joint sets give the possible vectors of group memberships for all units. We also propose an algorithm that can improve the power of our procedures by detecting units that are easy to classify. The confidence sets are obtained by inverting a test for group membership that is based on a characterization of the true group memberships by a system of moment inequalities. To construct the joint confidence sets, we solve a high-dimensional testing problem that tests group membership simultaneously for all units. We justify this procedure under N, T → ∞ asymptotics where we allow T to be much smaller than N. As part of our theoretical arguments, we develop new simultaneous anti-concentration inequalities for the MAX and the QLR statistics. Monte Carlo results indicate that our confidence sets have adequate coverage and are informative. We illustrate the practical relevance of our confidence sets in two applications.
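The test-inversion idea behind the unit-wise sets can be illustrated with a minimal sketch. This is not the paper's procedure: it assumes group-specific means instead of regression curves, treats the group centers as known rather than estimated, and uses a Bonferroni-adjusted normal critical value. The function name `unitwise_confidence_set` and all parameter names are hypothetical. For a candidate group g, membership implies that the expected squared-loss difference against every other group h is non-positive; the sketch computes a self-normalized statistic for each such inequality, takes the maximum, and keeps g if the maximum does not exceed the critical value.

```python
import math
from statistics import NormalDist, mean, stdev


def unitwise_confidence_set(y, centers, alpha=0.05):
    """Return the candidate groups that a one-sided MAX test does not reject.

    y       : list of T outcomes for a single unit
    centers : list of G group-specific means (assumed known in this sketch)
    alpha   : nominal level of the test
    """
    T, G = len(y), len(centers)
    if G == 1:
        return [0]  # only one possible group, nothing to test
    # Assumption: Bonferroni-adjusted normal critical value over the
    # G - 1 moment inequalities tested for each candidate group.
    crit = NormalDist().inv_cdf(1 - alpha / (G - 1))
    kept = []
    for g in range(G):
        max_stat = -math.inf
        for h in range(G):
            if h == g:
                continue
            # Under membership in g, the mean of d is non-positive.
            d = [(v - centers[g]) ** 2 - (v - centers[h]) ** 2 for v in y]
            m, s = mean(d), stdev(d)
            if s == 0:
                # Degenerate case: no variation in the loss differences.
                stat = 0.0 if m == 0 else math.copysign(math.inf, m)
            else:
                stat = math.sqrt(T) * m / s  # self-normalized statistic
            max_stat = max(max_stat, stat)
        if max_stat <= crit:
            kept.append(g)  # candidate g is not rejected
    return kept


# Usage: a unit whose outcomes sit close to the first of two centers.
outcomes = [0.1, -0.2, 0.05, -0.1, 0.15, 0.0, -0.05, 0.2]
print(unitwise_confidence_set(outcomes, [0.0, 5.0]))
```

A set with a single element means the unit is classified unambiguously at the chosen level, while a larger set signals that the data cannot separate the candidate groups for that unit.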
Keywords: Panel Data, Grouped Heterogeneity, Clustering, Confidence Set, Machine Learning, Moment Inequalities, Joint One-Sided Tests, Self-Normalized Sums, High-Dimensional CLT, Anti-Concentration for QLR
JEL Classification: C23, C33, C38