Strategies of Validation: Assessing the Varieties of Democracy Corruption Data
45 Pages. Posted: 4 Feb 2016. Last revised: 24 May 2016
Date Written: February 1, 2016
Social scientists face the challenge of determining whether their data are valid, yet they lack practical guidance about how to do so. Existing publications on data validation provide mostly abstract information for creating one’s own dataset or establishing that an existing one is adequate. Further, they tend to pit validation techniques against each other, rather than explain how to combine multiple approaches. By contrast, this paper provides a practical guide to data validation in which tools are used in a complementary fashion to identify the strengths and weaknesses of a dataset and thus reveal how it can most effectively be used. We advocate for three approaches, each incorporating multiple tools: 1) assessing content validity through an examination of the resonance, domain, differentiation, fecundity, and consistency of the measure; 2) evaluating data generation validity through an investigation of dataset management structure, data sources, coding procedures, aggregation methods, and geographic and temporal coverage; and 3) assessing convergent validity using case studies and empirical comparisons among coders and among measures. We apply our method to corruption measures from a new dataset, Varieties of Democracy. We show that the data are generally valid and we emphasize that a particular strength of the dataset is its capacity for analysis across countries and over time. These corruption measures represent a significant contribution to the field because, although research questions have focused on geographic differences and temporal trends, other corruption datasets have not been designed for this type of analysis.