Validating Self-Reported Turnout by Linking Public Opinion Surveys with Administrative Records
44 Pages. Posted: 16 Aug 2018. Last revised: 18 Aug 2018
Date Written: July 22, 2018
Although it is widely known that the self-reported turnout rates obtained from public opinion surveys tend to substantially over-estimate the actual turnout rates, scholars sharply disagree on what causes this bias. Some blame misreporting due to social desirability, while others attribute it to non-response bias and the accuracy of turnout validation. While we can validate self-reported turnout by directly linking surveys with administrative records, most existing studies rely on proprietary merging algorithms with limited scientific transparency and yield conflicting results. To shed light on this debate, we apply a canonical probabilistic record linkage model, implemented via the open-source software package fastLink, to merge two major election studies, the American National Election Studies and the Cooperative Congressional Election Survey, with a national voter file of over 180 million records. For both studies, fastLink successfully produces validated turnout rates close to the actual turnout rates, leading to public-use validated turnout data for the two studies. Using these merged data sets, we find that the bias of self-reported turnout originates primarily from misreporting rather than non-response. Our findings suggest that those who are educated and interested in politics are more likely to over-report turnout. Finally, we show that fastLink performs as well as a proprietary algorithm.
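To give a rough sense of what a probabilistic record linkage model computes, the following Python sketch scores a pair of records with a Fellegi-Sunter style posterior match probability. It is an illustration only and not fastLink's implementation: the field names, the agreement probabilities `M` and `U`, the prior `PRIOR`, and the example records are all hypothetical, and a real analysis would estimate these quantities from the data and allow for partial string agreement rather than fixing them by hand.

```python
from math import prod

def agreement_pattern(rec_a, rec_b, fields):
    """Return a tuple of 1/0 flags: does each field agree exactly?"""
    return tuple(int(rec_a[f] == rec_b[f]) for f in fields)

def match_probability(pattern, m, u, prior):
    """Posterior probability that a pair is a true match.

    m[k]  = P(field k agrees | pair is a match)
    u[k]  = P(field k agrees | pair is a non-match)
    prior = P(pair is a match) before comparing any field
    Fields are assumed conditionally independent given match status.
    """
    like_match = prod(mk if g else 1 - mk for g, mk in zip(pattern, m))
    like_non = prod(uk if g else 1 - uk for g, uk in zip(pattern, u))
    numerator = prior * like_match
    return numerator / (numerator + (1 - prior) * like_non)

# Hypothetical linkage fields and parameter values (assumptions, not estimates).
FIELDS = ("first_name", "last_name", "birth_year", "zip")
M = (0.95, 0.95, 0.98, 0.90)
U = (0.01, 0.005, 0.02, 0.001)
PRIOR = 1e-4

# Hypothetical survey respondent and two candidate voter-file records.
survey = {"first_name": "maria", "last_name": "lopez",
          "birth_year": 1975, "zip": "08540"}
voter_same = {"first_name": "maria", "last_name": "lopez",
              "birth_year": 1975, "zip": "08540"}
voter_other = {"first_name": "john", "last_name": "smith",
               "birth_year": 1975, "zip": "10001"}

p_same = match_probability(
    agreement_pattern(survey, voter_same, FIELDS), M, U, PRIOR)
p_other = match_probability(
    agreement_pattern(survey, voter_other, FIELDS), M, U, PRIOR)
```

Under these assumed values, the pair that agrees on every field receives a posterior probability near one, while the pair that agrees only on birth year receives a probability near zero. A linked data set would then keep pairs whose probability exceeds a chosen threshold and attach the administrative turnout record to the survey respondent.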