An Automated Snowball Census of the Political Web

34 Pages. Posted: 9 May 2011. Last revised: 19 Aug 2014.

Abe Gong

University of Michigan at Ann Arbor - Gerald R. Ford School of Public Policy

Date Written: August 9, 2011

Abstract

This paper solves a persistent methodological problem for social scientists studying the political web: representative sampling. Virtually all existing studies of the political web are based on incomplete samples and therefore lack generalizability. In this paper, I combine methods from computer science and sampling theory to conduct an automated snowball census of the political web and construct an all-but-complete index of English-language political websites. I check the robustness of this index, use it to generate descriptive statistics for the entire political web, and demonstrate that studies based on ad hoc sampling strategies are likely to be biased in important ways. Future research can eliminate this bias by using the index as a sampling universe. In addition, the methods and open-source software presented here can be used to create similar sampling frames for other online content domains.
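As a rough illustration of the general idea behind a snowball census, the sketch below crawls outward from seed sites, follows links only from sites that a classifier labels as in-domain, and stops when no new sites turn up. It is not the paper's software: the function names, the toy link graph, and the string-prefix "classifier" are all hypothetical stand-ins chosen to keep the example self-contained.

```python
from collections import deque
from typing import Callable, Dict, Iterable, List

def snowball_census(
    seeds: Iterable[str],
    get_links: Callable[[str], Iterable[str]],
    is_political: Callable[[str], bool],
) -> List[str]:
    """Breadth-first snowball: expand only from sites classified as in-domain.

    All three parameters are hypothetical; a real study would plug in a web
    crawler for `get_links` and a trained text classifier for `is_political`.
    """
    queue = deque(dict.fromkeys(seeds))  # de-duplicate seeds, keep their order
    seen = set(queue)
    index: List[str] = []
    while queue:
        site = queue.popleft()
        if not is_political(site):
            continue  # out-of-domain sites are dropped and not expanded
        index.append(site)
        for neighbor in get_links(site):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return index

# Toy link graph (invented for illustration only).
TOY_LINKS: Dict[str, List[str]] = {
    "blog-a": ["blog-b", "shop-x"],
    "blog-b": ["blog-c", "blog-a"],
    "blog-c": [],
    "shop-x": ["blog-d"],
    "blog-d": [],
}

def toy_get_links(site: str) -> List[str]:
    return TOY_LINKS.get(site, [])

def toy_is_political(site: str) -> bool:
    # Stand-in for a text classifier: "blog-*" counts as political here.
    return site.startswith("blog-")

result = snowball_census(["blog-a"], toy_get_links, toy_is_political)
print(result)
```

In this toy graph, `blog-d` is never indexed because it is reachable only through an out-of-domain site. That shows why coverage in a procedure like this depends on the seed set and on link structure, and why checking the robustness of the resulting index matters.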

Keywords: sampling theory, web mining, text classification, computational social science

Suggested Citation

Gong, Abe, An Automated Snowball Census of the Political Web (August 9, 2011). Available at SSRN: https://ssrn.com/abstract=1832024 or http://dx.doi.org/10.2139/ssrn.1832024

Abe Gong (Contact Author)

University of Michigan at Ann Arbor - Gerald R. Ford School of Public Policy

735 South State Street, Weill Hall
Ann Arbor, MI 48109
United States
