A Crowd Content Analysis Assembly Line: Scaling Up Hand-Coding with Text Units of Analysis
66 Pages. Posted: 13 Jul 2016. Last revised: 1 Jul 2017.
Date Written: July 12, 2016
Manual content analysis, also known as hand-coding or annotation, is often the only way to reliably identify important social phenomena in textual data. However, it is extremely time-consuming, often requiring large teams of trained undergraduate research assistants working over several years. This article presents a new approach to manual content analysis, with accompanying software, that enables researchers to enlist untrained citizen scientists and crowd workers to label text over the internet. After first describing the dilemma of large-scale text analysis, the article explains how hand-coding for text units of analysis (TUAs) allows manual text analysis projects to be decomposed into micro-tasks fit for untrained crowd workers. The approach and software are explained in detail. The article then compares the new approach to similar fully manual and fully automated approaches, finding that it is less costly and four times faster than traditional manual content analysis, while producing richer and more transparent data than other approaches. Next, it outlines a number of projects that might benefit from crowd content analysis, demonstrating the approach's generalizability to an array of social science fields. Finally, it closes with a discussion of the specific limitations and general promise of this new approach.
Keywords: text analysis, natural language processing, crowd work, crowdsourcing, qualitative, quantitative, annotation