64 Pages. Posted: 4 Nov 2007. Last revised: 19 Aug 2008
This article discusses methods for automatic annotation of political texts for semantic fields, that is, groups of words with related meanings. This type of annotation is useful when studying the structure of political ideology, from agendas to rhetoric. We argue that it is important to exclude the researcher from the construction of analytic categories, due to her own susceptibility to the very phenomenon under analysis. In this vein, three types of automatic annotation are presented: unsupervised clustering, dictionary-based approaches, and a method based on relevant experimental data. All methods are applied to analyzing Margaret Thatcher's political rhetoric. We find that unsupervised clustering is most useful for tracing agendas/topics; dictionary-based methods are most effective in a comparative setting; and the last method is the most promising for detecting off-topic, singular uses of semantic domains, which are often rhetorical tools used to achieve a political end. Validity, applicability, strengths and weaknesses of each method and of their combinations are addressed in detail.
Keywords: ideology, speech, rhetoric, topic, framing, clustering
Suggested Citation:
Beigman Klebanov, Beata and Diermeier, Daniel and Beigman, Eyal, Automatic Annotation of Semantic Fields for Political Science Research. Available at SSRN: https://ssrn.com/abstract=1026961 or http://dx.doi.org/10.2139/ssrn.1026961