Conditional Topic Allocations for Open-Ended Survey Responses

49 Pages. Posted: 22 Aug 2022. Last revised: 3 Apr 2024.

Tobias Wekhof

ETH Zürich - CER-ETH - Center of Economic Research at ETH Zurich

Date Written: April 1, 2024

Abstract

Researchers in the social sciences are increasingly using surveys that require written responses from participants. Because samples are small and answers are short, it is challenging to identify topics in the responses with Natural Language Processing (NLP). However, surveys also collect additional observable variables about respondents, which can help analyze the text. Here we introduce a data-driven method called "Conditional Topic Allocation" (CTA) for identifying latent topics in text data by conditioning on observables. Researchers can use CTA to extract topics from open-ended text answers that explain observable variables. CTA proves particularly valuable when analyzing small-scale text data, such as open-ended survey responses. We apply this new approach to two survey experiments and one classical survey and identify topics by conditioning the responses on either priming treatments or political affiliation.

Keywords: topic model, natural language processing, open-ended questions, surveys

JEL Classification: C83, C90

Suggested Citation

Wekhof, Tobias, Conditional Topic Allocations for Open-Ended Survey Responses (April 1, 2024). Available at SSRN: https://ssrn.com/abstract=4190308 or http://dx.doi.org/10.2139/ssrn.4190308

Tobias Wekhof (Contact Author)

ETH Zürich - CER-ETH - Center of Economic Research at ETH Zurich

Zürichbergstrasse 18
Zurich, 8092
Switzerland
+41 44 633 80 78 (Phone)

HOME PAGE: http://sites.google.com/view/tobiaswekhof
