Advancing Qualitative Analysis: An Exploration of the Potential of Generative AI and NLP in Thematic Coding

54 Pages. Posted: 26 Jun 2023

Yasir Gamieldien

Virginia Tech - College of Engineering

Jennifer M. Case

Virginia Tech University

Andrew Katz

Virginia Tech - College of Engineering

Date Written: June 21, 2023

Abstract

Background: Traditional manual coding in qualitative data analysis can be labor-intensive and time-consuming, especially with large data sets. This research investigates the potential use of natural language processing (NLP) techniques and large language models (LLMs), such as GPT-3.5, to enhance efficiency and depth of insights during the qualitative data coding process.

Method: We compared traditional manual thematic analysis with two NLP-assisted approaches, NLP Cluster Assisted (NLPCA) and NLP with GPT-3.5 (NLPGPT), using a dataset of 3,800 student responses on “exam wrappers” from an engineering physics course. Exam wrappers are structured reflection activities that prompt students to practice self-reflection after they get their graded exams back. Agreement between the methods was evaluated based on the similarity of the generated codes.

Results: Both NLPCA and NLPGPT effectively identified similar themes in the student responses, demonstrating a promising alternative to traditional qualitative coding. Notably, the GPT-3.5 model exhibited strength in producing highly granular codes, which could offer deeper and more nuanced insights.

Discussion: The results of the study underscore the significant benefits of integrating NLP and LLMs into qualitative research. While the study identified challenges such as biases in language models, overfitting in the form of overly granular codes, and resource constraints, the findings suggest these hurdles can be addressed with further research and refinement of the methodology. The application of NLP and LLMs across other research contexts still needs validation, which sets a promising direction for future studies. This research marks an important stepping stone in enhancing traditional qualitative research with AI technology, paving the way for more scalable, robust, and efficient research methodologies.

Keywords: natural language processing, qualitative analysis, ChatGPT, large language models

Suggested Citation

Gamieldien, Yasir and Case, Jennifer M. and Katz, Andrew, Advancing Qualitative Analysis: An Exploration of the Potential of Generative AI and NLP in Thematic Coding (June 21, 2023). Available at SSRN: https://ssrn.com/abstract=4487768 or http://dx.doi.org/10.2139/ssrn.4487768

Yasir Gamieldien (Contact Author)

Virginia Tech - College of Engineering

Blacksburg, VA
United States

Jennifer M. Case

Virginia Tech University

Andrew Katz

Virginia Tech - College of Engineering

Blacksburg, VA
United States

Paper statistics

Downloads: 1,199
Abstract Views: 3,069
Rank: 36,759