From Lexicons to Large Language Models: A Holistic Evaluation of Psychometric Text Analysis in Social Science Research

49 Pages · Posted: 24 Apr 2024 · Last revised: 9 Apr 2025

Reza Mousavi

University of Virginia

Brent Kitchens

University of Virginia - McIntire School of Commerce

Abbie Oliver

Georgia State University

Ahmed Abbasi

University of Notre Dame - Mendoza College of Business - IT, Analytics, and Operations Department

Date Written: March 28, 2024

Abstract

Extracting psychological constructs from text is critical for social science researchers studying attitudes, perceptions, and behaviors across online platforms and other forms of written and spoken communication. In this study, we perform a holistic evaluation of four major paradigms for extracting psychological constructs from text, addressing a breadth of relevant performance dimensions as well as providing in-depth analysis informed by dual processing theory. We demonstrate that Large Language Models (LLMs) achieve comparable or superior performance to traditional methods, exhibiting high predictive accuracy, consistency across diverse text samples, and fairness, while reducing or eliminating the need for domain or NLP expertise and costly manual annotations. Most paradigms for measuring psychological constructs are supervised methods that rely heavily on large quantities of human-labeled data, and this dependence on human annotators may introduce noise into the underlying datasets. From the perspective of dual processing theory, we therefore further investigate how human annotations are influenced by the alignment between individuals' cognitive and affective abilities and the psychological constructs being extracted.

Our findings reveal that individuals with higher cognitive abilities excel at extracting constructs related to cognitive processes, while those with higher emotional intelligence are more effective at extracting constructs linked to affective processes; building models on data from misaligned annotators can lead those models astray. Inspired by these findings, we introduce a cognitive-affective LLM prompting strategy that emulates this human-like emphasis on cognitive abilities and emotional intelligence, and we demonstrate that the approach improves predictive performance beyond existing state-of-the-art prompting strategies. Salient design insights from our framework provide practical guidance for method selection and deployment, support emerging computationally intensive theory-building research and the design of psychometric NLP-based artifacts, and underscore the promise of LLMs to enhance measurement reliability and uncover deeper psychological dynamics in digital text.
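To make the idea of cognitive-affective prompting concrete, the minimal sketch below shows what such a strategy might look like in practice. The abstract does not reproduce the paper's prompts, so everything here (the persona wording, the gpt-4o model choice, the 1-7 rating format, and the score_construct helper) is a hypothetical illustration under assumed conventions, not the authors' implementation:

```python
# Illustrative sketch only: the paper's exact prompts, model, and scoring
# format are not given on this page; all specifics below are assumptions.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Hypothetical persona preambles emulating the annotator traits the paper
# links to construct type (cognitive ability vs. emotional intelligence).
PERSONAS = {
    "cognitive": (
        "You are an annotator with strong analytical reasoning skills. "
        "Focus on the logical structure and informational content of the text."
    ),
    "affective": (
        "You are an annotator with high emotional intelligence. "
        "Focus on the feelings, tone, and emotional cues expressed in the text."
    ),
}

def score_construct(text: str, construct: str, construct_type: str) -> str:
    """Ask the LLM to rate one psychological construct on a 1-7 scale."""
    response = client.chat.completions.create(
        model="gpt-4o",  # placeholder; the paper's model choice may differ
        messages=[
            {"role": "system", "content": PERSONAS[construct_type]},
            {
                "role": "user",
                "content": (
                    f"Rate the level of '{construct}' in the following text "
                    f"on a scale from 1 (very low) to 7 (very high). "
                    f"Reply with only the number.\n\nText: {text}"
                ),
            },
        ],
        temperature=0,  # deterministic output aids annotation consistency
    )
    return response.choices[0].message.content.strip()

# Example: route an affective construct to the emotionally attuned persona.
print(score_construct("I can't believe they ignored my complaint again!",
                      construct="anger", construct_type="affective"))
```

The key design choice this sketch illustrates is routing each construct to a persona matched to its underlying process, mirroring the reported finding that annotator-construct alignment improves label quality.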

Keywords: natural language processing (NLP), data annotation, data labeling, transformers, large language models (LLMs), ChatGPT, psychometrics, generative AI

Suggested Citation

Mousavi, Reza and Kitchens, Brent and Oliver, Abbie and Abbasi, Ahmed, From Lexicons to Large Language Models: A Holistic Evaluation of Psychometric Text Analysis in Social Science Research (March 28, 2024). Available at SSRN: https://ssrn.com/abstract=4776480 or http://dx.doi.org/10.2139/ssrn.4776480

Reza Mousavi (Contact Author)

University of Virginia

1400 University Ave
Charlottesville, VA 22903
United States

Brent Kitchens

University of Virginia - McIntire School of Commerce

P.O. Box 400173
Charlottesville, VA 22904-4173
United States

Abbie Oliver

Georgia State University

35 Broad Street
Atlanta, GA 30303-3083
United States

Ahmed Abbasi

University of Notre Dame - Mendoza College of Business - IT, Analytics, and Operations Department

Notre Dame, IN 46556
United States
