GPT-3.5 Hallucinates Nonexistent Citations: Evidence from Economics
43 Pages. Posted: 4 Jun 2023. Last revised: 6 Jun 2023
Date Written: June 3, 2023
We create a set of prompts from every Journal of Economic Literature (JEL) topic to test the ability of a GPT-3.5 large language model (LLM) to write about economic concepts. For general summaries, ChatGPT can perform well. However, more than 30% of the citations suggested by ChatGPT do not exist. Furthermore, we demonstrate that the ability of the LLM to deliver accurate information declines as the question becomes more specific. This paper provides evidence that, although GPT has become a useful input to research production, fact-checking the output remains important.
Keywords: artificial intelligence, large language models, ChatGPT, writing, research methods
JEL Classification: B4, O33, I2