Comparing Response Performances of ChatGPT-3.5, ChatGPT-4 and Bard to Health-Related Questions: Comprehensiveness, Accuracy and Being Up-to-Date
8 Pages Posted: 18 Jul 2023
Date Written: July 7, 2023
Abstract
The purpose of this study was to assess the comparative performance of ChatGPT-3.5, ChatGPT-4, and Bard in answering health-related questions across three domains: comprehensiveness, accuracy, and actuality (being up to date). Family physicians scored the responses of the three chatbots to five questions on a five-point scale. In the comprehensiveness domain, ChatGPT-4 had a better mean score than Bard, and Bard had a better mean score than ChatGPT-3.5. In the accuracy domain, no statistically significant difference was observed among the three chatbots. In the actuality domain, Bard had a better mean score than ChatGPT-4, and ChatGPT-4 had a better mean score than ChatGPT-3.5. Each chatbot was assessed as accurate, up to date, and adequately comprehensive.
Note:
Funding Information: None.
Conflict of Interests: None.
Keywords: ChatGPT-4, Bard, performance, assessment
JEL Classification: I12