Comparing Response Performances of ChatGPT-3.5, ChatGPT-4 and Bard to Health-Related Questions: Comprehensiveness, Accuracy and Being Up-to-Date

8 Pages. Posted: 18 Jul 2023

Date Written: July 7, 2023

Abstract

This study compared the performance of ChatGPT-3.5, ChatGPT-4 and Bard in answering health-related questions across three domains: comprehensiveness, accuracy and actuality (being up to date). Family physicians scored the three chatbots' responses to five questions on a five-point scale. In the comprehensiveness domain, ChatGPT-4 had a higher mean score than Bard, and Bard had a higher mean score than ChatGPT-3.5. In the accuracy domain, no statistically significant difference was observed among the three chatbots. In the actuality domain, Bard had a higher mean score than ChatGPT-4, and ChatGPT-4 had a higher mean score than ChatGPT-3.5. Each chatbot was assessed as accurate, up to date and adequately comprehensive.

Note:
Funding Information: None.

Conflict of Interests: None.

Keywords: ChatGPT-4, Bard, performance, assessment

JEL Classification: I12

Suggested Citation

Yıldız, Mustafa Said, Comparing Response Performances of ChatGPT-3.5, ChatGPT-4 and Bard to Health-Related Questions: Comprehensiveness, Accuracy and Being Up-to-Date (July 7, 2023). Available at SSRN: https://ssrn.com/abstract=4503443 or http://dx.doi.org/10.2139/ssrn.4503443

Mustafa Said Yıldız (Contact Author)

Republic of Türkiye Ministry of Health

