ChatGPT Scores Close to Passing U.S. Medical Licensing Exams

ChatGPT performs at or near the passing threshold of 60 percent accuracy without specialized training

By Elana Gotkine, HealthDay Reporter

MONDAY, Feb. 13, 2023 (HealthDay News) — A new artificial intelligence system, ChatGPT, scores at or around the passing threshold for the U.S. Medical Licensing Exam, according to a study published online Feb. 9 in PLOS Digital Health.

Tiffany H. Kung, from AnsibleHealth Inc. in Mountain View, California, and colleagues examined the performance of the language model ChatGPT on the U.S. Medical Licensing Exam, which consists of the Step 1, Step 2CK, and Step 3 exams.

The researchers found that ChatGPT performed at or near the passing threshold of 60 percent accuracy, without any specialized training or reinforcement. Across all questions, ChatGPT's answers and explanations showed 94.6 percent concordance, and this high concordance was sustained across all exams. ChatGPT produced at least one significant insight in 88.9 percent of responses.

“We believe that large language models such as ChatGPT are reaching a maturity level that will soon impact clinical medicine at large, enhancing the delivery of individualized, compassionate, and scalable health care,” the authors write.
