ChatGPT Generates Differential Diagnoses With Similar Accuracy to Emergency Doctors

Medically reviewed by Drugs.com.

By Lori Solomon HealthDay Reporter

THURSDAY, Sept. 21, 2023 -- ChatGPT's performance in generating differential diagnoses appears to be similar to that of emergency department medical experts, according to a research letter published online Sept. 9 in the Annals of Emergency Medicine to coincide with the annual European Emergency Medicine Congress, held from Sept. 17 to 20 in Barcelona, Spain.

Hidde ten Berg, from Jeroen Bosch Hospital in 's-Hertogenbosch, Netherlands, and colleagues investigated the ability of ChatGPT to generate accurate differential diagnoses based on physician notes recorded at initial emergency department presentation. The retrospective analysis included 30 undifferentiated patients with a single proven diagnosis who presented to a nonacademic teaching hospital in March 2022. ChatGPT results were compared to the clinical teams' first formulated differential diagnoses and leading diagnoses, both without and with laboratory test results.

The researchers found that, without laboratory data, physicians correctly included the diagnosis in the top five differential diagnoses for 83 percent of cases, similar to ChatGPT v3.5 (77 percent) and v4.0 (87 percent). When laboratory data were included, physicians' accuracy increased to 87 percent and ChatGPT v3.5 accuracy increased to 97 percent, while v4.0 accuracy remained at 87 percent. Without laboratory data, physicians outperformed ChatGPT in choosing the correct leading diagnosis (60 percent versus 37 percent for v3.5 and 53 percent for v4.0). With laboratory data, the corresponding values were 53 percent for physicians, 60 percent for v3.5, and 53 percent for v4.0. The differential diagnoses of physicians and ChatGPT overlapped by 60 percent. However, the researchers noted that ChatGPT can generate varied responses to the same query.

"This observed inconsistency in ChatGPT's outputs emphasizes the inherent unpredictability in large language models and underscores the fact that these are merely tools that can aid, but not replace physicians' judgment," the authors write.

Disclaimer: Statistical data in medical articles provide general trends and do not pertain to individuals. Individual factors can vary greatly. Always seek personalized medical advice for individual healthcare decisions.

© 2024 HealthDay. All rights reserved.
