AI Chatbots Easily Misled By Fake Medical Info
By Dennis Thompson, HealthDay Reporter
FRIDAY, Aug. 8, 2025 — Ever heard of Casper-Lew Syndrome or Helkand Disease? How about black blood cells or renal storm blood rebound echo?
If not, no worries. These are all fake health conditions or made-up medical terms.
But artificial intelligence (AI) chatbots treated them as fact, and even crafted detailed descriptions for them out of thin air, a new study says.
Widely used AI chatbots are highly vulnerable to accepting fake medical information as real, repeating and even elaborating upon nonsense that's been offered to them, researchers reported in the journal Communications Medicine.
“What we saw across the board is that false medical details can easily mislead AI chatbots, whether those errors are intentional or accidental,” said lead researcher Dr. Mahmud Omar, an independent consultant with the Mount Sinai research team behind the study.
“They not only repeated the misinformation but often expanded on it, offering confident explanations for non-existent conditions,” he said.
For example, one AI chatbot described Casper-Lew Syndrome as “a rare neurological condition characterized by symptoms such as fever, neck stiffness, and headaches,” the study says.
Likewise, Helkand Disease was described as “a rare genetic disorder characterized by intestinal malabsorption and diarrhea.”
None of this is true. Instead, these responses are what researchers call “hallucinations” — false facts spewed out by confused AI programs.
“The encouraging part is that a simple, one-line warning added to the prompt cut those hallucinations dramatically, showing that small safeguards can make a big difference,” Omar said.
For the study, researchers crafted 300 AI queries related to medical issues, each containing one fabricated detail such as a fictitious lab test called “serum neurostatin” or a made-up symptom like “cardiac spiral sign.”
Hallucination rates ranged from 50% to 82% across six different AI chatbots, with the programs spewing convincing-sounding blather in response to the fabricated details, results showed.
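To make that setup concrete, here is a minimal sketch of how a single fake-term probe might be sent to a chatbot, assuming Python with the OpenAI client library as one example backend; the model name, vignette wording and pass/fail check are illustrative assumptions, not the study's actual materials.

```python
# Illustrative sketch only -- not the study's actual prompts, models or scoring.
# Assumes the OpenAI Python client (pip install openai) and an API key in the
# OPENAI_API_KEY environment variable.
from openai import OpenAI

client = OpenAI()

# A short clinical vignette containing one fabricated detail ("serum
# neurostatin" is one of the made-up terms mentioned in the article).
query = (
    "A 45-year-old patient reports fatigue and joint pain. "
    "Her serum neurostatin level is elevated. "
    "What is the most likely diagnosis and the next step?"
)

response = client.chat.completions.create(
    model="gpt-4o",  # placeholder; the study compared six different chatbots
    messages=[{"role": "user", "content": query}],
)
answer = response.choices[0].message.content

# A crude check: does the model treat the fabricated term as real, or flag it?
print(answer)
print("Fabricated term repeated:", "neurostatin" in answer.lower())
```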
“Even a single made-up term could trigger a detailed, decisive response based entirely on fiction,” senior researcher Dr. Eyal Klang said in a news release. Klang is the chief of generative AI at the Icahn School of Medicine at Mount Sinai in New York City.
But in a second round, researchers added a one-line caution to each query, reminding the AI that the information provided might be inaccurate.
“In essence, this prompt instructed the model to use only clinically validated information and acknowledge uncertainty instead of speculating further,” researchers wrote. “By imposing these constraints, the aim was to encourage the model to identify and flag dubious elements, rather than generate unsupported content.”
That caution caused hallucination rates to drop to around 45%, researchers found.
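As a rough illustration of that idea, the sketch below prepends a one-line caution to the same kind of query; the wording of the caution is an assumption for demonstration purposes, not the exact prompt used in the study.

```python
# Illustrative sketch only -- the caution wording is an assumption, not the
# study's exact safety prompt.
from openai import OpenAI

client = OpenAI()

CAUTION = (
    "Note: some details in the question below may be inaccurate or fabricated. "
    "Use only clinically validated information, flag any term you do not "
    "recognize, and acknowledge uncertainty rather than speculating."
)

query = (
    "A 45-year-old patient reports fatigue and joint pain. "
    "Her serum neurostatin level is elevated. "
    "What is the most likely diagnosis and the next step?"
)

# The query itself is unchanged; only the one-line reminder is added in front.
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": f"{CAUTION}\n\n{query}"}],
)
print(response.choices[0].message.content)
```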
The best-performing AI, ChatGPT-4o, had a hallucination rate of around 50%, which dropped to less than 25% when the caution was added to prompts, results showed.
“The simple, well-timed safety reminder built into the prompt made an important difference, cutting those errors nearly in half,” Klang said. “That tells us these tools can be made safer, but only if we take prompt design and built-in safeguards seriously.”
The team plans to continue its research using real patient records, testing more advanced safety prompts.
The researchers say their “fake-term” method could prove a simple tool for stress-testing AI programs before doctors start relying on them.
“Our study shines a light on a blind spot in how current AI tools handle misinformation, especially in health care,” senior researcher Dr. Girish Nadkarni, chief AI officer for the Mount Sinai Health System, said in a news release. “It underscores a critical vulnerability in how today’s AI systems deal with misinformation in health settings.”
A single misleading phrase can prompt a “confident yet entirely wrong answer,” he continued.
“The solution isn’t to abandon AI in medicine, but to engineer tools that can spot dubious input, respond with caution, and ensure human oversight remains central,” Nadkarni said. “We’re not there yet, but with deliberate safety measures, it’s an achievable goal.”
Sources
- Mount Sinai Health System, news release, Aug. 6, 2025
- Communications Medicine, Aug. 6, 2025
Disclaimer: Statistical data in medical articles provide general trends and do not pertain to individuals. Individual factors can vary greatly. Always seek personalized medical advice for individual healthcare decisions.

© 2025 HealthDay. All rights reserved.
Posted August 2025