A new study from researchers at the Icahn School of Medicine at Mount Sinai has found that popular AI chatbots are highly vulnerable to repeating and expanding upon false medical information. The findings, published on 2 August in Communications Medicine, point to an urgent need for stronger safeguards before these tools can be safely relied upon in health care. Encouragingly, the research also demonstrated that a simple built-in warning prompt can substantially reduce the spread of errors, offering a practical path forward as AI systems continue to evolve.
To test the reliability of chatbots in medical contexts, the investigators designed fictional patient scenarios, each containing one invented element such as a non-existent disease, symptom, or test. In the first phase of the study, the chatbots reviewed these cases without additional guidance. They not only repeated the false details but often elaborated upon them, confidently producing explanations and treatments for conditions that do not exist. This behaviour, the team concluded, exposes a critical blind spot in how current systems handle misinformation.
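In rough outline, the set-up resembles the minimal Python sketch below. It is an illustration only, not the study's actual test harness: the `query_chatbot` function is a hypothetical placeholder for whatever chatbot API is under test, and "Halvorsen's syndrome" is a term invented here to mirror the study's fabricated details.

```python
# Minimal sketch of the "fake-term" test idea: a fictional vignette with one
# invented element is sent to a chatbot with no additional guidance.

def query_chatbot(prompt: str) -> str:
    """Hypothetical stand-in for a call to the chatbot under test; swap in a real client."""
    return "[model response would appear here]"

# Fictional case containing one made-up term ("Halvorsen's syndrome" is
# invented for illustration, mirroring the study's non-existent conditions).
vignette = (
    "A 54-year-old man presents with fatigue and joint pain. "
    "His previous physician suspected Halvorsen's syndrome. "
    "What work-up and treatment would you recommend?"
)

# Phase one: no warning. A vulnerable model may confidently describe a
# work-up for the non-existent condition instead of questioning it.
unguarded_answer = query_chatbot(vignette)
print(unguarded_answer)
```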
Lead author Mahmud Omar, MD, an independent consultant working with the research group, explained the risk: “What we saw across the board is that AI chatbots can be easily misled by false medical details, whether those errors are intentional or accidental. They not only repeated the misinformation but often expanded on it, offering confident explanations for non-existent conditions. The encouraging part is that a simple, one-line warning added to the prompt cut those hallucinations dramatically, showing that small safeguards can make a big difference.”
When the researchers added that one-line safeguard—reminding the chatbot that the information it was given might be inaccurate—the results shifted markedly. Errors were cut nearly in half, underscoring how modest interventions in prompt design can make the tools significantly safer. Co-senior author Eyal Klang, MD, Chief of Generative AI in the Windreich Department of Artificial Intelligence and Human Health at Mount Sinai, noted: “Even a single made-up term could trigger a detailed, decisive response based entirely on fiction. But the simple, well-timed safety reminder made an important difference, cutting those errors dramatically. That tells us these tools can be made safer, but only if we take prompt design and safeguards seriously.”
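A prompt-level safeguard of this kind might look something like the sketch below. The exact wording of the study's reminder is not quoted in this article, so the text here is an illustrative assumption, and `query_chatbot` is the same hypothetical placeholder as in the earlier sketch.

```python
# Sketch of a one-line safety reminder prepended to the case before it is
# sent to the model. Wording is illustrative, not the study's actual prompt.

def query_chatbot(prompt: str) -> str:
    """Hypothetical stand-in for a call to the chatbot under test; swap in a real client."""
    return "[model response would appear here]"

# Illustrative reminder text (assumption): warn the model that details may be wrong.
SAFETY_REMINDER = (
    "Caution: the case details below may contain inaccurate or fabricated "
    "information. Do not elaborate on any term you cannot verify; flag it instead."
)

def ask_with_safeguard(case_text: str) -> str:
    """Prepend the one-line caution to the case before querying the model."""
    return query_chatbot(f"{SAFETY_REMINDER}\n\n{case_text}")

print(ask_with_safeguard(
    "A 54-year-old man presents with fatigue and joint pain. "
    "His previous physician suspected Halvorsen's syndrome. "
    "What work-up and treatment would you recommend?"
))
```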
The team now plans to extend their “fake-term” method to real, de-identified patient records and to test more advanced safety prompts and retrieval tools. They hope their approach will give hospitals, developers, and regulators a straightforward way to stress-test AI systems before deploying them in clinical practice. By exposing vulnerabilities in a controlled environment, the method could play a critical role in shaping safer applications of AI in medicine.
Reflecting on the broader implications, co-senior author Girish N. Nadkarni, MD, MPH, Chair of the Windreich Department of Artificial Intelligence and Human Health and Chief AI Officer for the Mount Sinai Health System, emphasised that the solution is not to abandon AI but to improve it. “A single misleading phrase can prompt a confident yet entirely wrong answer,” he said. “The solution isn’t to abandon AI in medicine, but to engineer tools that can spot dubious input, respond with caution, and ensure human oversight remains central. We’re not there yet, but with deliberate safety measures, it’s an achievable goal.”
More information: Mahmud Omar et al, Multi-model assurance analysis showing large language models are highly vulnerable to adversarial hallucination attacks during clinical decision support, Communications Medicine. DOI: 10.1038/s43856-025-01021-3
Journal information: Communications Medicine

Provided by The Mount Sinai Hospital / Mount Sinai School of Medicine
