AI failures can hide in ways that security tests don’t measure

When an AI chatbot tells people to add glue to pizza, the mistake is obvious. When it recommends eating more bananas (sound nutritional advice that could be dangerous for someone with kidney failure), the error hides in plain sight.
This is a risk now poised to reach hundreds of millions of users with little or no regulatory oversight.
OpenAI launched ChatGPT Health days ago, allowing users to connect medical records and wellness apps for personalized health advice. The company said more than 230 million people ask ChatGPT health questions each week, and 40 million daily users seek medical advice (see: ChatGPT Health: Top Privacy, Security, and Governance Concerns).
Google has partnered with health data platform b.well, suggesting similar products could follow.
Even some AI experts are skeptical. “What’s inherent in models is this next-token probabilistic thing. Models don’t naturally know when they’re lacking enough information, they’re not inherently calibrated to express uncertainty, and they don’t have an internal notion of what’s ‘good’ or ‘bad,’” said Diana Kelley, chief information security officer at AI security and governance company Noma Security.
“They’re just very good at producing text that seems plausible and authoritative, even when it isn’t,” she told ISMG.
Using chatbots for healthcare purposes amplifies risks due to what Kelley describes as verification asymmetry. “In coding, a crazy answer often fails quickly: the code won’t compile or run,” she said. “In medicine, advice is highly conditional and depends on patient-specific context that the system typically does not have. Healthcare is full of ‘it depends,’ which makes it easier for an AI to appear correct while being contextually wrong, and much harder for users to independently verify the answer.”
Standard AI safety assessments may fail to detect the highest-risk outputs. “Most assessments focus on superficial signals such as explicit policy violations, factual errors or crisis keywords, while rewarding mastery and empathy,” said Koustuv Saha, an assistant professor of computer science at the University of Illinois Grainger College of Engineering, whose research examines the behavior of large language models in high-risk healthcare settings. “This means that subtly misleading advice can pass safety checks,” he said.
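To make that failure mode concrete, here is a minimal, hypothetical sketch of the kind of superficial screen Saha describes: it flags crisis keywords and explicit violations but never sees patient context. The keyword lists and the function are illustrative assumptions, not any vendor’s actual checks.

```python
# Hypothetical keyword lists; real products use far richer filters,
# but the blind spot illustrated here is the same.
CRISIS_KEYWORDS = {"suicide", "self-harm", "overdose"}
EXPLICIT_VIOLATIONS = {"skip your dialysis", "stop taking your insulin"}

def superficial_safety_check(answer: str) -> bool:
    """Return True if the answer passes the naive screen."""
    lowered = answer.lower()
    if any(keyword in lowered for keyword in CRISIS_KEYWORDS):
        return False
    if any(phrase in lowered for phrase in EXPLICIT_VIOLATIONS):
        return False
    return True  # nothing explicit was found, so the output ships

# Sound nutritional advice that is dangerous for a patient with kidney
# failure sails through, because the check never sees that context.
advice = "Eating more bananas is a great way to boost your potassium."
print(superficial_safety_check(advice))  # True
```

The advice becomes dangerous only when paired with the patient’s renal status, which is exactly the information such a screen never receives.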
Shannon Germain Farraher, senior healthcare analyst at Forrester, said healthcare organizations demand high levels of accuracy. Medical advice cannot tolerate “consistent nonsense” that is acceptable in less critical areas. Detecting contextually misleading advice requires human oversight through what she calls “detect, surface, and soothe”: identifying subtle issues that AI models overlook, bringing hidden issues to light, and rebuilding trust after AI missteps.
Because language models are designed to be consistent and helpful at every turn, they tend to reinforce previous information rather than challenge it. “These risks are often implicit and interactional: they arise from what is omitted, from how uncertainty is downplayed,” Saha said.
The risk escalates over conversations in ways that security guardrails cannot address. “The safety instructions are the most important thing the model sees,” said Sowmith Kuppa Venkata Purushottam Naga, an AI and machine learning engineer at Old Dominion University. “But as the conversation goes on, especially if a user shares their fear, pain, or trauma, the model is trained to match the user’s emotional tone and it tries to be a supportive assistant. It prioritizes helping and comforting over strict rule-following.”
According to Naga, the most important technical safeguard is mandatory citation. “We need to build systems that don’t just respond, but that have to point to the specific medical source that supports the response,” he said. On the non-technical side, he advocates product friction: withholding uncertain answers until users acknowledge the warnings, or highlighting uncertain phrases. “Companies are reluctant to implement friction because they want their AI to be magical, instantaneous and human-like. Adding warnings or forcing the AI to say ‘I don’t know’ reduces user engagement.”
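As a rough illustration of the mandatory-citation idea, the sketch below releases a draft answer only if it cites at least one source from a trusted allowlist. The `ModelOutput` structure, the allowlist and the fallback wording are assumptions made for the example, not a description of how ChatGPT Health or any other product actually works.

```python
import re
from dataclasses import dataclass

# Hypothetical allowlist; a real deployment would gate on a curated,
# versioned medical knowledge base rather than bare domains.
TRUSTED_SOURCES = {"medlineplus.gov", "cdc.gov", "nih.gov", "who.int"}

@dataclass
class ModelOutput:
    text: str             # draft answer produced by the model
    citations: list[str]  # URLs the model claims support the answer

def extract_domain(url: str) -> str:
    """Pull the bare domain out of a cited URL."""
    match = re.search(r"https?://(?:www\.)?([^/]+)", url)
    return match.group(1).lower() if match else ""

def enforce_citation(output: ModelOutput) -> str:
    """Release the answer only if it cites a trusted source; otherwise
    return a deliberately frictional fallback: no source, no answer."""
    supported = [c for c in output.citations
                 if extract_domain(c) in TRUSTED_SOURCES]
    if not supported:
        return ("I can't point to a trusted medical source for this, "
                "so I won't answer. Please consult a clinician.")
    sources = "\n".join(f"- {c}" for c in supported)
    return f"{output.text}\n\nSupporting sources:\n{sources}"

# An uncited draft is withheld rather than shown to the user.
draft = ModelOutput(text="Bananas are a healthy source of potassium.",
                    citations=[])
print(enforce_citation(draft))
```

Refusing to answer rather than answering without support is exactly the kind of friction Naga predicts companies will resist.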
The question of accountability remains unresolved. “Right now, it’s a huge gray area,” Naga said.
The governance gap could create strategic risk for organizations deploying AI in healthcare. Currently, there is no unified federal law or industry standard governing consumer health chatbots, Farraher said. While the government promotes AI adoption through initiatives such as the American AI Blueprint, these efforts focus on accelerating innovation rather than imposing guardrails. “Consumer health chatbots today operate in a fragmented governance environment with minimal proactive constraints,” she added.
