The new clinical skill: Knowing when AI is wrong

Clinicians need to recognize situations where caution should increase.

May 07, 2026. By Deputy Editor Vishnu Priya Pulipati, MD, FACE, DipABCL

Artificial intelligence is already embedded within our clinical workflows. It assists with documentation, summarizes literature, predicts trends, and, increasingly, answers clinical questions in real time. As clinicians learn how to use these technologies, much of the discussion has centered on whether AI will replace physicians or diminish clinical expertise. However, a more immediate and important question may be: Do clinicians know when AI is wrong?

That is no longer a theoretical concern; it is a clinical skill gap.

The Illusion of Competence

AI systems are designed to sound confident. They generate polished, articulate responses that can feel remarkably sophisticated. They summarize guidelines, synthesize studies, and provide recommendations with a fluency that may appear indistinguishable from expertise. That is precisely what makes these systems both powerful and potentially dangerous.

In medicine, uncertainty is often a signal that a situation is complex. Real patients rarely fit neatly into algorithms or textbook examples. They present with overlapping diseases, incomplete histories, medication interactions, financial limitations, inconsistent adherence, and evolving clinical pictures.

AI systems frequently smooth over that uncertainty too quickly, offering definitive recommendations in inherently nuanced situations. In doing so, these systems can create an illusion of competence that exceeds their actual reliability.

Accuracy Is Not Reliability

Modern AI systems are becoming increasingly sophisticated. Many are grounded in peer-reviewed medical literature and designed to reduce hallucinations by retrieving published evidence before generating responses. This has substantially improved the quality of AI-generated medical information. However, accuracy alone does not equal clinical reliability.

AI handles information well. Medicine, however, is fundamentally about making decisions under uncertainty. The uncomfortable reality is that AI often performs best where medicine is simplest and may struggle most where clinical judgment matters most.

'If I Still Need to Verify AI, Why Not Just Do It Myself?'

The answer is not simply speed. It is cognitive leverage.

Without AI, clinicians often start from a blank page, recalling guidelines, searching the literature, organizing evidence, and structuring their reasoning step by step. Increasingly, AI allows clinicians to begin with a draft: a synthesized summary, a proposed framework, or a first-pass interpretation of available information.

That draft may be incomplete or occasionally incorrect. Nevertheless, it changes where cognitive effort is spent. The clinician’s role shifts from primarily generating information to critically evaluating it.

The value of AI does not lie in replacing clinical reasoning but in accelerating access to information while allowing clinicians to focus more deeply on interpretation, contextualization, and decision-making.

The Real Risk Is Not Overuse, It Is Uncritical Use

When a tool is fast, accessible, and generally helpful, it naturally lowers skepticism. Clinicians may be less likely to carefully interrogate responses, particularly when those responses are polished, well-structured, and supported by citations.

Yet even evidence-grounded AI systems will prioritize certain studies, emphasize specific findings, and compress complex evidence into simplified conclusions. That process inevitably introduces bias.

A cited answer may still be the wrong answer for the patient sitting in front of the clinician.

A Collaborator, Not a Tool

A more useful analogy may be to view AI as a highly knowledgeable junior colleague: well-read, efficient, and capable of rapidly synthesizing information, yet still lacking lived clinical experience, contextual judgment, and accountability.

In clinical training, physicians do not blindly trust trainees simply because they sound confident or provide polished presentations. They ask follow-up questions, challenge assumptions, identify gaps, and carefully supervise decision-making. The same mindset is increasingly necessary when interacting with AI systems.

Know When to Pause

Clinicians do not need to distrust AI. However, they do need to recognize situations where caution should increase. One helpful shift is to stop asking whether an AI-generated answer is “accurate” and instead ask a different set of questions.

Is this retrieval or interpretation? AI is generally more reliable when summarizing guidelines or retrieving published evidence. Reliability decreases as interpretation increases, particularly when systems begin recommending specific clinical decisions in complex scenarios.
Does this apply to this patient? AI systems operate largely at the population level. Recommendations may fail to account for frailty, chronic kidney disease, ethnicity, polypharmacy, financial barriers, adherence challenges, or patient preferences. High-quality evidence may still represent a poor fit for an individual patient.
Would I stand by this recommendation without AI? This may be the simplest test. If you cannot independently justify the recommendation, you probably should not act on it.

Clinicians should also recognize recurring situations where AI may be particularly vulnerable to error. AI systems may favor highly cited studies over the most clinically applicable evidence, merge conflicting findings into overly simplified narratives, underemphasize uncertainty, or provide recommendations that appear reasonable but fail to account for real-world complexity.

Reframing Expertise in the AI Era

Much of the public conversation surrounding AI in medicine has focused on whether physicians will eventually be replaced or become less skilled. In reality, the outcome will likely depend on how these technologies are integrated into clinical practice.

If AI becomes a shortcut that replaces independent thinking, it may erode clinical reasoning over time. However, if AI functions as a collaborator that enhances critical thinking and improves efficiency, it has the potential to strengthen care.

Clinical excellence will no longer be defined primarily by who can retrieve information the fastest. AI already performs that task extraordinarily well. Increasingly, the differentiator will be judgment: recognizing when evidence does not appropriately apply, identifying missing context, challenging conclusions, and navigating uncertainty responsibly.

In an era where answers are increasingly abundant, judgment may become the defining clinical skill. Recognizing when AI is wrong could ultimately become one of the most important competencies in modern medicine.

AACE Endocrine AI is published by Conexiant under a license arrangement with the American Association of Clinical Endocrinology, Inc. (AACE®). The ideas and opinions expressed in AACE Endocrine AI do not necessarily reflect those of Conexiant or AACE. For more information, see Policies.