
Study: AI misinformation 'significantly' lowers diagnostic accuracy in novice medical students

April 08, 2026 By Matthew Solan

Misleading artificial intelligence (AI)–generated explanations significantly reduced diagnostic accuracy among junior medical students, while correct explanations provided no significant improvement compared with no explanation, according to a randomized trial published in npj Digital Medicine.

“This study provides crucial empirical evidence that, without proper safeguards, the harm caused by AI-generated falsehoods in this population and task is more potent and robust than the benefit derived from correct guidance,” wrote lead researcher Da Teng of the Beijing Institute of Petrochemical Technology in Beijing, China, and colleagues. “This finding highlights a fundamental safety challenge for AI in medical education, demanding a strategic pivot towards building learners’ critical appraisal skills.” 

In a 3-arm randomized controlled trial, investigators assigned 111 medical students (n = 37 per group) to receive no explanation, correct AI-generated explanations, or misleading AI-generated explanations while answering 25 US Medical Licensing Examination–style multiple-choice questions. The students had completed foundational coursework but had not yet begun major clinical rotations.

The study used standardized explanations of similar length and format, and deliberately constructed misleading explanations that were plausible yet incorrect, often incorporating accurate premises to support flawed conclusions. 

Diagnostic accuracy was 21% in the control group, 23% in the correct explanation group, and 9% in the misleading explanation group. Compared with no explanation, misleading explanations were associated with approximately 0.36 times the odds of a correct response.  
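As a rough sanity check, an odds ratio of this magnitude can be approximated from the reported group accuracies alone (a sketch only; the paper's estimate may come from a regression model adjusting for other factors):

```python
# Illustrative back-of-envelope check of the reported odds ratio,
# using only the group accuracies stated above (21% control, 9% misleading).

def odds(p: float) -> float:
    """Convert a probability to odds."""
    return p / (1 - p)

control_acc = 0.21      # control group (no explanation) accuracy
misleading_acc = 0.09   # misleading-explanation group accuracy

odds_ratio = odds(misleading_acc) / odds(control_acc)
print(round(odds_ratio, 2))  # ≈ 0.37, in line with the reported ~0.36
```

The unadjusted figure lands close to the reported value, which is expected in a randomized design where groups are balanced at baseline.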

Confidence ratings increased with any AI-generated explanation, regardless of correctness. Mean confidence scores were higher in both explanation groups compared with the control group (1.90 for correct and 1.92 for misleading vs 1.42 on a 3-point scale).

This effect persisted across both correct and incorrect responses: students who received explanations reported higher confidence even when their answers were wrong, regardless of the explanation's accuracy.

Calibration also differed by condition. In the correct explanation group, confidence was higher for correct vs incorrect answers. In contrast, the misleading explanation group showed similar confidence for correct and incorrect answers, indicating a loss of alignment between confidence and diagnostic accuracy. 

Error pattern analysis showed that misleading explanations actively steered responses. Among incorrect answers in the misleading group, 70% were “commission errors,” defined as selecting the specific incorrect option promoted by the AI explanation, compared with 25% in the control group and 30% in the correct explanation group. 

Item-level analyses showed variable susceptibility to misinformation. For highly persuasive items, 82% of students selected the targeted incorrect answer vs 18% in the control group; for less persuasive items, the difference was smaller (42% vs 23%, respectively).

Researchers noted several limitations. The study used fully correct vs fully incorrect explanations to maximize effect detection, which may overestimate real-world impact where AI outputs are more nuanced. The sample also included only junior medical students from a single geographic context, limiting generalizability to more experienced learners or other settings.  

The study was supported by multiple Chinese national and regional research grants. The researchers reported no conflicts of interest. 

(Editor's note: The authors provided an unedited version of the study for early access to its findings and noted that the manuscript will undergo further editing before final publishing. They added that there may be errors, which may affect content, and that all legal disclaimers apply.)

 

AACE Endocrine AI is published by Conexiant under a license arrangement with the American Association of Clinical Endocrinology, Inc. (AACE®). The ideas and opinions expressed in AACE Endocrine AI do not necessarily reflect those of Conexiant or AACE. For more information, see Policies.
