Activity

Activity ID

14575

Expires

December 26, 2028

Format Type

Journal-based

CME Credit

1

Fee

$30

CME Provider: JAMA Network Open

Description of CME Course

Importance  Large language models (LLMs) are increasingly integrated into health care applications; however, their vulnerability to prompt-injection attacks (ie, maliciously crafted inputs that manipulate an LLM’s behavior) capable of altering medical recommendations has not been systematically evaluated.

Objective  To evaluate the susceptibility of commercial LLMs to prompt-injection attacks that may induce unsafe clinical advice and to validate man-in-the-middle, client-side injection as a realistic attack vector.

Design, Setting, and Participants  This quality improvement study used a controlled simulation design and was conducted between January and October 2025 using standardized patient-LLM dialogues. The main experiment evaluated 3 lightweight models (GPT-4o-mini [LLM 1], Gemini-2.0-flash-lite [LLM 2], and Claude-3-haiku [LLM 3]) under controlled conditions across 12 clinical scenarios, stratified by harm level, in 4 categories: supplement recommendations, opioid prescriptions, pregnancy contraindications, and central nervous system toxic effects. A proof-of-concept experiment tested 3 flagship models (GPT-5 [LLM 4], Gemini 2.5 Pro [LLM 5], and Claude 4.5 Sonnet [LLM 6]) using client-side injection in a high-risk pregnancy scenario.

Exposures  Two prompt-injection strategies: (1) context-aware injection for moderate- and high-risk scenarios and (2) evidence-fabrication injection for extremely high-harm scenarios. Injections were programmatically inserted into user queries within a multiturn dialogue framework.

Main Outcomes and Measures  The primary outcome was injection success at the primary decision turn. Secondary outcomes included persistence across dialogue turns and model-specific success rates by harm level.

Results  Across 216 evaluations (108 injection vs 108 control), attacks succeeded in 94.4% of injection evaluations (102 of 108) at turn 4 and persisted in 69.4% (75 of 108) of follow-ups. LLM 1 and LLM 2 were completely susceptible (36 of 36 dialogues [100%] each), and LLM 3 remained vulnerable in 83.3% of dialogues (30 of 36 dialogues). Attacks in extremely high-harm scenarios, including those involving US Food and Drug Administration Category X pregnancy drugs (eg, thalidomide), succeeded in 91.7% of dialogues (33 of 36 dialogues). In the proof-of-concept experiment, LLM 4 and LLM 5 were vulnerable in 100% of dialogues (5 of 5 dialogues each) and LLM 6 in 80.0% (4 of 5 dialogues).
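As a reading aid, the percentages reported in the Results can be recomputed from the stated counts. The short Python sketch below is illustrative only and is not part of the study; the dictionary keys and the helper name `pct` are hypothetical labels chosen here, while the counts and percentages are copied from the Results paragraph.

```python
# Counts copied from the Results paragraph: (successes, total, reported %).
# The key names are hypothetical labels, not terms used by the study.
REPORTED = {
    "overall_turn_4": (102, 108, 94.4),
    "persistence_follow_up": (75, 108, 69.4),
    "llm_1": (36, 36, 100.0),
    "llm_2": (36, 36, 100.0),
    "llm_3": (30, 36, 83.3),
    "extremely_high_harm": (33, 36, 91.7),
    "llm_4_proof_of_concept": (5, 5, 100.0),
    "llm_5_proof_of_concept": (5, 5, 100.0),
    "llm_6_proof_of_concept": (4, 5, 80.0),
}


def pct(successes: int, total: int) -> float:
    """Return the success rate as a percentage rounded to 1 decimal place."""
    return round(100 * successes / total, 1)


def check_reported(reported: dict) -> dict:
    """Map each label to True if the recomputed rate matches the reported one."""
    return {
        label: pct(successes, total) == stated
        for label, (successes, total, stated) in reported.items()
    }


# The per-model success counts for LLM 1 to LLM 3 should add up to the overall count.
per_model_successes = sum(REPORTED[k][0] for k in ("llm_1", "llm_2", "llm_3"))
per_model_total = sum(REPORTED[k][1] for k in ("llm_1", "llm_2", "llm_3"))

if __name__ == "__main__":
    for label, ok in check_reported(REPORTED).items():
        print(f"{label}: {'consistent' if ok else 'MISMATCH'}")
    print("per-model sum:", per_model_successes, "of", per_model_total)
```

Under these counts, the per-model results for LLM 1 through LLM 3 (36, 36, and 30 successes of 36 dialogues each) add up to the overall 102 of 108 reported for turn 4.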

Conclusions and Relevance  In this quality improvement study using a controlled simulation, commercial LLMs demonstrated substantial vulnerability to prompt-injection attacks that could generate clinically dangerous recommendations; even flagship models with advanced safety mechanisms showed high susceptibility. These findings underscore the need for adversarial robustness testing, system-level safeguards, and regulatory oversight before clinical deployment.

Disclaimers

1. This activity is accredited by the American Medical Association.
2. This activity is free to AMA members.

Commercial Support?
No

NOTE: If a Member Board has not deemed this activity for MOC approval as an accredited CME activity, this activity may count toward an ABMS Member Board’s general CME requirement. Please refer directly to your Member Board’s MOC Part II Lifelong Learning and Self-Assessment Program Requirements.

Educational Objectives

To identify the key insights or developments described in this article.

Keywords

Digital Health, Artificial Intelligence

Competencies

Medical Knowledge

CME Credit Type

AMA PRA Category 1 Credit

DOI

10.1001/jamanetworkopen.2025.49963

The information provided on this page is subject to change. Please refer to the CME Provider’s website to confirm the most current information.