Artificial intelligence is moving so quickly into medical education that schools can no longer treat it as an optional tool, according to experts at a University of Pennsylvania virtual panel discussion sponsored by the Leonard Davis Institute of Health Economics and the Perelman School of Medicine. Panelists from Stanford, Northwestern, and NYU said AI is already reshaping how students learn and how faculty teach, creating an urgent need to build faculty competence, set guardrails, and train future physicians to use these systems safely and accurately.

Moderated by Jennifer Kogan, MD, Vice Dean for Undergraduate Medical Education at the Perelman School, the panelists were Holly Caretta-Weyer, MD, MHPE, Associate Dean of Admissions and Assessment at the Stanford University School of Medicine; Brian T. Garibaldi, MD, MEHP, Director of the Center for Bedside Medicine at the Northwestern Feinberg School of Medicine; and Verity Schaye, MD, MHPE, Assistant Dean for Education in the Clinical Sciences and Assistant Director for Curricular Innovation at the NYU Grossman School of Medicine.


Toward an Evidence-Based Approach

As she opened the session, Kogan emphasized the scale and complexity of the task the nation’s medical schools now face. “For medical educators, AI brings both exciting opportunities such as personalized learning and improved assessment, and significant challenges: ethical considerations, bias mitigation, regulatory and policy implications. Today’s discussion is about understanding these dynamics and shaping a thoughtful, evidence-based approach to integrating AI into medical education.”

Initially, as AI evolved in health care settings, it focused on administrative and back-office operational tasks, but it quickly broadened into clinical settings to support the direct delivery of care. These new clinical AI systems are designed to assist clinicians at the patient bedside, helping them interpret data, make decisions, predict risks, streamline workflows, and improve patient outcomes. Their functions include AI-assisted imaging reads, diagnostic decision support, risk prediction models, and automation of routine clinical tasks.

In recent publications, the Food and Drug Administration, the World Health Organization, and the National Academy of Medicine point out that in current practice and regulation, AI systems do not have professional agency, accountability, or judgment. Clinicians remain fully responsible and liable for decisions; AI systems provide inputs—predictions, pattern recognition, and suggestions—that clinicians evaluate, accept, reject, or reinterpret.

Pros and Cons

The generally cited benefits of clinical AI include earlier and more accurate detection of disease, faster decision support at the point of care, reduced clinician cognitive load, and the potential for more standardized and equitable care.

The disadvantages include the risk of clinicians becoming overly dependent on AI recommendations; drift in model performance over time; algorithmic bias; workflow disruption caused by poorly integrated AI systems; and lack of explainability because many AI systems are black boxes that cannot clearly show how they arrived at a recommendation.

The panelists agreed that AI’s greatest promise lies in improving clinical reasoning and assessment, strengthening bedside skills, and reducing diagnostic error. They were optimistic that the emerging model is a tool-user relationship rather than a replacement strategy, comparing it to the advent of point-of-care ultrasound, which augmented bedside exams rather than supplanting the clinicians who performed them.

New Research Needed

They also stressed the urgency of getting medical schools up to speed on this broad and complex change occurring across all of health care and warned that medical education has historically adopted innovations without adequate research to produce supporting evidence. They argued that the spread and power of AI demand much more in-depth research and rigorous evaluation of how medical students are taught, including:

Policy Priorities

The policy priorities and guardrails they discussed included:


Schaye recommended a new resource for supervising and guiding the safe, thoughtful use of AI in clinical and educational settings: the DEFT-AI (Diagnosis, Evidence, Feedback, Teaching, AI engagement) framework outlined in a recent New England Journal of Medicine article. It is designed to foster clinician judgment, prevent overreliance on AI, and embed critical thinking into AI-augmented practice and education.

Students Know AI Better Than Instructors

The panel’s strongest message was that medical educators cannot ignore AI and must develop personal competency in the technology. One of the biggest challenges to incorporating AI into every level of medical education is that medical students tend to be far more competent in AI than their faculty instructors.

“We really need a great deal of faculty development around AI because the trainees that are up and coming are using AI throughout their educational experience,” Schaye said. “Right now, that’s undergrads; the ones who have used it since K through 12 are going to be coming soon enough, so we’ve really got to get our faculty up to speed on this.”


Caretta-Weyer pointed to a recent event with her competency committee. “We had faculty giving feedback to residents that they should be using AI to generate their differential diagnosis, but several members of the committee said, ‘Hold on. Hold on. Do we actually want them to use AI to do that?’ We just took a time-out and said, ‘Get over all your angst on this. They’re going to be using it. What we need to do is help them do that appropriately, helping them understand the workflow and the thought process. So how do we teach them to use AI responsibly and then put appropriate guardrails up?’”

Garibaldi said, “We need to be using these tools ourselves. We need to be gathering experience in our own clinical practice, in our own educational practice, so that we begin to understand how you can use these tools, but probably most importantly, where things can go wrong.”

When AI Gets It Wrong


He pointed to an example of how AI can get it wrong: A neural network system trained to identify pneumonia on chest X-rays also learned which hospitals had the highest prevalence of pneumonia. Then, apparently creating a shortcut for itself, the system focused on the markers on incoming X-rays that indicated which hospital they came from. When it saw markers from high-prevalence hospitals, it concluded that those images probably showed pneumonia, even though it was not reading any of the information from the lung window.

“We need to be talking about these things,” Garibaldi said. “We need to be using clinical reasoning tools on rounds with our patients and with our learners. And we need to be explicit about where things might go off the rails and make sure we talk about the biased data sets that are very possibly going into some of these models.”

Need for Transparency

As the session wrapped up, Kogan noted, “There have been some questions about AI transparency and whether we as teachers should be transparent when we are using it, or whether learners or applicants should be transparent when they are using it. Any quick thoughts about that as we close?”

— Schaye: “Yes. The things we discussed about role modeling or how to use it in assessment. The only way you’re going to build trust in the system is having transparency and talking about it.”

— Garibaldi: “We always try to model how we deal with uncertainty, and one of the most powerful things you can say as an attending on ward rounds is, ‘I don’t know.’ But now we’re in a position where we could be saying, ‘Let’s look that up together,’ and involve the AI tool in the right context. We should absolutely be transparent and be figuring it out together with our learners and be comfortable with saying to them, ‘You guys are probably better at this tool than I am, so let’s work together. You teach me the prompts, and I’ll give you a little bit more of the context information that I’ve gathered from being in the room with the patient.’”

Continuing, Garibaldi added, “It’s important to remember that we’ve had disruptive technologies in medicine before and we’ve had this same conversation: How do we train our trainees to use this new technology when only a fraction of faculty are themselves comfortable with it? We need to remember how we’ve tackled these problems in the past as we try to come up with similar solutions moving forward.”


Author

Hoag Levins

Editor, Digital Publications

