Opening the two-day 2025 Nudges in Health Care Symposium, M. Kit Delgado, MD, MS, LDI Senior Fellow and Director of the Penn Medicine Nudge Unit, welcomed more than 300 attendees from 30 health systems, universities, and other health care organizations to the seventh annual symposium. (Photos: Hoag Levins)

The latest wave of health technology transformation was set in motion by the 2009 HITECH Act, which mandated and incentivized the adoption of electronic health records (EHRs). Now, those rapidly advancing systems are spawning a second revolution—evolving into massive “nudge” engines that weave human clinicians and machine intelligence together in real-time decision-making.

Rapidly emerging research is demonstrating that AI large language models can perform as well as or better than clinicians alone or clinicians with AI assistance in medical reasoning and decision making. This trend raises a stark question for clinicians: should they remain “in the loop,” controlling final decisions, or relinquish some control while maintaining continuous oversight? That tension was at the heart of the discussion at the 2025 Penn Medicine Nudges in Health Care Symposium on how AI can be leveraged to improve clinical decision support.

Sharing Authority

The reach of AI is already vast throughout health care, touching logistics, economics, diagnostics, and treatment. But future success, keynote speaker Adam Rodman, MD, MPH, argued, will hinge on how willing medical professionals are to share their authority with algorithms that in many domains are already proving more accurate than humans in complex decision-making.

Rodman, Associate Editor of the New England Journal of Medicine, is widely regarded as a leading voice in the AI and clinical decision support field. He directs AI programs at the Carl J. Shapiro Center for Education and Research, affiliated with Harvard Medical School and Beth Israel Deaconess Medical Center, and is a visiting researcher at Google.

In his keynote speech, Adam Rodman, MD, MPH, discussed how AI is rapidly moving from the margins of medicine into the center of clinical decision making, posing stark questions about how much authority physicians may have to give up to optimize AI-driven decision support systems that are more accurate than the doctors are.
Rodman pointed to the different characteristics of the Star Wars robots R2-D2 and C-3PO to illustrate the differences between “in the loop” and “on the loop” methods of integrating AI systems and clinicians into the decision support process.

Rodman challenged conventional thinking around AI in healthcare, particularly the assumption that human collaboration—or “humans in the loop”—is inherently better than systems where humans provide oversight from a distance (“humans on the loop”). He argued that human-in-the-loop models can lead to cognitive deskilling, inefficiency, and poor scalability in clinical settings. At the other extreme, fully autonomous AI systems—where humans are entirely out of the loop—raise dystopian concerns reminiscent of science fiction, and are clearly not the solution either. Instead, he advocated for a “human on the loop” approach: allowing AI to perform tasks it excels at, while humans retain supervisory control.

He acknowledged that realizing this vision comes with significant socio-technical challenges, including the absence of a robust regulatory framework and unclear payment models. Importantly, he reminded the audience that while AI is powerful, it is not magic—its success depends on thoughtful integration, oversight, and trust.

Letting AI Run by Itself

“When I ran two randomized controlled trials,” he explained, “I started getting hate mail attacks in my office because I showed again what every single researcher in this field has shown for decades, which is that just giving a human an AI system doesn’t inherently improve that human’s performance, and it doesn’t improve their performance as much as if the AI system had just run by itself. We’ve now seen this in diagnostic reasoning, management reasoning, and in breast, radiology, and chest X-ray areas. We see it in field after field.”

“Generative AI is different because it can have interactions with patients. About two years ago, Google developed a system called the Articulate Medical Intelligence Explorer (AMIE) that talks to patients,” Rodman continued. “This is an evaluation of standardized patients. What they found is that in standardized patients, when it’s blinded between doctors and the AI system, the AI system has equivalent or better diagnostic accuracy. And it’s basically rated better by the patients in every domain.”

As an example of a health system that is already using AI to interview patients and prepare a summary and initial treatment recommendations for physician review, Rodman cited the K Health system at Cedars-Sinai in Los Angeles. It collects patient symptoms, asks follow-up questions, and cross-references the input with electronic health records and data from other cases.

AI Beats Cedars-Sinai Physicians

A study published earlier this year in the Annals of Internal Medicine compared the Cedars-Sinai AI treatment recommendations against those of physicians and found that 77 percent of AI recommendations were rated as optimal, while only 67 percent of physician decisions were rated optimal.

“This is not science fiction,” said Rodman. “We are deploying these patient-facing AI systems. The performance paradox is a problem. Scalability is a problem, but the health care system is in a lot of trouble because there are a lot of sick people and not enough doctors.”

The performance paradox refers to the fact that success in developing AI components for clinical support does not guarantee improved real-world care. The complex interplay of system integration, compatibility, and clinician interaction is pivotal to ensuring AI achieves its potential benefits in health care settings.

One of the most famous examples of the performance paradox was the 2009 Air France 447 disaster when the pilots overruled the autopilot that was telling them to level out the Airbus A330. They kept climbing until the plane stalled and crashed into the ocean, killing everyone onboard.

Overruling the Algorithm

“We see this over and over again,” said Rodman. “In quantitative trading, whenever the traders overrule the algorithm, they actually do worse. We’ve now seen this in chess and Go board games, sentencing support, weather forecasting, and industrial quality control.”

“One of the big challenges and worries I have is that we might end up with this really powerful technology that doesn’t make the difference that we want, because we’re not able to move from human in the loop to human on the loop,” said Rodman. “What is the biggest problem? Liability. Now, it’s all with the individual provider. There is not currently a system to shift liability. And because of that, companies are not making investments in this technology. It requires partnerships that health systems don’t really allow for a tech company now. But you cannot build a system that goes from human in the loop to human on the loop because you must deploy a human in the loop system first and then iterate to safely get it to human on the loop.”

Behind the Curtain

“As I said earlier, tech is remarkably powerful, but it is not magic,” said Rodman. “And just because you have a general-purpose diagnostic or measurement system that can outperform humans doesn’t mean it’s going to make anything better, because the challenges are not just technical, they’re sociotechnical. They involve the way that we deliver care. And this is something that I think the people in this symposium and this specialty are very well positioned to figure out. That’s going to take years—a decade of very, very careful and deliberate work and advocacy to make this happen knowing there is no tech miracle. There is no superintelligence hidden behind the curtain like in ‘The Wizard of Oz,’ right? In order to make this happen, we have to do the work.”

These are some other scenes from the symposium:

Raina Merchant, MD, MSHP, LDI Senior Fellow and Vice President and Chief Transformation Officer, University of Pennsylvania Health System, noted: “Nine years after Penn Medicine created the first behavioral science ‘nudge’ unit inside a health system, nudges have become critical to our core business of improving outcomes, reducing burdens, and transforming health care delivery.”
The two-day event’s presenters discussed the latest insights on nudge methodologies that could shift the behavior of patients and clinicians in ways that improve outcomes. LDI Senior Fellow and Nudge Unit Director M. Kit Delgado, MD, MS, pointed out that one of the unit’s recent projects, which changed EHR defaults, led to a 20-percentage-point increase in 90-day prescriptions for statins, a practice that has been shown to increase adherence and decrease mortality.
Keynote speaker Katherine Milkman, PhD, LDI Senior Fellow, Wharton School Professor, and Co-Director of the Behavior Change for Good Initiative, discussed why large-scale mega studies — coordinated investigations testing many interventions simultaneously — are essential for accelerating learning, improving outcomes, and avoiding bias in evidence.
In the “Nudging Patients to Better Outcomes” panel, Kimberly Waddell, PhD, MSCI, LDI Senior Fellow and Perelman School Assistant Professor, and panelists from Permanente Medical Group, Geisinger, Penn Medicine, UCLA, and Ascension presented large-scale RCTs of interventions to engage patients. Some studies included sample sizes in the millions of patients.
LDI Senior Fellow Kimberly Waddell presents a study showing that pended orders and patient text-messaging nudges significantly increased mammogram completion. Other panelists shared their work on testing human-versus-electronic prompts, multichannel EHR/email nudges for safer diabetes management, and real-time prescription benefit alerts that track patient-specific drug costs.
LDI Senior Fellow Meeta Kerlin, MD, MSCE, an Associate Professor of Pulmonary, Allergy, and Critical Care at the Perelman School, presented on a clinical nudge designed to promote utilization of low tidal volume ventilation (LTVV). LTVV refers to lowering the amount of air provided to minimize injury from lung stretching during mechanical ventilation.
Sunita Desai, PhD, a former Associate LDI Fellow and now an Assistant Professor of Population Health at NYU Langone Health, discussed how a real-time prescription benefits (RTPB) nudge led to reductions in patient out-of-pocket costs for ordered medications, with as much as a 40 percent reduction for high-cost drug classes.
The AI/EHR panel, moderated by LDI Senior Fellow and Penn Medicine Chief Health Information Officer, Srinath Adusumalli, MD, MSHP, emphasized that embedding AI into EHRs like Epic requires balancing innovation with safety and trust while leveraging new platforms for predictive care, ensuring transparency and customizability for clinicians, and addressing alert fatigue and bias.
Moderator Adusumalli emphasized the critical questions raised by the rapid integration of AI into clinical decision support. “How do we assess the safety, utility, and effectiveness of these tools? What standards do we compare them to? How do we best design our user interactions with them? What are the medical, legal, regulatory, and compliance implications, and how should the care delivered with them be reimbursed?” he asked.
Panelist Carissa Kathuria, BSE, R&D Group Lead at Epic Systems Corporation, described the company’s massive new community-based data platform, Cosmos, which contains 200 million de-identified patient records. It is being paired with new architecture to create predictive models that can anticipate a patient’s “next event,” much the same as large language models anticipate the next word.
The Symposium’s poster sessions featured 48 posters across a broad mix of nudge-related studies ranging from “Transforming Hypertension Management in Primary Care,” and “Relief Is Just a Text Message Away: Gastrointestinal Symptom Check-In Post Living Liver Donation,” to “The Experience of African American Adults with Low Health Literacy When Accessing Health Care,” and “Reducing Patient Falls in Community Radiology.”
Ashley West, PhD, Director of Behavioral Design at Lirio (left), discusses her poster, “Designing an ‘At Home’ Digital Health Intervention for Supporting Chronic Condition Lifestyle Management,” with LDI Senior Fellow Renée Betancourt, MD, an Associate Professor of Clinical Family Medicine and Community Health at the Perelman School of Medicine.

Author

Hoag Levins

Editor, Digital Publications

