Kevin Volpp, MD, PhD, welcomes attendees to the University of Pennsylvania Center for Health Incentives and Behavioral Economics’ (CHIBE) 15th Annual Roybal Retreat. Volpp, an LDI Senior Fellow and Director of CHIBE, is also a professor at both the Perelman School of Medicine and the Wharton School. The Oct. 13-14 Retreat drew ten dozen behavioral scientists to Skytop Lodge in the Poconos. (Photos: Hoag Levins)
In a keynote presentation at the 15th annual University of Pennsylvania Center for Health Incentives and Behavioral Economics (CHIBE) Roybal Retreat, Harvard’s Elizabeth Linos, PhD, encouraged behavioral scientists to embrace the idea that AI systems can enable the creation of new kinds of tools that make evidence more usable for policymakers, thus addressing one of the biggest gaps between academic research and real-world implementation.
The annual two-day CHIBE retreat enables scientists from Penn, as well as affiliated investigators from other schools, to share the latest developments and findings of their ongoing behavioral science research. CHIBE is one of 15 Roybal Centers of Excellence across the country funded by the National Institute on Aging’s (NIA) Division of Behavioral & Social Research.
Elizabeth Linos, PhD
Linos, an Associate Professor of Public Policy & Management at the Harvard Kennedy School and Faculty Director of the People Lab, is a researcher focused on the intersection of public management, behavioral science, and evidence-based policymaking. Among her recent papers is “Bottlenecks for Evidence Adoption” in the Journal of Political Economy.
She noted that while researchers prioritize methodological rigor — like statistical significance or randomized controlled trial (RCT) design — policymakers often care more about cost, feasibility, political risk, and implementation details, which are rarely emphasized in academic papers. She argued that bridging this gap requires not just presenting better data but using AI to build systems that guide policymakers through what they don’t know and what they need to consider.
As an example, she described “Policy Bot,” a prototype in development at Harvard Kennedy School that draws from evidence clearinghouses and uses AI to act as both an assistant and a coach. Beyond summarizing studies, it prompts policymakers to weigh factors like cost-effectiveness and replicability, helping them translate evidence into their own contexts.
Linos emphasized that the challenge isn’t just sharing evidence — it’s ensuring evidence is adopted at scale. She highlighted research showing that statistically strong findings alone don’t predict real-world uptake; simple, incremental interventions are far more likely to be adopted than large, complex changes.
She concluded by calling for a new research frontier: testing which kinds of AI-enabled, evidence-based decision-support tools work best at earning policymakers’ trust and actual use.
“Part of making evidence useful for policymakers is making sure that we’re actually answering a question that a policymaker needs an answer to,” said Linos. “Oftentimes that’s not really where we start our research or our science. But it turns out if you take that seriously, not only would you change the questions you’re trying to answer, but you’d also think seriously about what is the kind of information that a policymaker might need or value to make sure that your big idea is actually useful to them. We need a clearer, more streamlined cycle that brings evidence into the hands of policymakers and then policymaker values and priorities into the hands of academia as well.”
Here’s a short description of some of the other speakers and activities:
Alison Buttenheim, PhD, MBA, LDI Senior Fellow, Scientific Director at CHIBE, and Professor in both Penn Nursing and the Perelman School of Medicine, announced the launch of a new Design Core that brings design strategy to clinical and behavioral research. “Design strategy” is a way of developing new health or behavior programs by treating them as something that needs to be carefully designed, not just tested. Instead of starting with a fixed idea and running an experiment, researchers work closely with the people who will actually use the program, build early versions, get feedback, and keep improving it until the program fits real-world needs.
Shivan Mehta, MD, MSHP, LDI Senior Fellow, Director of the Population Health Lab at the Perelman School and Associate Chief Innovation Officer at CHIBE, discussed his recent study published in BMJ Open of the BE-IMMUNE clinical trial aimed at increasing influenza vaccination. The trial directed multicomponent nudge interventions at both patients and their clinicians, leveraging the interplay between the two in the decision-making process around immunization. The work established a blueprint for a visit-based approach to vaccination promotion that may be translated to other adult vaccinations.
Alexander Fanaroff, MD, MHS, LDI Senior Fellow and Assistant Professor of Medicine at the Perelman School, gave a research talk about his ITERATE study published in the American Heart Journal in August. It was a series of four randomized controlled trials testing messaging strategies designed to increase racial and ethnic diversity in clinical research enrollment. It found that the outreach methods studied were inexpensive and highly scalable, and could yield larger effects when deployed across thousands of potential participants.
Kimberly Waddell’s research talk was about her study published in Contemporary Clinical Trials in January of this year focused on using electronic health record nudges to increase screening for breast cancer. An LDI Senior Fellow, Waddell, PhD, MSCI, is an Assistant Professor of Physical Medicine and Rehabilitation at the Perelman School. The work was motivated by the fact that although mammograms are the single most effective strategy for early detection and treatment of breast cancer, U.S. mammogram rates are below national targets, with persistent disparities by education, income, race, and ethnicity. The trial used visit-level, multicomponent personalized nudges to increase screening rates.
Mary E. Putt, PhD, ScD, described a recently completed Food is Medicine study that tested the effectiveness of subsidies in increasing fruit and vegetable purchasing on Instacart by Penn Medicine patients with diabetes. In addition, the study tested the impact of “salience reminders,” weekly texts showing participants the remaining balance they would lose if it went unspent; the impact of “choice architecture,” in which fruits and vegetables appeared first when ordering; and the two interventions combined.
LDI Senior Fellow Rebecca Hamm, MD, MSCE, opened her presentation by noting the racial and ethnic disparities that are a continuing issue in obstetrics. Her project developed digital dashboards that make disparities visible at the individual clinician level, prompting self-reflection about potential biases. Hamm is an Assistant Professor of Obstetrics and Gynecology at the Perelman School and Co-Director of AMETHIST, a hub of the NIH’s national maternal health and pregnancy outcomes initiative.
The “Personalizing Behavioral Interventions: Opportunities and Challenges” panel brought together four top experts in the field: (l to r) Mohan Balachandran, MA, MS, Chief Operating Officer of the CHIBE Way to Health program and Corporate Director at Penn Medicine; Shivan Mehta, MD, MSHP, LDI Senior Fellow and Associate Chief Innovation Officer at CHIBE; Srinath Adusumalli, MD, MSHP, LDI Senior Fellow and Vice President and Chief Health Information Officer at Penn Medicine; and Hamsa Bastani, PhD, LDI Senior Fellow and Associate Professor and Co-Director of the Wharton Health Care Analytics Lab.
Panel moderator Hannah Maynard, MPH, CHIBE Project Manager, guided the discussion through a broad range of intervention strategies and audience questions. A clear consensus emerged that the early success of many text-messaging programs designed to nudge patients toward healthier behaviors has led to oversaturation of that channel, diminishing its effectiveness. Panelists also highlighted three additional challenges: balancing equity with efficiency without creating perceptions of unfairness; developing meaningful ways to measure the long-term impact of personalization efforts; and weighing the risks of responsible AI use against the potentially greater harm of leaving patients to seek unreliable sources on their own.
Athena Lee, MA, a Clinical Research Coordinator at the Nudge Unit, explains her poster project, “Advancing Firearm Injury Prevention Through Community Engagement and Evidence-Based Device Distribution.”
Meryl McLean, Clinical Research Assistant in the Penn Medicine Nudge Unit, explains her poster paper, “Pragmatic Randomized Trial of Text-Messaging Telehealth and Contingency Management for Opioid Use Disorder Treatment Engagement: Preliminary Enrollment Data.”
Ritikaa Khana, Research Coordinator and Pre-Doc at the Opportunity for Health Lab (OfH) at the Perelman School of Medicine, with her poster of a project analyzing the recent decline in U.S. life expectancy.
In the not-quite-flat poster category is “A Pilot Randomized Controlled Trial of Food Policies to Promote Healthier Choices at Restaurants,” by co-authors Aaliyah Randall, MA, Clinical Research Coordinator at the Penn PEACH Lab; Juliana Catania, MPH, CHIBE Project Manager; and Eva Fabian, MPH, a Program Manager at Penn PEACH and Opportunity for Health Labs.
Alison Buttenheim ran a workshop whose participants were asked to identify the riskiest assumptions about AI. Among the messages on their sticky notes were: “We’re cooked.” “AI limits creativity and the opportunity for teams to brainstorm together.” “AI will cause cognitive deskilling.” “AI is affecting education, which will produce less useful workers.” “AI will worsen equity issues.” “AI will never be regulated.” “More AI will cause us many, if not more, problems than it solves.” “People will lose their jobs.” “Fears of massive job loss with AI are overblown.” “AI hallucinations impact researchers and the public.”
Closing out the retreat was LDI Senior Fellow M. Kit Delgado, MD, MS, Director of the Penn Medicine Nudge Unit and Associate Professor of Emergency Medicine and Epidemiology. “This has been a great retreat and there’s been a lot of talk of the excitement and concern around AI. Someone said ‘we’re cooked.’ But we need to not forget the fundamentals. As someone that does driving research, I remember all the hype around autonomous cars and the predictions that we’d have them by now. But there was a last mile problem. I think there are a lot of parallels in that and what we’re seeing as AI deploys in medical settings. Is this a bright, shiny new tool or is it often a solution looking for a problem? I would say we should all go out there, do our behavioral diagnostics, test our assumptions, and prescribe the right solution.”
Attendees at the 2025 University of Pennsylvania Center for Health Incentives and Behavioral Economics (CHIBE) Roybal Behavioral Science Retreat at Skytop.