Atul Gupta, PhD, is an Assistant Professor of Health Care Management at the Wharton School, and an LDI Senior Fellow.

Why do studies on the same topic reach different conclusions? In my course on health care research methods, we explore the different choices researchers make along the way and how those choices lead to different results. Recently, studies have reached different conclusions about the impact on mortality of Medicare’s Hospital Readmissions Reduction Program [see box below], and one of those studies is mine. As I prepare for teaching, I’ve made some notes that I thought I’d share, as I understand there has been considerable online chatter on this topic recently. Don’t hesitate to reach out if you have comments or suggestions.

Since 2012, the federal government has penalized hospitals about $500 million each year under the Hospital Readmissions Reduction Program (HRRP). It is one of the largest performance pay programs in US health care and has attracted substantial attention from hospitals, policymakers, and researchers. An important concern with performance pay design is the potential for unintended consequences, particularly any that might harm patients.

About the Hospital Readmissions Reduction Program

The Affordable Care Act (ACA) authorized the federal government to penalize hospitals for poor readmission rates for Medicare patients with certain conditions, under a performance pay program known as the Hospital Readmissions Reduction Program (HRRP). The penalty formula and other details were announced in August 2011, and the first penalty was imposed in October 2012. The program imposes high-powered incentives on hospitals to reduce their 30-day readmission rates for specific targeted conditions, beginning with acute myocardial infarction (AMI, or heart attack), heart failure, and pneumonia. A 1% increase in baseline readmission rate beyond the penalty threshold increases the penalty by 1% of the hospital’s revenue from the targeted condition. Since average reimbursement across these three conditions was about $10,000 per stay, this implies that a single readmission may attract a penalty approximately 5-6 times the mean reimbursement for a hospital stay for the targeted condition.
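
To see where a multiplier of that size comes from, consider a stylized back-of-the-envelope calculation, reading the 1%-for-1% rule as a relative (percent) change in the readmission rate. The hospital size and baseline readmission rate below are illustrative assumptions, not figures from either paper; only the roughly $10,000 average reimbursement comes from the text above.

```python
# Stylized back-of-envelope: marginal penalty from one extra readmission.
# n_stays and base_rate are illustrative assumptions, not data.

n_stays = 200          # annual index stays for the targeted condition (assumed)
avg_payment = 10_000   # mean Medicare reimbursement per stay (from the text)
base_rate = 0.20       # baseline 30-day readmission rate (assumed)

condition_revenue = n_stays * avg_payment            # $2,000,000

# One extra readmission raises the readmission rate by 1/n_stays in levels,
# i.e., by (1/n_stays)/base_rate in relative (percent) terms.
relative_increase = (1 / n_stays) / base_rate        # 2.5% relative increase

# Per the 1%-for-1% rule, the penalty rises by that percentage of the
# hospital's revenue from the targeted condition.
marginal_penalty = relative_increase * condition_revenue   # $50,000

print(marginal_penalty / avg_payment)  # ~5x the mean reimbursement per stay
```

Note that the multiplier works out to roughly one over the baseline readmission rate, independent of hospital size, so baseline rates in the 17-20% range yield a penalty of 5-6 times the mean reimbursement per stay.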

A recent JAMA article by Wadhera and colleagues has heightened this concern. They investigated trends in 30-day mortality for patients hospitalized with conditions targeted by HRRP: acute myocardial infarction (AMI), heart failure, and pneumonia. Their empirical strategy is to examine changes in mean mortality rates across different periods and test whether these changes are statistically distinguishable. They exploit the timing of HRRP’s announcement (March 2010, as part of the ACA) and the implementation of its penalties (October 2012) to define four time periods: two prior to announcement, one from announcement to implementation, and one post-implementation. Accordingly, their data (Medicare claims on all hospital stays) span 2005 through 2015. A key aspect of their research design is that they aim to quantify changes in aggregate mortality rates across all patients, rather than testing whether performance has differentially changed at hospitals targeted by HRRP penalties (which was the goal of the program). The authors also clarify that this approach does not recover causal effects of the program.
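
In sketch form, their design reduces to slicing the claims into policy periods, computing mean mortality in each, and testing the differences. A minimal illustration follows, with a hypothetical file and column names and an assumed split date for the two pre-announcement periods; it is not their actual code or specification.

```python
import pandas as pd
import statsmodels.formula.api as smf

# One row per hospitalization, with hypothetical columns:
#   died30 (1 if the patient died within 30 days) and admit_date.
df = pd.read_csv("stays.csv", parse_dates=["admit_date"])

# Slice 2005-2015 into the four policy periods. The announcement (ACA
# signing, March 2010) and implementation (October 2012) dates come from
# the text; the mid-2007 split between the two pre-announcement periods
# is an assumption for illustration.
edges = pd.to_datetime(["2005-01-01", "2007-07-01", "2010-03-23",
                        "2012-10-01", "2015-12-31"])
labels = ["pre_1", "pre_2", "announce_to_implement", "post_implement"]
df["period"] = pd.cut(df["admit_date"], bins=edges, labels=labels)

# Mean mortality by period, then a regression on period dummies so the
# cross-period differences come with standard errors.
print(df.groupby("period")["died30"].mean())
print(smf.ols("died30 ~ C(period)", data=df).fit(cov_type="HC1").summary())
```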

Their ‘headline’ result is that HRRP announcement and implementation are associated with statistically significant increases in mortality for heart failure and pneumonia patients, while for AMI patients, announcement is associated with a decline in mortality and implementation with no change. This result has attracted significant media attention, since it suggests that high-powered incentives to decrease hospital readmissions may have inadvertently increased patient mortality. This would be a serious unintended consequence of a program introduced to improve quality of care. Observers pointed out that the results of this study contradict those of parallel studies, including one by Khera and colleagues and a mandated MedPAC analysis. In particular, their conclusions contradict my own results, presented in an as-yet-unpublished paper currently being revised for re-submission. Naturally, these conflicting results generated some confusion and commentary. To help alleviate these concerns, I briefly describe my study in this note and discuss some factors that may be driving the differences in results.

The goal of my study was to quantify hospital responses to the performance pay incentives imposed by HRRP and the resulting effects on patient health. HRRP provides a useful ‘natural experiment’ since about 50-60% of eligible hospitals are penalized each year across the three conditions, while the remaining hospitals receive neither a penalty nor a bonus payment. Accordingly, a reasonable empirical approach would compare trends in various outcomes at penalized vs. non-penalized hospitals, before and after the 2011 announcement of HRRP’s penalty details. This approach uses non-penalized hospitals as counterfactuals (i.e., it assumes they provide a useful approximation of the trend penalized hospitals would have followed absent the penalty). This is a well-established research design in economics, known as difference-in-differences.
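
For concreteness, here is a minimal sketch of that naive difference-in-differences comparison, with a hypothetical file and variable names; as discussed next, my paper does not use this specification, and replaces the binary penalty indicator with a continuous penalty dose.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hospital-year panel with hypothetical columns: hospital (id), year,
# outcome (e.g., 30-day mortality rate), and penalized (1 if the
# hospital was penalized under HRRP).
panel = pd.read_csv("hospital_year.csv")

# post = 1 for years after the August 2011 announcement of penalty
# details (assuming annual data).
panel["post"] = (panel["year"] >= 2012).astype(int)

# Classic two-group DiD: the coefficient on penalized:post is the
# estimate. Hospital and year fixed effects absorb level differences
# across hospitals and shocks common to all hospitals in a year.
did = smf.ols("outcome ~ penalized:post + C(hospital) + C(year)",
              data=panel).fit(cov_type="cluster",
                              cov_kwds={"groups": panel["hospital"]})
print(did.params["penalized:post"])
```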

However, this approach is too simplistic and likely to generate biased estimates, for two reasons. First, there is no true ‘control’ group in this setting: all hospitals are eligible for penalties, and hence may change their protocols and staffing in anticipation of a penalty even in years when they are not penalized. I overcome this limitation by computing each hospital’s expected probability of being penalized based on past performance, and using these predicted probabilities, rather than actual penalty status, as a continuously varying penalty ‘dose’ across hospitals. Second, using penalty probability directly in regression analysis could overestimate the effects of HRRP because of mean reversion (i.e., hospitals may have been penalized due to an unfortunately timed temporary upswing in their readmission rate above their ‘true’ mean just as the first penalty rates were determined). When these hospitals revert to their lower true mean in the future, it will appear as if the penalty motivated them to improve. To overcome this concern, I use an instrumental variables approach, relying on variation in hospital quality and patient demographics in 2006-07, before the ACA, to generate variation in penalty probability under HRRP. This approach assumes that true hospital quality is stable over time, so that using historical characteristics to predict penalty probability eliminates the role of temporary swings. It has the added benefit of eliminating other possible sources of measurement error in computing hospital expectations. I apply this approach to estimate the effects of HRRP on various outcomes, including mortality.
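
A stripped-down rendering of that two-step logic, with hypothetical variable names (my paper’s actual specification is richer), might look like this:

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hospital-level data with hypothetical columns:
#   d_mortality : change in 30-day mortality around the 2011 announcement
#   pen_prob    : predicted probability of an HRRP penalty (the 'dose')
#   z_quality, z_demog : 2006-07 hospital quality and patient
#       demographics, used as instruments (pre-ACA, so unaffected by
#       temporary swings around the penalty determination)
hosp = pd.read_csv("hospitals.csv")

# First stage: predict the penalty dose from pre-period characteristics.
first = smf.ols("pen_prob ~ z_quality + z_demog", data=hosp).fit()
hosp["pen_prob_hat"] = first.fittedvalues

# Second stage: effect of the instrumented penalty dose on mortality.
# Note: manual two-step 2SLS gives correct point estimates but
# understates standard errors; a packaged IV estimator would be used
# in practice.
second = smf.ols("d_mortality ~ pen_prob_hat", data=hosp).fit()
print(second.params["pen_prob_hat"])
```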

I compute 30-day and 1-year mortality including in-hospital deaths, since in-hospital deaths are a sizeable component of total mortality for these three targeted conditions (nearly 50% in the case of 30-day heart attack mortality). Another important benefit is that this measure is robust to changes in hospital discharge decisions in response to the penalty: for example, if hospitals improve inpatient care quality and reduce in-hospital mortality, the average patient discharged alive may be sicker, and post-discharge mortality may increase. I find an improvement in mortality for AMI patients, and no meaningful effect for heart failure and pneumonia patients. For AMI patients, the IV estimates indicate a decrease in 30-day and 1-year mortality of about 0.45 percentage points (significant at the 10% level) and 1.3 percentage points (significant at the 5% level), respectively. This represents a stable 2-3% decrease in mortality across durations.
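
As a sketch, one natural way to implement a mortality measure that includes in-hospital deaths is to anchor the follow-up window at admission rather than discharge (hypothetical column names, not my paper’s code):

```python
import pandas as pd

# One row per index hospitalization, with hypothetical columns:
#   admit_date, death_date (NaT if the patient is not observed to die).
stays = pd.read_csv("index_stays.csv",
                    parse_dates=["admit_date", "death_date"])

# Death within 30 days / 1 year of ADMISSION, which counts deaths that
# occur during the stay itself. Anchoring at admission keeps the measure
# robust to penalty-induced changes in discharge timing.
days = (stays["death_date"] - stays["admit_date"]).dt.days
stays["died_30d"] = (days <= 30).astype(int)
stays["died_1yr"] = (days <= 365).astype(int)
```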

Why have the studies produced such different results? There are several differences between the two studies, but two highlighted above may be key. The first is the research design: Wadhera and colleagues descriptively compare aggregate mortality across time periods for all patients, while I compare hospitals facing different penalty exposure in a quasi-experimental design intended to recover causal effects. The second is whether in-hospital deaths are included in the mortality measure (mine, anchored at admission, includes them, making it robust to penalty-induced changes in discharge decisions).

In addition to these important methodological differences, sample construction also differs. This is probably a minor issue, but it may also contribute to differences in results. My sample excludes Critical Access Hospitals (CAHs) and hospitals with fewer than 50 index cases over July 2008 to June 2011, the baseline period used to determine the first penalty rates. This was done to exclude smaller hospitals, where mortality and readmission rates fluctuate substantially from year to year; such hospitals are also excluded from HRRP penalties for the same reason. It is not clear whether Wadhera and colleagues exclude hospitals not subject to HRRP penalties, such as those located in Maryland, or critical access hospitals. As a side note, it would be interesting to see their analysis replicated on this subset of hospitals: to the extent the mortality changes are due to the HRRP, we should not find a similar increase at hospitals exempt from the penalty.
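
For completeness, the sample restrictions described above amount to a simple filter; a sketch with hypothetical column names:

```python
import pandas as pd

# One row per index hospitalization, with hypothetical columns:
#   hospital_id, admit_date, is_cah (1 for Critical Access Hospitals).
stays = pd.read_csv("index_stays.csv", parse_dates=["admit_date"])

# Count index cases per hospital over the July 2008-June 2011 baseline
# period used to determine the first penalty rates.
baseline = stays[(stays["admit_date"] >= "2008-07-01") &
                 (stays["admit_date"] < "2011-07-01")]
counts = baseline.groupby("hospital_id").size()

# Keep non-CAH hospitals with at least 50 baseline index cases.
keep = counts[counts >= 50].index
sample = stays[stays["hospital_id"].isin(keep) & (stays["is_cah"] == 0)]
```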

So what is the ‘final word’ on the effects of the HRRP? It will probably take more time and additional independent studies before a consensus truly forms; there is room for many studies on a policy issue as important as performance pay in health care. This episode illustrates one of the frustrations of empirical research, particularly observational studies: research design choices are consequential and can lead to differing conclusions. However, differing studies also play a critical role in generating informed debate and moving science forward. As researchers, it is important for us to analyze these differences and communicate them clearly to stakeholders.