In this 3-minute, 20-second video, LDI Senior Fellow Hamsa Bastani explains how her team calculated that Pennsylvania’s COVID-19 positive test rate is actually about 60% higher than is publicly reported.

One of the biggest impediments to gaining more effective control of the U.S. COVID-19 pandemic is the lack of the accurate test data needed to map out the exact density and spread of the disease in any given area.

This is important because infection and location data are key to informing effective pandemic control decision making. For instance, good data defines where social distancing restrictions need to be increased or can be lessened. Accurate test data is also crucial to evaluating various kinds of potential interventions designed to slow disease spread.

Hamsa Bastani, PhD

COVID data void

There are a number of reasons for the chronic COVID test data void in the U.S., including insufficient testing capabilities, long lab processing delays, and lack of geographic access to test-gathering facilities. Another is the fact that most of the limited amount of testing conducted so far has been focused on symptomatic people. Overall, these barriers have “censored” — or failed to quantify — the true number and location of people who have, and can spread, the highly contagious disease.

In its COVID-19 Rapid-Response research project, LDI Senior Fellow Hamsa Bastani’s team focused on the use of proxy methods to generate more accurate infection estimates. The research was funded by one of thirteen LDI COVID-19 Rapid Response Grants awarded in early May.

Methods of ‘uncensoring’

The project tested methods for “uncensoring,” or more accurately estimating the real rate of infection, by computing an adjusted estimate based on U.S. demographic data matched against the findings in high-testing overseas regions, like South Korea. South Korea has achieved substantial control over the rise of infections in its population without overwhelming its testing capacity. So, the expectation is that South Korea’s reported number of infections across all demographic groups is relatively accurate.

“With these assumptions, we can uncensor the infection rate by computing the rate at which people in the U.S. are infected compared to South Korea,” explained the researchers.
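
The comparison the researchers describe can be illustrated with a deliberately simplified sketch. It is not the team's actual model. It assumes, purely for illustration, that one "anchor" demographic group is fully observed in the target region, and that the ratio of target to reference infection rates measured in that group holds for every other group. The group names, populations, case counts, and rates below are all made up, and `Group` and `uncensor` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Group:
    name: str
    population: int        # people in this group in the target region
    reported_cases: int    # censored count reported in the target region
    reference_rate: float  # cases per person in the high-testing reference region


def uncensor(groups: list[Group], anchor: str) -> dict[str, float]:
    """Scale reference-region rates to the target region using one anchor group.

    ASSUMPTION (not from the article): the anchor group is fully observed in
    the target region, and the target/reference rate ratio is the same in
    every demographic group.
    """
    a = next(g for g in groups if g.name == anchor)
    # Relative infection rate, target vs. reference, measured in the anchor group.
    ratio = (a.reported_cases / a.population) / a.reference_rate
    estimates = {}
    for g in groups:
        expected = ratio * g.reference_rate * g.population
        # Never estimate fewer cases than were actually reported.
        estimates[g.name] = max(float(g.reported_cases), expected)
    return estimates


# Made-up numbers, for illustration only.
groups = [
    Group("0-39", 1_000_000, 500, 0.0002),
    Group("40-69", 800_000, 1_200, 0.0002),
    Group("70+", 200_000, 800, 0.0004),
]
estimates = uncensor(groups, anchor="70+")
reported_total = sum(g.reported_cases for g in groups)
estimated_total = sum(estimates.values())
print(f"reported={reported_total}, estimated={estimated_total:.0f}")
```

In this toy setup the younger groups, where reported counts fall well below what the reference region's pattern would predict, account for most of the upward adjustment.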

Using this approach, the researchers were able to calculate the “censored” rate of infection in Pennsylvania and then readjust it to its accurate rate. Applying their statistical adjustment method to the more than 74,000 Pennsylvania COVID cases reported since the start of the pandemic up to June 7, they calculated that the accurate number of cases in the state was 117,453, or about 60% higher than the publicly reported number.
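
The relationship between the adjusted count and the "60% higher" figure is simple arithmetic, sketched below. The only number taken from the article is the adjusted count of 117,453. Because "60%" is a rounded figure, the reported count the sketch backs out is only what an exact 60% uplift would imply, not the state's official tally. The function names are illustrative.

```python
def percent_higher(adjusted: float, reported: float) -> float:
    """Return how much larger `adjusted` is than `reported`, as a percentage."""
    if reported <= 0:
        raise ValueError("reported must be positive")
    return (adjusted - reported) / reported * 100.0


def implied_reported(adjusted: float, pct_higher: float) -> float:
    """Back out the reported count implied by an adjusted count and an uplift."""
    return adjusted / (1.0 + pct_higher / 100.0)


ADJUSTED_CASES = 117_453  # adjusted Pennsylvania estimate quoted in the article

# Reported count that an exact 60% uplift would imply (an approximation,
# since the article's 60% is rounded).
implied = implied_reported(ADJUSTED_CASES, 60.0)
print(f"implied reported cases: {implied:,.0f}")
print(f"uplift check: {percent_higher(ADJUSTED_CASES, implied):.1f}%")
```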