Why are COVID-19 Data so Inconsistent?

Today, June 5, the United States reached the 110,000 COVID-19-related death mark. As a policy analyst and researcher, I follow the data very closely. In fact, I am in Day 80 of tracking new daily cases, new daily deaths, and total deaths for the virus, which I then repurpose into charts on Facebook and Twitter.

The frustrating part is that I’ve found the numbers erratic, leaving me with some concern about data validity and reliability. If we can’t trust the numbers, how can we trust decisions based on the numbers?

Four months into the pandemic, at least from a US perspective, everyone wants to get back to normal, whatever that looks like in the future. Many states have entered “Phase II” of reentry, allowing larger groups of people to get together and restaurants to begin operating on a more normal basis. Of course, the recent riots across the country, and the world, will put COVID-19 to the real test.

The fight to re-open versus the right to save lives is playing out daily in communities across the country. Most of us understand that the reopening process will take time and that there has to be a balance between safety and normalcy. It is understandable that some people are frustrated by isolation when their local numbers are so low, but it is just as understandable that others are trying to play by societal rules to help flatten and reduce the number of cases and deaths from this tragic virus.

My analysis has found extreme inconsistencies in the data. For instance, the data illustrate that fewer people die on a Sunday than any other day of the week, and most people succumb to COVID-19 on a Tuesday or Wednesday. The problem is that COVID-19, like all viruses, does not work in a democratic manner; it does not choose what day to kill people. With the sheer numbers across the country, we should be experiencing exceedingly stable trends from day to day with only nominal changes one way or another. The changes I’m seeing aren’t nominal. Sunday counts have been between 17 and 48 percent lower than those from the previous day. In the first graphic below, the red bars represent Sundays and the yellow bars Tuesdays and Thursdays. The second chart merely makes it easier to see the daily trend from Sundays forward. The volatility is enormous and the stability of the trends week by week interesting, if not perplexing.

[Chart: new daily deaths, with Sundays shown as red bars and Tuesdays and Thursdays as yellow bars]
[Chart: the daily trend from each Sunday forward]
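
For readers who want to check the Sunday effect against their own tracking sheet, the comparison above can be sketched in a few lines of Python. This is a minimal sketch, not the spreadsheet method used for the charts: the function name `sunday_drops` is hypothetical, and the sample counts are made up purely to show the arithmetic.

```python
from datetime import date, timedelta


def sunday_drops(daily_deaths):
    """Return {sunday: percent drop relative to the preceding Saturday}.

    daily_deaths maps a datetime.date to that day's reported new deaths.
    """
    drops = {}
    for day, count in daily_deaths.items():
        if day.weekday() != 6:  # date.weekday(): Monday is 0, Sunday is 6
            continue
        saturday_count = daily_deaths.get(day - timedelta(days=1))
        if saturday_count:  # skip Sundays with a missing or zero Saturday
            drops[day] = round(100 * (saturday_count - count) / saturday_count, 1)
    return drops


# Made-up counts for illustration only; these are NOT real reported figures.
start = date(2020, 5, 25)  # a Monday
counts = [500, 700, 1500, 1200, 1200, 1000, 600]  # Monday through Sunday
sample = {start + timedelta(days=i): c for i, c in enumerate(counts)}

print(sunday_drops(sample))
```

With real daily counts in place of the sample, a drop that shows up every Sunday at roughly the same size points to a reporting pattern rather than to the virus itself.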

There are other inconsistencies. A few weeks ago we heard about the data inconsistencies in Georgia, which had announced a decrease in its numbers after opening up. The Georgia Department of Public Health posted a chart illustrating the daily declines in COVID-19 cases. However, the department sorted the data from highest to lowest, which made it look like cases were decreasing when in fact they were not. Those data have since been removed from its website.

Similarly confusing are the projections from the University of Washington, which only six weeks ago reported that we would hit 74,000 deaths by August. At the time, my analysis suggested that we would hit that number by the first week in May. Spoiler: my timeline was correct. It should bother everyone that my relatively simple tracking is out-forecasting a center with rich experience in this type of thing. Adding some insult to injury, UW’s School of Pharmacy, a different department of the University of Washington, came out 10 days after the above report with an estimate of between 350,000 and 1.2 million deaths in the US by the end of 2020. That’s quite a spread.

The chart below illustrates the ratio of new deaths to new cases on a daily basis. Note the variability of the ratio from day to day. Some recent days are as low as 2.6 percent, equivalent to one death for every 38 cases, and as high as 9.9 percent (one death for every 10 cases). This irregularity illustrates the troublesome data. For ratio calculations, I surmise that the cases are more problematic than the deaths.
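
The arithmetic behind that ratio is simple enough to sketch. The snippet below is an illustration only: `death_case_ratio` is a hypothetical helper name, and the input counts are invented so that they reproduce the two percentages quoted above.

```python
def death_case_ratio(new_deaths, new_cases):
    """Return (deaths as a percent of cases, cases per death) for one day."""
    if new_deaths <= 0 or new_cases <= 0:
        raise ValueError("both counts must be positive")
    percent = round(100 * new_deaths / new_cases, 1)
    cases_per_death = round(new_cases / new_deaths)
    return percent, cases_per_death


# Invented counts chosen only to match the percentages in the text.
print(death_case_ratio(1, 38))     # low end: about 2.6 percent
print(death_case_ratio(99, 1000))  # high end: about 9.9 percent
```

Because the ratio divides one noisy daily count by another, a reporting delay in either series is enough to swing it sharply from one day to the next.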

We should be very concerned about the systems in place to track COVID-19, which in turn determine public policy as well as inform individual behavior. A system that is robust and runs in parallel across the country would be consistent to a fault. But that’s not where we are. These inconsistencies, and especially regular inconsistencies, tell us that the counting and delivery of data aren’t what they need to be for something of this grave importance. My bet is that numbers from Sunday and Monday get pushed to Tuesday and Wednesday. Is this due to staffing configurations? I’m not sure, but it is an egregious error for the agencies in charge of collecting this information. By this point in the pandemic, there should be extraordinarily rigid and rigorous methods of logging data at the federal, state, and local levels. There is no reasonable excuse for not having them.

For the nation to move forward, we need to be confident in the data from every area of the country. We should feel good that the latest numbers in New York illustrate a large downward trend, but concerned that other states, like Texas, are perking up.

The federal government, including the NIH and CDC, needs to step up and take a look at these data trends and provide better models for looking into the future instead of relying on models that have been proven unworthy. With better data will come more prudent decisions.

Watson Scott Swail is the President and Senior Research Scientist with the Educational Policy Institute in Virginia Beach, Virginia.
