The COVID Tracking Project is showing a significant drop in reported new COVID cases: the seven-day moving average peaked at about 213,000 new cases per day on December 17th and had fallen to about 178,000 new cases per day by December 29th.
Our daily update is published. States reported 1.2 million tests, 195k cases, 124,686 hospitalizations, and 3,283 deaths. Holiday reporting delays are still markedly affecting testing, case, and deaths figures. pic.twitter.com/oRB8RvvvBA
— The COVID Tracking Project (@COVID19Tracking) December 30, 2020
This is great news, if it is not a data quality problem.
There is a pretty good chance that this is a data quality problem, and a predictable one. We had a huge and well-predicted discontinuity event that is likely messing up the data: Christmas happened. Testing dropped by at least 15% in the same time period.
We see this notably in the North Carolina data. On 12/28, only 19,000 tests were reported back to the state, with 3,888 people reported as positive that day. On 12/21 (to control for day-of-the-week seasonality), 4,479 people were reported as newly positive on ~37,500 tests. Case counts went down significantly even as the positivity rate increased by a lot in North Carolina.
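To make the North Carolina comparison concrete, here is a minimal sketch of the arithmetic using only the figures quoted above. It assumes, as a simplification, that each newly positive person corresponds to one reported test, and it treats the approximate ~37,500 test count as exact; the function name `positivity` is just an illustrative helper.

```python
def positivity(positives: int, tests: int) -> float:
    """Share of reported tests that came back positive."""
    return positives / tests


# Figures quoted in the text; the 12/21 test count is approximate (~37,500).
dec21_rate = positivity(4479, 37500)
dec28_rate = positivity(3888, 19000)

# Week-over-week change in the reported case count (same weekday).
case_change = 3888 / 4479 - 1

print(f"12/21 positivity: {dec21_rate:.1%}")       # 11.9%
print(f"12/28 positivity: {dec28_rate:.1%}")       # 20.5%
print(f"Change in reported cases: {case_change:.1%}")  # -13.2%
```

On these numbers, reported cases fell by roughly 13% while the share of tests coming back positive rose from about 12% to about 20%.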
What does this mean?
We have a change in the marginal person getting tested. The exaggerated version of the story is that in mid-December, anyone who even thought they might have had an exposure, had the sniffles, or had any reason to suspect that they might have COVID got tested. The positivity rate in North Carolina was still high, which implies that a lot of people who were positive and infectious but not symptomatic were out and about, but the marginal person getting tested had a good probability of being negative.
This story changed. Very few people got tested on Christmas Eve, Christmas Day, or the day after Christmas. Again, exaggerating, but the only people getting tested on these days were already running a fever and hacking up a lung. The marginal person getting tested over Christmas is far more likely to be positive. This is driving up the positivity rate significantly even as the reported case count is dropping, because the drop in testing overwhelms the increase in the positivity rate. We are likely seeing a lot more people with either no or modest symptoms walking around right now without knowing that they are positive.
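One way to see the drop in testing overwhelming the rise in positivity is to split the change in reported cases into two parts. Reported cases equal tests times positivity rate, so the log change in cases is the log change in tests plus the log change in positivity. The sketch below applies that identity to the North Carolina figures quoted earlier (12/21 versus 12/28); the function and variable names are illustrative, and the ~37,500 test count is treated as exact.

```python
import math


def decompose_case_change(tests_before, cases_before, tests_after, cases_after):
    """Split the log change in cases into a testing part and a positivity part.

    Uses the identity cases = tests * positivity, so
    log(cases_after / cases_before)
        = log(tests_after / tests_before) + log(rate_after / rate_before).
    """
    rate_before = cases_before / tests_before
    rate_after = cases_after / tests_after
    testing_effect = math.log(tests_after / tests_before)
    positivity_effect = math.log(rate_after / rate_before)
    total_change = math.log(cases_after / cases_before)
    return testing_effect, positivity_effect, total_change


# North Carolina figures from the text: 12/21 (before) and 12/28 (after).
testing_effect, positivity_effect, total_change = decompose_case_change(
    tests_before=37500, cases_before=4479, tests_after=19000, cases_after=3888
)

print(f"testing effect:    {testing_effect:+.3f}")
print(f"positivity effect: {positivity_effect:+.3f}")
print(f"total log change:  {total_change:+.3f}")
```

The testing effect is negative and larger in magnitude than the positive positivity effect, so the total is negative: fewer reported cases despite a higher share of positive tests.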
We are likely to get another testing drought from New Year's Eve through January 4th, as people tend to get tested less frequently on weekends and the holiday will reduce testing further. We probably won’t get good data that is comparable to mid-December data until the second week of January.
We do have some data that is higher quality. Our hospitalization data is decent, but it has two major challenges. First, hospitalizations lag infections. Today’s record hospitalizations are a reflection of case counts in mid-December. Second, hospital admission data also has a changing-marginal problem. When there is plentiful capacity, a patient who is a coin-flip decision to admit or to keep for another night instead of discharging home is probably a lot healthier than the patient who is a coin-flip decision to use one of the last three available beds in fifty miles. Patients who would have been admitted when daily hospitalizations were only running at 30,000, 40,000, or 50,000 hospitalizations per day are probably not being admitted when hospitalizations are running at 110,000+ hospitalizations per day. Sewage system data is reliable and near real time, but it is not universal.
So when we think about COVID data, we need to know about external shocks that change who the current marginal person is relative to the previous one, and we also need to think about lags. Thinking about the quality of the data and its quirks helps us interpret it effectively.