There has been a ton of terrible modeling this year, and the IHME is just one example. This week the CDC published an ensemble of models trained on Covid case data up to March 27.
The models completely mispredict the actual number of cases since that time:
https://pbs.twimg.com/media/E0qZTvVVkAIs484?format=jpg&name=...
This is a total failure of validation that would make any other practitioner question the whole exercise, but it didn't stop the CDC from publishing them. Nor did it stop the news media from trumpeting the results. The Washington Post and CNBC and others wrote headline articles fawning over them.
No they aren't. 100K was the _lower_ bound of those predictions at one point. Now it's 31K. For models to be useful they have to have at least some predictive power. The IHME models are total bullshit; if the feds had believed them 100%, it would have been a complete disaster, with NY hoarding massive quantities of equipment and supplies it did not need. The federal field hospital in IHME's home state, WA, was taken down the other day. It did not see a _single_ patient. The Javits temporary hospital in NY is also almost empty. NY hospitalizations are way down. ICU admissions are down even more steeply. We won't even reach the 60K deaths the models are currently predicting. Not even close.
The uncertainty intervals are pretty confusing even to educated users trying to understand the results in good faith. The entire model used some version of the Wuhan intervention as a prior, and only the uncertainty in fitting curves to that prior is represented in the intervals.
The problem is that the overwhelming majority of the uncertainty in actual outcome doesn't come from the curve-fitting uncertainty, but rather what prior to fit a curve to.
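To make the distinction concrete, here is a minimal sketch (synthetic numbers, nothing to do with the actual IHME code or inputs): the band returned by a curve fit only measures noise around the chosen functional form, so swapping the functional form itself, the "prior" in the sense above, shifts the projection in a way that band never accounts for.

    import numpy as np
    from scipy.optimize import curve_fit

    def logistic(t, K, r, t0):
        # symmetric S-curve: one possible assumed shape for cumulative deaths
        return K / (1.0 + np.exp(-r * (t - t0)))

    def gompertz(t, K, r, t0):
        # asymmetric S-curve: a different structural assumption
        return K * np.exp(-np.exp(-r * (t - t0)))

    # Made-up "observed" cumulative deaths for the first 60 days.
    rng = np.random.default_rng(0)
    days = np.arange(60)
    observed = gompertz(days, 80_000, 0.08, 45) * rng.normal(1.0, 0.05, days.size)

    horizon = 150  # project out to day 150 under each assumed shape
    for name, shape in [("logistic", logistic), ("gompertz", gompertz)]:
        params, cov = curve_fit(shape, days, observed, p0=(60_000, 0.1, 40), maxfev=20_000)
        k, k_sd = params[0], np.sqrt(cov[0, 0])
        # "Fit band": 2-sigma uncertainty in the asymptote K from the fit alone.
        low = shape(horizon, k - 2 * k_sd, *params[1:])
        high = shape(horizon, k + 2 * k_sd, *params[1:])
        print(f"{name:>8}: projection {shape(horizon, *params):10,.0f}   fit band {low:,.0f} - {high:,.0f}")

The specific numbers don't matter; the point is that the fit band answers "how uncertain is this curve, given this shape?", while the concern above is "which shape?", a question that band cannot see.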
The IHME model seems OK for what it does, but I'm baffled as to how it became the most-cited tool we have. It's totally inflexible and pretty confusing in terms of what it actually represents. Its overestimates are now being used as a bludgeon against people who take the virus seriously, which IMO is a major problem and wrong, but difficult to refute given how inflexible and confusing the model is.
As evidenced in this epidemic, epidemiological modeling is either not very good or overconfident. All models are wrong, sure, but these have also been useless. Why trust a model when you can do a serology study instead?
It has always assumed that lockdowns would be everywhere and would be observed.
Every new revision in the last 20 days has moved the projected deaths downward.
10 days ago, the 95% confidence interval was 100k-240k deaths with lockdowns observed.
2 days ago it was 45k-145k, with a prediction of 81k.
Today it is 37k-137k, with a prediction of 60k.
IHME has no incentive to downplay the severity, yet every time they update the model they have to adjust it downward, because reality has failed to keep up with the model (the drift is tabulated below).
Meanwhile, reporting of fatalities has actually gotten looser, with all deaths of tested-positive or presumed-positive individuals reported as covid deaths, regardless of cause or comorbidities (that became universal yesterday, and is why yesterday saw a big increase in deaths despite a huge dropoff in hospitalization over the last 6 days).
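Putting the revisions quoted above side by side (no new numbers, just the ones already given) makes the drift concrete: today's central estimate sits below the lower bound of the interval from ten days ago.

    # Successive IHME revisions as quoted above; None = no point estimate quoted.
    revisions = [
        ("10 days ago", 100_000, 240_000, None),
        ("2 days ago",   45_000, 145_000, 81_000),
        ("today",        37_000, 137_000, 60_000),
    ]
    for label, lo, hi, point in revisions:
        pt = f"{point:,}" if point is not None else "n/a"
        print(f"{label:>12}: interval {lo:>7,} - {hi:<7,}   point estimate {pt}")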
To be fair, the study you linked suggests 45m have had COVID, while the CDC model assumes 100m have had it, of whom 70m were symptomatic. So it actually appears to undermine the CDC and support the article.
Sure, that model shows that there are multiple ways to fit the model parameters to account for the first 14 days of fatalities. But we have a lot more real-world data available that directly contradicts any scenario where half the population is already infected.
I'm just disgusted that the authors are now saying it was just an abstract demonstration of different scenarios, and pretending they didn't actually make the claims about the real world that they did.
Because the IHME model is based on taking China's data at 100% face value and applying it to the US. It over-predicts in the short term and under-predicts in the medium to long term. They keep revising the overall number down to line up with an epidemic peak in 3-4 days in the US, which is ridiculous.
The headline ignores the fact that the covid forecasters have changed their model over time into basically what the paper says it should be (the comparison is based on the initial model from March, done in great panic). So this story is really nothing. Models improve over time if you have enough time.
> Among the two popular models, we found that the ICL model is more transparent and reproducible compared to the IHME model. The former sometimes over-predicted future deaths while the latter clearly under-predicted post-peak deaths. Both models predicted the timing of peaks reasonably well using data until one week prior. The ICL model produced a much wider band of uncertainty for New York state, possibly because the pattern did not conform well with their internal training data used from European countries.
Nature (on 08 June 2020) reported "a computational neuroscientist has reported that he has independently rerun the simulation and reproduced its results. And other scientists have told Nature that they had already privately verified that the code is reproducible." - https://www.nature.com/articles/d41586-020-01685-y citing the CODECHECK certificate at https://zenodo.org/record/3865491 from May 29, 2020.
You should also be asking why public health is so underfunded worldwide that many countries looked to the ICL model - which in turn was a repurposed influenza model - rather than drawing on local expertise or being able to use better-validated models.
> Running the same simulation with the same inputs produced wildly different results between runs.
As I recall, the replication issues were the normal issues related to parallelization and random number generation, and not things which greatly affected the results. There were a number of HN threads on the topic back in 2020. Eg, https://news.ycombinator.com/item?id=23212268 , which linked to a couple of those issues and their resolution.
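From what I remember of those threads, the failure mode was of roughly this kind. A toy sketch (my own illustration, not the CovidSim code): if random-number streams are attached to worker threads, then which stream simulates which unit depends on scheduling, so identical inputs can print different totals; keying the streams to the simulated units instead restores bit-for-bit reproducibility without changing what is being computed.

    import random
    import threading
    from concurrent.futures import ThreadPoolExecutor

    _local = threading.local()
    _worker_ids = iter(range(1_000_000))
    _lock = threading.Lock()

    def _worker_rng():
        # One RNG stream per worker thread, seeded by the order in which the
        # threads happen to pick up their first task (i.e. by scheduling).
        if not hasattr(_local, "rng"):
            with _lock:
                _local.rng = random.Random(next(_worker_ids))
        return _local.rng

    def unit_worker_seeded(_task_id):
        return _worker_rng().gauss(0.0, 1.0)

    def unit_task_seeded(task_id):
        # Stream keyed to the simulated unit itself: independent of scheduling.
        return random.Random(task_id).gauss(0.0, 1.0)

    def total(fn):
        with ThreadPoolExecutor(max_workers=4) as pool:
            return sum(pool.map(fn, range(2_000)))

    print(total(unit_worker_seeded))  # can differ from one invocation to the next
    print(total(unit_task_seeded))    # identical on every invocation

Both versions draw from the same distribution, so this kind of nondeterminism adds run-to-run jitter rather than changing what the model says on average, which matches the recollection above that the issues didn't greatly affect the results.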
> I can't think of better example of a situation where requiring replication would have altered the path of humanity as much as it would have here.
These IHME predictions are actually based on deaths, not on tests. Also, please look at the wide error bars on the original data source: https://covid19.healthdata.org/projections
Of course there are still a lot of variables - one of the better criticisms is that it assumes that states lock down with the same strength, just at different dates. But it's the presentation of this Axios piece that is dangerous here: It strips out all the uncertainty estimates and presents exact numbers as if they were precise.
I'm just going to comment on the graph, not the tweets. The graph shows three predictions from IHME.
One, the blue line, is a model from 3/27. It matches the data okay.
The second, the yellow line, is a model from 4/5. The agreement of this line is much worse, and the fact that the model did worse with more data is not promising.
The third, the teal line, is a model from 5/4. The data (black) ends at 5/4. So the agreement of the teal line with the black line is not a prediction at all.
The red line, a cubic fit, is totally irrelevant. By "cubic fit" I infer that they mean some kind of low-pass signal filter. Fitting a simple model like that to a complex time series without some motivation for why that model was chosen is the mathematical equivalent of treating tuberculosis with mercury.
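Whether the red line is a literal degree-3 polynomial or some smoother, the objection is easy to demonstrate with a throwaway sketch (made-up numbers, not the chart's data): a low-order polynomial fit to an epidemic-shaped series has no predictive content at all.

    import numpy as np

    # Made-up, epidemic-shaped daily death series: a smooth bump over 60 days.
    days = np.arange(60)
    daily_deaths = 2_000 * np.exp(-((days - 30) / 12.0) ** 2)

    cubic = np.poly1d(np.polyfit(days, daily_deaths, deg=3))  # the "cubic fit"

    for day in (30, 59, 75, 90):
        print(f"day {day:>2}: cubic-fit value {cubic(day):9,.1f}")
    # The polynomial tracks the bump only loosely in-sample, and a few weeks
    # past the data it is already off at implausible values (negative "deaths"
    # or runaway growth), because nothing in a cubic knows the curve must
    # return to zero and stay there.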
My point is: it sure doesn't look like the models are doing a good job of predicting death rates. And that's just from the graph used to advertise them.
Those model predictions are not for the first 3 months, but for the whole epidemic. We could very likely breach some of those predictions in the next 1-2 years.
Also, the Imperial College paper was one of the more extreme models. It is the one that supposedly scared the White House and lots of states into action. In this way, it was not a best case estimate, but was self-selected to be an outlier.
We (the public) are being misled about either the virulence, the morbidity rate, or both, with regard to SARS-CoV-2. The real numbers simply do not match the models.
> (That's not to let IHME off the hook; once they started promoting the ability for their model to do anything else, they became responsible for its predictions at other stages of the pandemic.)
The article spends a long time complaining about the high "unreported infection" numbers used by CDC models, but doesn't actually address the biggest piece of evidence, which is seroprevalence surveys. For example, https://www.cidrap.umn.edu/news-perspective/2021/01/study-us... claims 15% of US residents had COVID antibodies in mid-November, and if you want to claim a number that disagrees with that, you at least need to address the evidence.