Projecting the trajectory of the COVID-19 pandemic: A review of available tools
By now, most of us have become used to the frightening daily drip of numbers about the COVID-19 pandemic. The numbers of new cases and deaths, together with cumulative figures, are reported by the media all over the world. Many are familiar with the Johns Hopkins University (JHU) map and dashboard, which is frequently updated. The European Centre for Disease Prevention and Control (ECDC) maintains a similar tool. The JHU tool offers numbers at the subnational level for some countries, which is useful because the spread of the disease varies greatly within countries. An Excel sheet extracted from the ECDC data with the daily number of new reported cases and deaths by country, very useful for any cross-country comparison or analysis, can be downloaded in two clicks from the Our World in Data platform (daily data tables). Both tools, together with the raw data from Johns Hopkins and a large array of other COVID-19 relevant indicators and data sets, can be accessed through the COVID-19 page from the World Bank Data Group. You can also learn more about the World Bank COVID-19 response here.
Current reported cases and death numbers are tricky to interpret
As we look at those numbers, we should remember that the current numbers of reported cases and deaths are difficult to interpret. The current number of reported cases is problematic because, in most settings, it relies on testing individuals who have symptoms and present themselves at health facilities with testing capacity: it misses people who have symptoms but do not come to such facilities and, more importantly, it misses people who are asymptomatic. A few countries have deployed testing to a much larger segment of the population, but testing rates currently vary widely across countries (see figure 3 from this blog post by Dawoon Chung and Hoon Sahib Soh): from 26,772 per million people in Iceland to 9.5 per million in Pakistan (with 6,148 in South Korea, 3,499 in Italy and 314 in the US, for example; data as of 3/20/2020). This calls for increased testing across the world, if possible in representative samples of the population, to better track and understand the epidemic. Given that testing varies widely across countries and is not random, reported deaths might end up being a better indicator of the epidemic’s progress than confirmed cases. The problem with deaths as an indicator is the lag between the onset of symptoms and death: on average, the number of deaths reflects the number of infections about 20 days earlier, and in the meantime the number of infections is likely to have grown rapidly. This Khan Academy video makes this point very intuitively. Also, using deaths might be more problematic in developing countries, where many COVID-19 deaths might not be diagnosed or reported as such because of the lack of testing.
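To make the lag argument concrete, here is a minimal back-of-the-envelope sketch. It divides cumulative deaths by an assumed infection fatality ratio to get the infections that must have existed about 20 days earlier, then scales that figure forward using an assumed doubling time. The function name, the 1% IFR and the 5-day doubling time are illustrative assumptions, not estimates from any of the sources cited here; only the 20-day lag comes from the text above.

```python
def infections_implied_by_deaths(cumulative_deaths, ifr, lag_days=20,
                                 doubling_time_days=None):
    """Rough back-calculation of infections from deaths.

    Returns (infections about `lag_days` ago, infections today).
    The second value is None when no doubling time is supplied.
    """
    if not 0 < ifr <= 1:
        raise ValueError("ifr must be in (0, 1]")
    # Deaths observed today come from infections acquired ~lag_days ago.
    past_infections = cumulative_deaths / ifr
    if doubling_time_days is None:
        return past_infections, None
    # Assume unchecked exponential growth during the lag (a strong assumption).
    growth_factor = 2 ** (lag_days / doubling_time_days)
    return past_infections, past_infections * growth_factor


# Hypothetical inputs: 100 deaths, 1% IFR, infections doubling every 5 days.
past, today = infections_implied_by_deaths(
    100, ifr=0.01, lag_days=20, doubling_time_days=5
)
print(f"{past:,.0f} infections about 20 days ago, roughly {today:,.0f} today")
```

The result is very sensitive to both assumed inputs, which is exactly why reported deaths alone are a lagging and imprecise guide to the current state of an outbreak.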
Projections are needed to inform policy
If current reported cases and deaths numbers are difficult to interpret, how can countries, especially in the developing world, prepare to confront the pandemic? Ideally, one would like to be able to predict the course of the pandemic over time. And we would also like to know how different sets of containment measures are likely to affect the scale and trajectory of the disease to be able to plan them ahead, trigger them as early as possible to contain the outbreak in the country before it is generalized, and modulate them as efficiently as possible given the disruptions and economic costs imposed by physical/social distancing.
At this point, a disclaimer is called for: I am an applied microeconomist with a focus on health, but I have not been trained as an epidemiologist and for most of my career I have been very reluctant to use, let alone offer, predictions or projections. But over the last few weeks, as I have interacted with teams in developing countries trying to understand the challenges that they are facing, I have tried to figure out how best to put these tools to work.
The basic tool is the Susceptible-Infected-Recovered (SIR) model. This post is not the place to describe it in detail. To learn more about this type of model, I suggest having a look at the set of slides and videos on Paolo Surico and Andrea Galeotti’s COVID-19 page or, for a more formal description, at Andrew Atkeson’s NBER working paper.
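For readers who prefer code to equations, the following is a minimal sketch of a textbook SIR model, integrated with simple Euler steps. It is not the implementation used by any of the tools reviewed in this post. The parameter names, the 5-day infectious period and the population size are assumptions chosen only for illustration.

```python
def simulate_sir(population, initial_infected, r0, infectious_days, days,
                 steps_per_day=10):
    """Simulate a basic SIR model; returns a list of (day, S, I, R) tuples."""
    gamma = 1.0 / infectious_days   # recovery rate per day
    beta = r0 * gamma               # transmission rate per day
    dt = 1.0 / steps_per_day
    s = float(population - initial_infected)
    i = float(initial_infected)
    r = 0.0
    trajectory = [(0, s, i, r)]
    for day in range(1, days + 1):
        for _ in range(steps_per_day):
            new_infections = beta * s * i / population * dt
            new_recoveries = gamma * i * dt
            s -= new_infections
            i += new_infections - new_recoveries
            r += new_recoveries
        trajectory.append((day, s, i, r))
    return trajectory


# Hypothetical setting: 1 million people, 10 initial cases, R0 = 3.
run = simulate_sir(1_000_000, 10, r0=3.0, infectious_days=5, days=365)
peak_day, _, peak_infected, _ = max(run, key=lambda row: row[2])
print(f"Peak of about {peak_infected:,.0f} simultaneous infections on day {peak_day}")
```

The projection tools discussed below add age structure, severity, hospital capacity and mitigation on top of this basic structure, so their outputs will differ from this sketch.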
How will projection parameters vary across country characteristics?
As we consider how the trajectory of the pandemic can vary across the world, for example between high income and lower income countries, a few variables are likely to matter.
First, the age-specific infection fatality ratios (IFR), i.e. the number of deaths per person infected in each age group, could vary across countries. Because this is a new disease, we have only limited data available. Most projection models currently use estimates from China, since this is an outbreak that, for now, seems to have run its course and for which extensive data has been collected (see for example the estimates from Verity et al. 2020). Of course, nothing guarantees that the age-specific IFRs will be the same across settings, and the caveats I made earlier about how infections and deaths are counted apply to the available estimates. For example, Eran Bendavid and Jay Bhattacharya have warned that the current numbers for the IFR might be substantially overestimated because of the strong selection bias in testing.
What seems clear is that the IFR varies greatly with age, as older people are at much higher risk of dying. Since countries’ age structures vary considerably, any reasonable projection needs to take them into account. They are easy to obtain from the UN World Population Prospects site. Given the younger age distribution of populations in developing countries, projections will usually predict a lower incidence of severe disease, hospitalization and death in those contexts. But it is important to be careful here, because it is not yet clear whether the higher morbidity and mortality among older people mainly comes from the fact that they have other health conditions or from the fact that immune systems generally weaken with age. It is probably a combination of both. The data from China shows that mortality rates increase steeply with age but are also strongly associated with comorbidities (preexisting conditions such as cardiovascular disease, diabetes, chronic respiratory disease, hypertension and cancer).
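The effect of age structure can be illustrated with a short sketch that weights age-specific IFRs by population shares. All numbers below are made up for illustration: the IFR values are not the Verity et al. estimates, and the two age distributions are hypothetical, not taken from the UN World Population Prospects.

```python
# Hypothetical age-specific IFRs (share of infected people who die).
ILLUSTRATIVE_IFR = {
    "0-19": 0.0001,
    "20-39": 0.0005,
    "40-59": 0.004,
    "60-79": 0.03,
    "80+": 0.08,
}


def population_ifr(age_shares, ifr_by_age):
    """Average IFR if infections are spread evenly across age groups."""
    if abs(sum(age_shares.values()) - 1.0) > 1e-6:
        raise ValueError("age shares must sum to 1")
    return sum(share * ifr_by_age[band] for band, share in age_shares.items())


# Two hypothetical populations: one young, one old.
younger = {"0-19": 0.50, "20-39": 0.30, "40-59": 0.14, "60-79": 0.055, "80+": 0.005}
older = {"0-19": 0.20, "20-39": 0.25, "40-59": 0.28, "60-79": 0.22, "80+": 0.05}

for label, shares in (("younger", younger), ("older", older)):
    print(f"{label} population: overall IFR of {population_ifr(shares, ILLUSTRATIVE_IFR):.2%}")
```

The calculation assumes that every age group is equally likely to be infected and that age-specific IFRs carry over unchanged between settings, which is exactly the assumption questioned in the paragraph above.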
The rate of social mixing between people, and crucially the rate of mixing across generations, will also vary across countries and is expected to be higher in the developing world, in part because it is quite common there for several generations to live in the same household. The rate of social mixing is a key determinant of what epidemiologists call R0, the basic reproduction number, i.e. the average number of people infected by each infected person in a fully susceptible population. It is a crucial parameter in any SIR model, and it is the number that containment and physical distancing measures try to reduce. There is still quite a lot of uncertainty about estimates of R0 for COVID-19, but, in the absence of any mitigation measure, it is currently estimated to be between 2.4 and 3.3. Most epidemiological projection tools allow users to vary R0, both its baseline value and its value under containment measures.
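As a rough illustration of how distancing acts on this parameter, the sketch below assumes that transmission scales proportionally with contact rates, so that a given reduction in contacts lowers the reproduction number by the same fraction. This proportionality, and the function names, are simplifying assumptions rather than how any particular tool models mitigation.

```python
def effective_r(r0, contact_reduction):
    """Reproduction number after cutting contact rates by a given fraction."""
    if not 0 <= contact_reduction <= 1:
        raise ValueError("contact_reduction must be between 0 and 1")
    return r0 * (1 - contact_reduction)


def critical_contact_reduction(r0):
    """Smallest contact reduction that brings the reproduction number to 1."""
    if r0 <= 1:
        return 0.0
    return 1 - 1 / r0


# The R0 range quoted in the text: 2.4 to 3.3.
for r0 in (2.4, 3.0, 3.3):
    needed = critical_contact_reduction(r0)
    print(f"R0 = {r0}: contacts must fall by more than {needed:.0%} "
          f"for the epidemic to shrink")
```

Under this simple rule, the higher the baseline R0, the deeper the cut in contacts needed before each infected person infects fewer than one other person on average.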
Finally, hospital capacity, especially intensive care unit (ICU) capacity, and the overall quality of care vary widely across countries. This is also a key parameter, in particular for projecting the number of deaths, because a substantial part of the mortality is expected to occur if and when the number of cases in need of (intensive) care exceeds existing capacity and overwhelms the health system.
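A simple way to see why capacity matters is to compare a projected series of daily ICU demand with the number of ICU beds available. The sketch below does this for a made-up demand curve and a made-up capacity of 200 beds; both are hypothetical, and the function name is my own.

```python
def icu_overflow(daily_icu_demand, icu_capacity):
    """Return (days above capacity, largest daily shortfall in beds)."""
    shortfalls = [max(0, demand - icu_capacity) for demand in daily_icu_demand]
    days_over = sum(1 for shortfall in shortfalls if shortfall > 0)
    return days_over, max(shortfalls, default=0)


# Hypothetical projected number of patients needing an ICU bed on each day.
demand = [10, 40, 120, 260, 310, 220, 90, 30]
days_over, peak_shortfall = icu_overflow(demand, icu_capacity=200)
print(f"Demand exceeds capacity on {days_over} days; "
      f"peak shortfall of {peak_shortfall} beds")
```

Patients who fall into the shortfall are the ones for whom projection models typically assume much worse outcomes, which is how capacity feeds into projected deaths.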
Imperial College COVID-19 response team country-level projection results
On March 26, the Imperial College COVID-19 response team released a paper projecting the global impact of the COVID-19 pandemic under different strategies for mitigation and suppression.
The data sources appendix links to an Excel table that includes the projection results for most countries under five scenarios:
- An unmitigated epidemic – a scenario in which no action is taken.
- Mitigation including population-level social distancing – assessing the maximum reduction in the scale of the epidemic that can be achieved through a uniform reduction in the rate of social mixing, short of complete suppression.
- Mitigation including enhanced social distancing of the elderly – the same as scenario 2 but with individuals aged 70 years old and above reducing their social contact rates by 60%.
- Suppression, i.e. the implementation of wide-scale intensive social distancing (modeled as a 75% reduction in interpersonal contact rates) with the objective of rapidly suppressing transmission and minimizing near-term cases and deaths. They explore different epidemiological triggers (deaths per 100,000 population) for initiating the suppression strategy:
- Scenario 4a: Suppression triggered at 1.6 deaths per 100,000 population per week
- Scenario 4b: Suppression triggered at 0.2 deaths per 100,000 population per week
Globally, they estimate that a completely unmitigated COVID-19 epidemic would lead to 7 billion infections for an R0 of 3.0 (range 2.4-3.3). Applying estimates of the age-specific IFRs from China, this could result in a staggering 40 million deaths (range 35-42 million).
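These orders of magnitude can be sanity-checked with two textbook calculations. The first is the final-size relation of a simple SIR model with homogeneous mixing, z = 1 - exp(-R0 z), which gives the share of the population eventually infected. The second is the average IFR implied by dividing the projected deaths by the projected infections. This is not the Imperial College method, which is age-structured and country-specific, and the world population of roughly 7.8 billion used below is an approximate figure I am assuming.

```python
import math


def final_attack_rate(r0, iterations=1000):
    """Share ever infected in an unmitigated, homogeneously mixing SIR epidemic."""
    if r0 <= 1:
        return 0.0
    z = 1.0
    # Fixed-point iteration on z = 1 - exp(-r0 * z).
    for _ in range(iterations):
        z = 1 - math.exp(-r0 * z)
    return z


WORLD_POPULATION = 7.8e9  # approximate 2020 figure, assumed for illustration

for r0 in (2.4, 3.0, 3.3):
    share = final_attack_rate(r0)
    print(f"R0 = {r0}: {share:.0%} infected, "
          f"about {share * WORLD_POPULATION / 1e9:.1f} billion people")

# Average IFR implied by the headline numbers quoted above.
implied_ifr = 40e6 / 7e9
print(f"Implied average IFR: {implied_ifr:.2%}")
```

The simple formula lands in the same range as the 7 billion infections quoted above, and the implied average fatality ratio is a little under 0.6%.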
The figure below displays the estimated number of deaths globally under the five scenarios considered.
The estimates from the Imperial College COVID-19 response team account for the age structure of the population of each country. They have also estimated and modeled the rates of social mixing and contacts across generations using existing household surveys, but because such surveys were not available for each country, they have used an estimate from one country in each region and applied it to the other countries in the same region. Data on the number of hospital beds per 1,000 population were taken from the World Bank Development Indicators for 201 countries but, since many of those numbers were not recent, they used a boosted regression tree-based modelling approach to generate current estimates of hospital beds per 1,000 population. ICU capacity estimates have been derived from systematic reviews. The published data was sparse and yielded a total of 57 data points describing the number of ICU beds per 100 hospital beds across countries belonging to the World Bank’s 4 income strata (LIC, LMIC, UMIC and HIC).
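The two capacity indicators combine in a straightforward way: hospital beds per 1,000 population give the total number of hospital beds, and ICU beds per 100 hospital beds give the share of those that are intensive care beds. The sketch below applies this arithmetic to a hypothetical country; the population and both ratios are invented for illustration.

```python
def estimate_icu_beds(population, hospital_beds_per_1000, icu_beds_per_100_hospital_beds):
    """Return (total hospital beds, ICU beds) implied by the two ratios."""
    hospital_beds = population * hospital_beds_per_1000 / 1000
    icu_beds = hospital_beds * icu_beds_per_100_hospital_beds / 100
    return hospital_beds, icu_beds


# Hypothetical country: 50 million people, 0.8 hospital beds per 1,000 people,
# and 2 ICU beds per 100 hospital beds.
hospital_beds, icu_beds = estimate_icu_beds(50_000_000, 0.8, 2)
print(f"About {hospital_beds:,.0f} hospital beds, of which {icu_beds:,.0f} are ICU beds")
```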
Beyond the five scenarios projected, the Imperial College Excel table allows varying the baseline R0 and the intensity of social distancing measures.
The University of Basel COVID-19 epidemiological simulation tool
While the Imperial College projection results are easy to get, it might be useful to rely on more current and country-specific parameters, especially for hospital and ICU capacity, and to explore further scenarios and variations in other parameters. The epidemiological simulation tool developed at the University of Basel offers this possibility. I found it to be quite intuitive and easy to use. It comes with preloaded values for many OECD countries and regions, including the current number of reported cases and deaths, which allows comparisons between projections and current numbers. At this stage, the only developing country with preloaded data is India (and each Indian state). But the tool can still be customized for other countries. First, it is possible to contribute new data to their platform. Next, the age distribution for most countries is preloaded (set the scenario to Custom and pick the relevant country under Age Structure). Finally, users will have to input a few country-specific parameters (e.g. estimated hospital beds, estimated ICU/ICMU beds, etc.), but this should allow for more precise and tailor-made projections.
The tool doesn’t directly take into account the difference in social mixing patterns across countries, but this could be modeled by varying the initial values in the mitigation scenarios interface.
The University of Basel tool also allows users to introduce seasonality and the impact of different climates. This doesn’t seem to be integrated in the Imperial College projections. A word of caution is important here: while there seems to be emerging evidence that higher temperatures and humidity levels reduce the transmission of the virus (see this paper by Wang et al. (2020) for estimates based on a set of cities in China), the evidence is too preliminary to conclude that most developing countries’ experience will be milder because of their weather patterns.
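One common way to represent seasonality in epidemic models is to multiply the transmission rate by a cosine term that peaks in a given month. The sketch below shows this generic form; it is not necessarily the formula used by the University of Basel tool, and the peak month and the 20% amplitude are arbitrary assumptions.

```python
import math


def seasonal_multiplier(month, peak_month, amplitude):
    """Factor applied to the transmission rate in a given month (1 = January)."""
    if not 0 <= amplitude <= 1:
        raise ValueError("amplitude must be between 0 and 1")
    return 1 + amplitude * math.cos(2 * math.pi * (month - peak_month) / 12)


# Hypothetical setting: transmission peaks in January and swings by 20%.
for month in (1, 4, 7, 10):
    factor = seasonal_multiplier(month, peak_month=1, amplitude=0.2)
    print(f"Month {month}: transmission rate multiplied by {factor:.2f}")
```

With an amplitude of zero the multiplier is always 1, which corresponds to a projection without any seasonal effect.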
It is also important to note that both the Imperial College results and the University of Basel tool assume no substantive difference in general health and in the prevalence of pre-existing conditions between Chinese and other populations, an assumption that is unlikely to hold in practice. Moreover, the standard of medical care available varies significantly across the world and tends to be substantially lower in many developing countries, especially among the poor (see Das, Hammer and Leonard, 2008; Kruk et al. 2018). The impact of a lack of adequate care for more severe cases of COVID-19 is difficult to quantify, but it is likely to significantly increase overall mortality and could be compounded if the number of cases requiring care leads to a disruption of the health system.
As of now, there are a few other COVID-19 epidemiological simulation tools available, such as this one by Alison Hill and colleagues and this one by Gabriel Goh. I am less familiar with them, and they do not seem to include pre-loaded country-specific data such as the important age structure of the population. The Institute for Health Metrics and Evaluation (IHME) just released projections for the United States and its 50 states, based not on an SIR model but on modeling the empirically observed COVID-19 population death rate curves. However, it seems to only consider a scenario under which physical distancing measures are maintained.
If you know of other projections and tools, please mention them in the comment section. Suggestions and comments are welcome: this is a very rapidly changing landscape with new information and data coming out on a daily basis.
Let me finish with words of wisdom from the American humorist Evan Esar: "An economist is an expert who will know tomorrow why the things he predicted yesterday didn't happen today." Very little is known yet about COVID-19, a disease that broke out last November. Key parameters such as the transmission rate and the infection fatality ratio are estimated based on sparse data. Several drug and vaccine trials are underway and will hopefully help mitigate the impact of this pandemic, but in the meantime, countries need to prepare, and try to anticipate, to the best of their ability, what is coming their way.
This post benefited from discussions with and suggestions from World Bank colleagues Massimiliano Calli, Gabriel Demombynes, Patrick Hoang-Vu Eozenou, Jed Friedman, Eeshani Kandpal, Aart Kraay and Aaditya Mattoo. All errors are mine.
Great resource & modeling!
Thanks for the good work Damien.
Excellent piece. You have summarized all the available tools. I have been working with a projection to help government of Bangladesh. I am an economist (currently also working as a consultant of WB) with a health economics degree and work experience. I wish I could see yours before. After going through several references, blogs, articles, I realized that even some econometrics tools can be useful for projection as well. I would appreciate it if you see my blog on this issue and give me your comments.
Thanks again for your blog. Very timely.