10
RESEARCH ARTICLE SUMMARY CORONAVIRUS Inferring the effectiveness of government interventions against COVID-19 Jan M. Brauner*, Sören Mindermann*, Mrinank Sharma*, David Johnston, John Salvatier, Tomáš Gavenčiak, Anna B. Stephenson, Gavin Leech, George Altman, Vladimir Mikulik, Alexander John Norman, Joshua Teperowski Monrad, Tamay Besiroglu, Hong Ge, Meghan A. Hartwick, Yee Whye Teh, Leonid Chindelevitch, Yarin Gal, Jan KulveitINTRODUCTION: Governments across the world have implemented a wide range of non- pharmaceutical interventions (NPIs) to miti- gate the spread of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). Given the increasing death toll of the pandemic and the social cost of some interventions, it is critical to understand their relative effective- ness. By considering the effects that inter- ventions had on transmission during the first wave of the outbreak, governments can make more-informed decisions about how to con- trol the pandemic. RATIONALE: Rigorously studying the effective- ness of individual interventions poses consid- erable methodological challenges. Simulation studies can explore scenarios, but they make strong assumptions that may be difficult to validate. Data-driven, cross-country modeling comparing the timing of national interven- tions to the subsequent numbers of cases or deaths is a promising alternative approach. We have collected chronological data on the implementation of several interventions in 41 countries between January and the end of May 2020, using independent double entry by researchers to ensure high data quality. Because countries deployed different com- binations of interventions in different orders and with different outcomes, it is possible to disentangle the effect of individual interven- tions. We estimate the effectiveness of specific interventions with a Bayesian hierarchical model by linking intervention implementa- tion dates to national case and death counts. We partially pool NPI effectiveness to allow for country-specific NPI effects. Our model also accounts for uncertainty in key epidemiological parameters, such as the average delay from infection to death. However, intervention ef- fectiveness estimates should only be used for policy-making if they are robust across a range of modeling choices. We therefore support the results with extensive empirical validation, in- cluding 11 sensitivity analyses under 206 ex- perimental conditions. In these analyses, we show how results change when we vary the data, the epidemiological parameters, or the model structure or when we account for confounders. RESULTS: While exact intervention effectiveness estimates varied with modeling assumptions, broader trends in the results were highly con- sistent across experimental conditions. To de- scribe these trends, we categorized intervention effect sizes as small, moderate, or large, corre- sponding to posterior median reductions in the reproduction number R of <17.5%, between 17.5 and 35%, and >35%, respectively. Across all experimental conditions, all interventions could robustly be placed in one or two of these categories. Closing both schools and universities was consistently highly effective at reducing transmission at the advent of the pandemic. Banning gatherings was effective, with a large effect size for limiting gatherings to 10 people or less, a moderate-to-large effect for 100 people or less, and a small-to-moderate effect for 1000 people or less. Targeted closures of face-to-face businesses with a high risk of infection, such as restaurants, bars, and nightclubs, had a small- to-moderate effect. Closing most nonessential businesses delivering personal services was only somewhat more effective (moderate effect). When these interventions were already in place, issuing a stay-at-home order had only a small additional effect. These results indicate that, by using effective interventions, some countries could control the epidemic while avoiding stay-at-home orders. CONCLUSION: We estimated the effects of non- pharmaceutical interventions on COVID-19 transmission in 41 countries during the first wave of the pandemic. Some interventions were robustly more effective than others. This work may provide insights into which areas of public life require additional interventions to be able to maintain activity despite the pan- demic. However, because of the limitations inherent in observational study designs, our estimates should not be seen as final but rather as a contribution to a diverse body of evidence, alongside other retrospective studies, simulation studies, and experimental trials. RESEARCH Brauner et al., Science 371, 802 (2021) 19 February 2021 1 of 1 The list of author affiliations is available in the full article online. *Corresponding author. Email: [email protected] (J.M.B.); [email protected] (S.M.); mrinank@ robots.ox.ac.uk (M.S.) These authors contributed equally to this work. These authors contributed equally to this work. This is an open-access article distributed under the terms of the Creative Commons Attribution license (https://creative- commons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. Cite this article as J. M. Brauner et al., Science 371, eabd9338 (2021). DOI: 10.1126/science.abd9338 READ THE FULL ARTICLE AT https://doi.org/10.1126/science.abd9338 0.0% 17.5% 52.5% Gatherings limited to 1000 people or less Gatherings limited to 100 people or less Gatherings limited to 10 people or less Some businesses closed Most nonessential businesses closed Additional benefit of stay-at-home order on top of above NPIs Schools and universities closed Posterior median reduction in R t 35.0% Median intervention effectiveness estimates across a suite of 206 analyses with different epidemio- logical parameters, data, and modeling assumptions. Bayesian inference using a semimechanistic hierarchical model with observed national case and death data across 41 countries between January and May 2020 is used to infer the effectiveness of several nonpharmaceutical interventions. Although precise effectiveness estimates depend on the assumed data and parameters, there are clear trends across the experimental conditions. Violins show kernel density estimates of the posterior median effectiveness across the sensitivity analysis. R t , instantaneous reproduction number. on August 19, 2021 http://science.sciencemag.org/ Downloaded from

Inferring the effectiveness of government While ... - Science · Brauner et al., Science 371, eabd9338 (2021) 19 February 2021 1of8 1Oxford Applied and Theoretical Machine Learning

  • Upload
    others

  • View
    1

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Inferring the effectiveness of government While ... - Science · Brauner et al., Science 371, eabd9338 (2021) 19 February 2021 1of8 1Oxford Applied and Theoretical Machine Learning

RESEARCH ARTICLE SUMMARY◥

CORONAVIRUS

Inferring the effectiveness of governmentinterventions against COVID-19Jan M. Brauner*†, Sören Mindermann*†, Mrinank Sharma*†, David Johnston, John Salvatier,Tomáš Gavenčiak, Anna B. Stephenson, Gavin Leech, George Altman, Vladimir Mikulik,Alexander John Norman, Joshua Teperowski Monrad, Tamay Besiroglu, Hong Ge,Meghan A. Hartwick, Yee Whye Teh, Leonid Chindelevitch‡, Yarin Gal‡, Jan Kulveit‡

INTRODUCTION:Governments across the worldhave implemented a wide range of non-pharmaceutical interventions (NPIs) to miti-gate the spread of severe acute respiratorysyndrome coronavirus 2 (SARS-CoV-2). Giventhe increasing death toll of the pandemic andthe social cost of some interventions, it iscritical to understand their relative effective-ness. By considering the effects that inter-ventions had on transmission during the firstwave of the outbreak, governments can makemore-informed decisions about how to con-trol the pandemic.

RATIONALE: Rigorously studying the effective-ness of individual interventions poses consid-erable methodological challenges. Simulationstudies can explore scenarios, but they makestrong assumptions that may be difficult tovalidate. Data-driven, cross-country modelingcomparing the timing of national interven-tions to the subsequent numbers of cases ordeaths is a promising alternative approach.We have collected chronological data on the

implementation of several interventions in41 countries between January and the endof May 2020, using independent double entryby researchers to ensure high data quality.Because countries deployed different com-binations of interventions in different ordersand with different outcomes, it is possible todisentangle the effect of individual interven-tions. We estimate the effectiveness of specificinterventions with a Bayesian hierarchicalmodel by linking intervention implementa-tion dates to national case and death counts.We partially pool NPI effectiveness to allowfor country-specific NPI effects. Our model alsoaccounts for uncertainty in key epidemiologicalparameters, such as the average delay frominfection to death. However, intervention ef-fectiveness estimates should only be used forpolicy-making if they are robust across a rangeof modeling choices. We therefore support theresults with extensive empirical validation, in-cluding 11 sensitivity analyses under 206 ex-perimental conditions. In these analyses, weshow how results change when we vary the

data, the epidemiological parameters, or themodel structure or when we account forconfounders.

RESULTS: While exact intervention effectivenessestimates varied with modeling assumptions,broader trends in the results were highly con-sistent across experimental conditions. To de-scribe these trends, we categorized interventioneffect sizes as small, moderate, or large, corre-sponding to posterior median reductions inthe reproduction numberR of <17.5%, between17.5 and 35%, and >35%, respectively. Acrossall experimental conditions, all interventionscould robustly be placed in one or two of thesecategories. Closing both schools and universitieswas consistently highly effective at reducingtransmission at the advent of the pandemic.Banning gatherings was effective, with a largeeffect size for limiting gatherings to 10 peopleor less, amoderate-to-large effect for 100 peopleor less, and a small-to-moderate effect for 1000people or less. Targeted closures of face-to-facebusinesses with a high risk of infection, such asrestaurants, bars, and nightclubs, had a small-to-moderate effect. Closing most nonessentialbusinesses delivering personal services was onlysomewhat more effective (moderate effect).When these interventions were already inplace, issuing a stay-at-home order had onlya small additional effect. These results indicatethat, by using effective interventions, somecountries could control the epidemic whileavoiding stay-at-home orders.

CONCLUSION:We estimated the effects of non-pharmaceutical interventions on COVID-19transmission in 41 countries during the firstwave of the pandemic. Some interventionswere robustly more effective than others. Thiswork may provide insights into which areas ofpublic life require additional interventions tobe able to maintain activity despite the pan-demic. However, because of the limitationsinherent in observational study designs, ourestimates should not be seen as final butrather as a contribution to a diverse body ofevidence, alongside other retrospective studies,simulation studies, and experimental trials.▪

RESEARCH

Brauner et al., Science 371, 802 (2021) 19 February 2021 1 of 1

The list of author affiliations is available in the full article online.*Corresponding author. Email: [email protected](J.M.B.); [email protected] (S.M.); [email protected] (M.S.)†These authors contributed equally to this work.‡These authors contributed equally to this work.This is an open-access article distributed under the terms ofthe Creative Commons Attribution license (https://creative-commons.org/licenses/by/4.0/), which permits unrestricteduse, distribution, and reproduction in any medium, providedthe original work is properly cited.Cite this article as J. M. Brauner et al., Science 371,eabd9338 (2021). DOI: 10.1126/science.abd9338

READ THE FULL ARTICLE AThttps://doi.org/10.1126/science.abd9338

0.0% 17.5% 52.5%

Gatherings limited to 1000 people or less

Gatherings limited to 100 people or less

Gatherings limited to 10 people or less

Some businesses closed

Most nonessential businesses closed

Additional benefit of stay-at-home order on top of above NPIs

Schools and universities closed

Posterior median reduction in Rt35.0%

Median intervention effectiveness estimates across a suite of 206 analyses with different epidemio-logical parameters, data, and modeling assumptions. Bayesian inference using a semimechanistichierarchical model with observed national case and death data across 41 countries between January and May2020 is used to infer the effectiveness of several nonpharmaceutical interventions. Although preciseeffectiveness estimates depend on the assumed data and parameters, there are clear trends across theexperimental conditions. Violins show kernel density estimates of the posterior median effectiveness acrossthe sensitivity analysis. Rt, instantaneous reproduction number.

on August 19, 2021

http://science.sciencem

ag.org/D

ownloaded from

Page 2: Inferring the effectiveness of government While ... - Science · Brauner et al., Science 371, eabd9338 (2021) 19 February 2021 1of8 1Oxford Applied and Theoretical Machine Learning

RESEARCH ARTICLE◥

CORONAVIRUS

Inferring the effectiveness of governmentinterventions against COVID-19Jan M. Brauner1,2*†, Sören Mindermann1*†, Mrinank Sharma2,3,4*†, David Johnston5,6, John Salvatier6,Tomáš Gavenčiak7, Anna B. Stephenson8, Gavin Leech9, George Altman10, Vladimir Mikulik11,Alexander John Norman12, Joshua Teperowski Monrad2,13,14, Tamay Besiroglu15, Hong Ge16,Meghan A. Hartwick17, Yee Whye Teh3, Leonid Chindelevitch18,19‡, Yarin Gal1‡, Jan Kulveit2‡

Governments are attempting to control the COVID-19 pandemic with nonpharmaceutical interventions(NPIs). However, the effectiveness of different NPIs at reducing transmission is poorly understood. Wegathered chronological data on the implementation of NPIs for several European and non-Europeancountries between January and the end of May 2020. We estimated the effectiveness of these NPIs,which range from limiting gathering sizes and closing businesses or educational institutions tostay-at-home orders. To do so, we used a Bayesian hierarchical model that links NPI implementationdates to national case and death counts and supported the results with extensive empirical validation.Closing all educational institutions, limiting gatherings to 10 people or less, and closing face-to-facebusinesses each reduced transmission considerably. The additional effect of stay-at-home orders wascomparatively small.

Worldwide, governments have mobi-lized resources to fight the COVID-19pandemic. A wide range of non-pharmaceutical interventions (NPIs)has been deployed, including stay-at-

home orders and the closure of all nonessentialbusinesses. Recent analyses show that theselarge-scale NPIs were jointly effective at reduc-ing the virus’s effective reproduction numberRt (1), but it is still largely unknown howeffective individual NPIs were. As more data

become available, we can move beyond esti-mating the combined effect of a bundle ofNPIs and begin to understand the effects ofindividual interventions. This can help govern-ments efficiently control the epidemic, byfocusing on the most effective NPIs to easethe burden put on the population.A promising way to estimate NPI effective-

ness is data-driven, cross-country modeling:inferring effectiveness by relating the NPIsimplemented in different countries to thecourse of the epidemic in these countries. Todisentangle the effects of individual NPIs, weneed to leverage data from multiple coun-tries with diverse sets of interventions in place.Previous data-driven studies (table S8) esti-mate effectiveness for individual countries(2–4) or NPIs, although some exceptions doexist [(1, 5–8); summarized in table S7]. Incontrast, we evaluated the impact of severalNPIs on the epidemic’s growth in 34 Europeanand 7 non-European countries. If all countriesimplemented the same set of NPIs on the sameday, the individual effect of each NPI wouldbe unidentifiable. However, the COVID-19 re-sponse was far less coordinated: Countriesimplemented different sets of NPIs at differ-ent times and in different orders (Fig. 1).Evenwith diverse data frommany countries,

estimating NPI effects remains a challengingtask. To begin with, models are based on un-certain epidemiological parameters; our NPIeffectiveness study incorporates some of thisuncertainty directly into the model. Further-more, the data are retrospective and observa-tional, meaning that unobserved factors couldconfound the results. Also, NPI effectivenessestimates can be highly sensitive to arbitrary

modeling decisions, as shown by two recentreplication studies (9, 10). And finally, large-scale public NPI datasets suffer from frequentinconsistencies (11) and missing data (12).Hence, the data and the model must be care-fully validated if they are to be used to guidepolicy decisions. We have collected a largepublic dataset on NPI implementation datesthat has been validated by independent doubleentry, and we have extensively validated oureffectiveness estimates. This validation of dataand model is a crucial but often absent or in-complete element of COVID-19 NPI effective-ness studies (10).Our results provide insight on the amount

of COVID-19 transmission associated with var-ious areas and activities of public life, such asgatherings of different sizes. Therefore, theymay inform the packages of interventions thatcountries implement to control transmissionin current and future waves of infections. How-ever, we need to be careful when interpretingthis study’s results. We only analyzed the effectNPIs had between January and the end of May2020, and NPI effectiveness may change overtime as circumstances change. Lifting an NPIdoes not imply that transmission will return toits original level, and our window of analysisdoes not include relaxation of NPIs. These andother limitations are detailed in the Discussionsection.

Cross-country NPI effectiveness modeling

Weanalyzed the effects of seven commonly usedNPIs between 22 January and 30May 2020. AllNPIs aimed to reduce the number of contactswithin the population (Table 1). If a countrylifted an NPI before 30 May, the window ofanalysis for that country terminates on theday of the lifting (seeMaterials andmethods).To ensure high data quality, all NPI data wereindependently entered by two of the authors(independent double entry) using primarysources and then manually compared withseveral public datasets. Data on confirmedCOVID-19 cases and deaths were taken fromthe Johns Hopkins Center for Systems Scienceand Engineering (CSSE) COVID-19 Dataset(13). The data used in this study, includingsources, are available online (14).We estimated the effectiveness of NPIs with

a Bayesian hierarchical model. We used caseand death data from each country to infer thenumber of new infections at each point intime, which is itself used to infer the (instan-taneous) reproduction number Rt over time.NPI effects were then estimated by relatingthe daily reproduction numbers to the activeNPIs, across all days and countries. This rela-tively simple, data-driven approach allowed usto sidestep assumptions about contact patternsand intensity, infectiousness of different agegroups, and so forth that are typically requiredin modeling studies. This approach also

RESEARCH

Brauner et al., Science 371, eabd9338 (2021) 19 February 2021 1 of 8

1Oxford Applied and Theoretical Machine Learning (OATML)Group, Department of Computer Science, University ofOxford, Oxford, UK. 2Future of Humanity Institute, Universityof Oxford, Oxford, UK. 3Department of Statistics, Universityof Oxford, Oxford, UK. 4Department of Engineering Science,University of Oxford, Oxford, UK. 5College of Engineering andComputer Science, Australian National University, Canberra,Australia. 6Quantified Uncertainty Research Institute, SanFrancisco, CA, USA. 7Independent scholar, Prague, CzechRepublic. 8Harvard John A. Paulson School of Engineeringand Applied Sciences, Harvard University, Cambridge, MA,USA. 9School of Computer Science, University of Bristol,Bristol, UK. 10School of Medical Sciences, University ofManchester, Manchester, UK. 11Independent scholar, London,UK. 12Mathematical, Physical and Life Sciences (MPLS)Doctoral Training Centre, University of Oxford, Oxford, UK.13Faculty of Public Health and Policy, London School ofHygiene and Tropical Medicine, London, UK. 14Department ofHealth Policy, London School of Economics and PoliticalScience, London, UK. 15Faculty of Economics, University ofCambridge, Cambridge, UK. 16Engineering Department,University of Cambridge, Cambridge, UK. 17Tufts Initiative forthe Forecasting and Modeling of Infectious Diseases, TuftsUniversity, Boston, MA, USA. 18Medical Research Council(MRC) Centre for Global Infectious Disease Analysis, Schoolof Public Health, Imperial College London, London, UK.19Abdul Latif Jameel Institute for Disease and EmergencyAnalytics (J-IDEA), School of Public Health, Imperial CollegeLondon, London, UK.*Corresponding author. Email: [email protected] (J.M.B.);[email protected] (S.M.); [email protected] (M.S.) †These authors contributed equally to this work.‡These authors contributed equally to this work.

on August 19, 2021

http://science.sciencem

ag.org/D

ownloaded from

Page 3: Inferring the effectiveness of government While ... - Science · Brauner et al., Science 371, eabd9338 (2021) 19 February 2021 1of8 1Oxford Applied and Theoretical Machine Learning

allowed us to directly model many sourcesof uncertainty, such as uncertain epidemio-logical parameters, differences in NPI effec-tiveness between countries, unknown changesin testing and infection fatality rates, and theeffect of unobserved influences on Rt. Thecode is available online (14).

Effectiveness of individual NPIs

Our model enabled us to estimate the indi-vidual effectiveness of each NPI, expressed asa percentage reduction in Rt. We quantifieduncertainty with Bayesian prediction inter-vals, which are wider than standard credibleintervals. Bayesian prediction intervals reflectdifferences in NPI effectiveness across coun-

tries among several other sources of uncer-tainty. They are analogous to the standarddeviation of the effectiveness across coun-tries rather than the standard error of themean effectiveness. Under the default modelsettings, the percentage reduction in Rt (with95% prediction interval; Fig. 2) associated witheach NPI was as follows: limiting gatherings to1000 people or less: 23% (0 to 40%); limit-ing gatherings to 100 people or less: 34%(12 to 52%); limiting gatherings to 10 peopleor less: 42% (17 to 60%); closing some high-risk face-to-face businesses: 18% (−8 to 40%);closing most nonessential face-to-face busi-nesses: 27% (−3 to 49%); closing both schoolsand universities in conjunction: 38% (16 to

54%); and issuing stay-at-home orders (addi-tional effect on top of all other NPIs): 13%(−5 to 31%). Note that we were not able torobustly disentangle the individual effectsof closing only schools or only universities,because these NPIs were implemented onthe same day or in close succession in mostcountries [except Iceland and Sweden, whereonly universities were closed (see also fig.S21)]. We thus reported “schools and uni-versities closed” as one NPI.Some NPIs frequently co-occurred, i.e., were

partly collinear. However, we were able toisolate the effects of individual NPIs, becausethe collinearity was imperfect and our data-set large. For every pair of NPIs, we observed

Brauner et al., Science 371, eabd9338 (2021) 19 February 2021 2 of 8

Fig. 1. Timing of NPI implementations in early 2020. Crossed-out icons signify when an NPI was lifted. Detailed definitions of the NPIs are given in Table 1.

RESEARCH | RESEARCH ARTICLEon A

ugust 19, 2021

http://science.sciencemag.org/

Dow

nloaded from

Page 4: Inferring the effectiveness of government While ... - Science · Brauner et al., Science 371, eabd9338 (2021) 19 February 2021 1of8 1Oxford Applied and Theoretical Machine Learning

one without the other for 504 days across allcountries (country-days) on average (table S5).Theminimumnumber of country-days for anyNPI pair is 148 (for limiting gatherings to 1000or 100 attendees). Additionally, under exces-sive collinearity, and insufficient data to over-come it, individual effectiveness estimateswouldbe highly sensitive to variations in the data andmodel parameters (15). Indeed, high sensitivityprevented Flaxman et al. (1), who had a smallerdataset, from disentangling NPI effects (9). Incontrast, our effectiveness estimates are sub-stantially less sensitive (see below). Finally, theposterior correlations between the effectivenessestimates are weak, further suggesting manage-able collinearity (fig. S22).

Effectiveness of NPI combinations

Although the correlations between the indi-vidual estimates were weak, we took theminto account when evaluating combined NPIeffectiveness. For example, if two NPIs fre-quently co-occur, theremay bemore certaintyabout the combined effectiveness than aboutthe effectiveness of each NPI individually.Figure 3 shows the combined effectiveness ofthe sets of NPIs that are most common in ourdata. In combination, the NPIs in this studyreduced Rt by 77% (67 to 85%). Across coun-tries, the mean Rt without any NPIs (i.e., theR0) was 3.3 (table S4). Starting from this num-ber, the estimated Rt likely could have beenbrought below 1 by closing schools and univer-sities, closing high-risk businesses, and limitinggathering sizes to atmost 10 people. Readers can

interactively explore the effects of sets of NPIswith our online mitigation calculator (16). Acomma-separated value file containing thejoint effectiveness of all NPI combinations isavailable online (14).

Sensitivity and validation

We performed a range of validation and sen-sitivity experiments (figs. S2 to S19). First, weanalyzed how the model extrapolated to coun-tries that did not contribute data for fitting themodel, and we found that it could generatecalibrated forecasts for up to 2 months, withuncertainty increasing over time. Multiplesensitivity analyses showed how the resultschanged when we modified the priors overepidemiological parameters, excluded countriesfrom the dataset, used only deaths or confirmedcases as observations, varied the data prepro-cessing, and more. Finally, we tested our keyassumptions by showing results for severalalternative models [structural sensitivity (10)]and examined possible confounding of our es-timates by unobserved factors influencing Rt.In total, we considered NPI effectiveness un-der 206 alternative experimental conditions(Fig. 4A). Compared with the results obtainedunder our default settings (Figs. 2 and 3),median NPI effectiveness varied under alter-native plausible experimental conditions. How-ever, the trends in the results are robust, andsomeNPIs outperformed others under all testedconditions. Although we tested large rangesof plausible values, our experiments did notinclude every possible source of uncertainty.

We categorized NPI effects into small, mod-erate, and large, whichwe define as a posteriormedian reduction inRt of <17.5%, between 17.5and 35%, and >35%, respectively (vertical linesin Fig. 4). Four of the NPIs fell into the samecategory across a large fraction of experimen-tal conditions: closing both schools and uni-versities was associated with a large effect in96% of experimental conditions, and limitinggatherings to 10 people or less had a large ef-fect in 99% of conditions. Closing most non-essential businesses had a moderate effect in98% of conditions. Issuing stay-at-home orders(that is, in addition to the other NPIs) fell intothe “small effect” category in 96% of experi-mental conditions. Three NPIs fell less clearlyinto one category: Limiting gatherings to 1000people or less had a small-to-moderate effect(moderate in 81% of conditions) while limit-ing gatherings to 100 people or less had amoderate-to-large effect (moderate in 66% ofconditions). Finally, closing some high-riskbusinesses, including bars, restaurants, andnightclubs, had a small-to-moderate effect(moderate in58%of conditions). Limitinggather-ings to 1000 people or less was the NPI with thehighest variation in median effectiveness acrossthe experimental conditions (Fig. 4A),whichmayreflect this NPI’s partial collinearity with limitinggatherings to 100 people or less.Aggregating all sensitivity analyses can hide

sensitivity to specific assumptions. We displaythe median NPI effects in four categories ofsensitivity analyses (Fig. 4, B to E), and eachindividual sensitivity analysis is shown in the

Brauner et al., Science 371, eabd9338 (2021) 19 February 2021 3 of 8

Table 1. NPIs included in the study.

NPI Description

Gatherings limited to1000 people or less

A country has set a size limit on gatherings. The limit is at most 1000 people(often less), and gatherings above the maximum size are disallowed.

For example, a ban on gatherings of 500 people or more would be classified as“gatherings limited to 1000 or less,” but a ban on gatherings of 2000 people or more would not.

.. .. ... ... .. ... .. ... ... .. ... ... .. ... .. ... ... .. ... ... .. ... ... .. ... .. ... ... .. ... ... .. ... .. ... ... .. ... ... .. ... ... .. ... .. ... ... .. ... ... .. ... .. ... ... .. ... ... .. ... ... .. ... .. ... ... .. ... ... .. ... .. ... ... .. ... ... .. ... ... .. ... .. ... ... .. ... ... .. ... .. ... ... .. ... ... .. ... ... .. ... .. ... ... .. ... ... .. ... .. ... ... .. ... ... .. ... ... .. ... .. ... ... .. ... ... .

Gatherings limited to100 people or less

A country has set a size limit on gatherings. The limit is at most 100 people (often less)... .. ... ... .. ... .. ... ... .. ... ... .. ... .. ... ... .. ... ... .. ... ... .. ... .. ... ... .. ... ... .. ... .. ... ... .. ... ... .. ... ... .. ... .. ... ... .. ... ... .. ... .. ... ... .. ... ... .. ... ... .. ... .. ... ... .. ... ... .. ... .. ... ... .. ... ... .. ... ... .. ... .. ... ... .. ... ... .. ... .. ... ... .. ... ... .. ... ... .. ... .. ... ... .. ... ... .. ... .. ... ... .. ... ... .. ... ... .. ... .. ... ... .. ... ... .

Gatherings limited to10 people or less

A country has set a size limit on gatherings. The limit is at most 10 people (often less)... .. ... ... .. ... .. ... ... .. ... ... .. ... .. ... ... .. ... ... .. ... ... .. ... .. ... ... .. ... ... .. ... .. ... ... .. ... ... .. ... ... .. ... .. ... ... .. ... ... .. ... .. ... ... .. ... ... .. ... ... .. ... .. ... ... .. ... ... .. ... .. ... ... .. ... ... .. ... ... .. ... .. ... ... .. ... ... .. ... .. ... ... .. ... ... .. ... ... .. ... .. ... ... .. ... ... .. ... .. ... ... .. ... ... .. ... ... .. ... .. ... ... .. ... ... .

Some businesses closed

A country has specified a few kinds of face-to-face businesses that areconsidered high risk and need to suspend operations (blacklist).

Common examples are restaurants, bars, nightclubs, cinemas, and gyms.By default, businesses are not suspended.

.. .. ... ... .. ... .. ... ... .. ... ... .. ... .. ... ... .. ... ... .. ... ... .. ... .. ... ... .. ... ... .. ... .. ... ... .. ... ... .. ... ... .. ... .. ... ... .. ... ... .. ... .. ... ... .. ... ... .. ... ... .. ... .. ... ... .. ... ... .. ... .. ... ... .. ... ... .. ... ... .. ... .. ... ... .. ... ... .. ... .. ... ... .. ... ... .. ... ... .. ... .. ... ... .. ... ... .. ... .. ... ... .. ... ... .. ... ... .. ... .. ... ... .. ... ... .

Most nonessentialbusinesses closed

A country has suspended the operations of many face-to-face businesses.By default, face-to-face businesses are suspended unless they are designated as essential (whitelist).

.. .. ... ... .. ... .. ... ... .. ... ... .. ... .. ... ... .. ... ... .. ... ... .. ... .. ... ... .. ... ... .. ... .. ... ... .. ... ... .. ... ... .. ... .. ... ... .. ... ... .. ... .. ... ... .. ... ... .. ... ... .. ... .. ... ... .. ... ... .. ... .. ... ... .. ... ... .. ... ... .. ... .. ... ... .. ... ... .. ... .. ... ... .. ... ... .. ... ... .. ... .. ... ... .. ... ... .. ... .. ... ... .. ... ... .. ... ... .. ... .. ... ... .. ... ... .

Schools closed A country has closed most or all schools... .. ... ... .. ... .. ... ... .. ... ... .. ... .. ... ... .. ... ... .. ... ... .. ... .. ... ... .. ... ... .. ... .. ... ... .. ... ... .. ... ... .. ... .. ... ... .. ... ... .. ... .. ... ... .. ... ... .. ... ... .. ... .. ... ... .. ... ... .. ... .. ... ... .. ... ... .. ... ... .. ... .. ... ... .. ... ... .. ... .. ... ... .. ... ... .. ... ... .. ... .. ... ... .. ... ... .. ... .. ... ... .. ... ... .. ... ... .. ... .. ... ... .. ... ... .

Universities closed A country has closed most or all universities and higher-education facilities... .. ... ... .. ... .. ... ... .. ... ... .. ... .. ... ... .. ... ... .. ... ... .. ... .. ... ... .. ... ... .. ... .. ... ... .. ... ... .. ... ... .. ... .. ... ... .. ... ... .. ... .. ... ... .. ... ... .. ... ... .. ... .. ... ... .. ... ... .. ... .. ... ... .. ... ... .. ... ... .. ... .. ... ... .. ... ... .. ... .. ... ... .. ... ... .. ... ... .. ... .. ... ... .. ... ... .. ... .. ... ... .. ... ... .. ... ... .. ... .. ... ... .. ... ... .

Stay-at-home order

An order for the general public to stay at home has been issued. This is mandatory, not just arecommendation. Exemptions are usually granted for certain purposes (such as shopping,exercise, or going to work) or, more rarely, for certain times of the day. Whenever countries

in our dataset introduced stay-at-home orders, they essentially always also implemented, or alreadyhad in place, all other NPIs listed in this table. All these are encoded as distinct NPIs in the data. In our results,

we thus estimate the additional effect of a stay-at-home order on top of all other NPIs... .. ... ... .. ... .. ... ... .. ... ... .. ... .. ... ... .. ... ... .. ... ... .. ... .. ... ... .. ... ... .. ... .. ... ... .. ... ... .. ... ... .. ... .. ... ... .. ... ... .. ... .. ... ... .. ... ... .. ... ... .. ... .. ... ... .. ... ... .. ... .. ... ... .. ... ... .. ... ... .. ... .. ... ... .. ... ... .. ... .. ... ... .. ... ... .. ... ... .. ... .. ... ... .. ... ... .. ... .. ... ... .. ... ... .. ... ... .. ... .. ... ... .. ... ... .

RESEARCH | RESEARCH ARTICLEon A

ugust 19, 2021

http://science.sciencemag.org/

Dow

nloaded from

Page 5: Inferring the effectiveness of government While ... - Science · Brauner et al., Science 371, eabd9338 (2021) 19 February 2021 1of8 1Oxford Applied and Theoretical Machine Learning

supplementary materials. The trends in theresults are also stable within these categories.

Discussion

We used a data-driven approach to estimatethe effects that seven nonpharmaceutical in-terventions had on COVID-19 transmission in41 countries between January and the end ofMay 2020. We found that several NPIs wereassociated with a clear reduction in Rt, in linewith mounting evidence that NPIs are effec-tive at mitigating and suppressing outbreaksof COVID-19. Furthermore, our results indicatethat some NPIs outperformed others. Whilethe exact effectiveness estimates vary withmodeling assumptions, the broad conclusionsdiscussed below are largely robust across206 experimental conditions in 11 sensitivityanalyses.Business closures and gathering bans both

seem to have been effective at reducingCOVID-19 transmission. Closing most non-essential face-to-face businesseswas only some-what more effective than targeted closures,which only affected businesses with high in-fection risk, such as bars, restaurants, andnightclubs (see also Table 1). Therefore, tar-geted business closures can be a promisingpolicy option in some circumstances. Limit-ing gatherings to 10 people or less was moreeffective than limits of up to 100 or 1000people and had amore robust effect estimate.Note that our estimates are derived fromdatabetween January and May 2020, a periodwhen most gatherings were likely indoorsowing to the weather.Whenever countries in our dataset intro-

duced stay-at-home orders, they essentiallyalways also implemented, or already had inplace, all other NPIs in this study. We ac-counted for these other NPIs separately andisolated the effect of ordering the populationto stay at home, in addition to the effect of allother NPIs. In accordance with other studiesthat took this approach (2, 6), we found thatissuing a stay-at-home order had a small effectwhen a country had already closed educationalinstitutions and nonessential businesses andhadbanned gatherings. In contrast, Flaxman et al.(1) and Hsiang et al. (3) included the effect ofseveralNPIs in the effectiveness of their stay-at-home order (or “lockdown”) NPIs and accord-ingly found a large effect for this NPI. Ourfinding suggests that some countries may havebeen able to reduce Rt to <1 without a stay-at-home order (Fig. 3) by issuing other NPIs.We found a large effect for closing both

schools and universities in conjunction, whichwas remarkably robust across different modelstructures, variations in the data, and epide-miological assumptions (Fig. 4). This effectremained robust when controlling for NPIsexcluded fromour study (fig. S9). Our approachcannot distinguish direct effects on transmission

Brauner et al., Science 371, eabd9338 (2021) 19 February 2021 4 of 8

Fig. 2. NPI effectiveness under default model settings. Posterior percentage reductions in Rt withmedian, 50%, and 95% prediction intervals shown. Prediction intervals reflect many sources of uncertainty,including NPI effectiveness varying by country and uncertainty in epidemiological parameters. A negative1% reduction refers to a 1% increase in Rt. “Schools and universities closed” shows the joint effect of closingboth schools and universities; the individual effect of closing just one will be smaller (see text). Cumulativeeffects are shown for hierarchical NPIs (gathering bans and business closures), that is, the result for “Mostnonessential businesses closed” shows the cumulative effect of two NPIs with separate parameters andicons—closing some (high-risk) businesses, and additionally closing most remaining (non-high-risk butnonessential) businesses given that some businesses are already closed.

A

B

Fig. 3. Combined NPI effectiveness for the 15 most commonly implemented sets of NPIs in our data.Black and gray bars denote 50% and 95% Bayesian prediction intervals, respectively. (A) Predicted Rt afterimplementation of each set of NPIs, assuming R0 = 3.3. (B) Maximum R0 that can be reduced to Rt below 1 bycommon sets of NPIs. Readers can interactively explore the effects of all sets of NPIs, while setting R0 andadjusting NPI effectiveness to local circumstances, with our online mitigation calculator (16).

RESEARCH | RESEARCH ARTICLEon A

ugust 19, 2021

http://science.sciencemag.org/

Dow

nloaded from

Page 6: Inferring the effectiveness of government While ... - Science · Brauner et al., Science 371, eabd9338 (2021) 19 February 2021 1of8 1Oxford Applied and Theoretical Machine Learning

in schools and universities from indirect effects,such as the general population behaving morecautiously after school closures signaled thegravity of the pandemic. Additionally, becauseschool and university closures were imple-mented on the same day or in close successionin most of the countries we studied, our ap-proach cannot distinguish their individual ef-fects (fig. S21). This limitation likely also holdsfor other observational studies that do not in-clude data on university closures and estimateonly the effect of school closures (1–3, 5–8).Furthermore, our study does not provide evi-dence on the effect of closing preschools andnurseries.

Previous evidence on the role of pupils andstudents in transmission is mixed. Althoughinfected young people (~12 to 25 years of age)are often asymptomatic, they appear to shedsimilar amounts of virus as older people(17, 18) and might therefore infect higher-risk individuals. Early data suggested thatchildren and young adults had a notably lowerobserved incidence rate than older adults—whether this was due to school and univer-sity closures remains unknown (19–22). Incontrast, the recent resurgence of cases inEuropean countries has been concentratedin the age group corresponding to secondaryschool and higher education (especially the

latter) and is now spreading to older agegroups as well as primary school–aged children(23, 24). Primary schools may be generally lessaffected than secondary schools (20, 25–28),perhaps partly because children under the ageof 12 are less susceptible to SARS-CoV-2 (29).Our study has several limitations. (i) NPI

effectiveness may depend on the context ofimplementation, such as the presence of otherNPIs, country demographics, and specific im-plementation details. Our results thus need tobe interpreted as indicating the effectivenessin the contexts in which the NPI was imple-mented in our data (10). For example, in acountry with a comparatively old population,the effectiveness of closing schools and uni-versities would likely have been on the lowerend of our prediction interval. Expert judg-ment should thus be used to adjust our esti-mates to local circumstances. (ii) Rt may havebeen reduced by unobserved NPIs or volun-tary behavior changes such as mask-wearing.To investigate whether the effect of these po-tential confounders could be falsely attributedto the observed NPIs, we performed severaladditional analyses and found that our resultsare stable to a range of unobserved factors(fig. S9). However, this sensitivity check can-not provide certainty, and investigating therole of unobserved factors is an importanttopic to explore further. (iii) Our results can-not be used without qualification to predictthe effect of lifting NPIs. For example, closingschools and universities in conjunction seemsto have greatly reduced transmission, but thisdoes not mean that reopening them will nec-essarily cause infections to soar. Educationalinstitutions can implement safety measures,such as reduced class sizes, as they reopen.However, the nearly 40,000 confirmed casesassociated with universities in the UnitedKingdom since they reopened in September2020 show that educational institutions maystill play a large role in transmission, despitesafetymeasures (30). (iv)We do not have dataon some promising interventions, such as test-ing, tracing, and case isolation. These inter-ventions could become an important part of acost-effective epidemic response (31), but wedid not include them because it is difficult toobtain comprehensive data on their imple-mentation. In addition, although the data aremore readily available, it is difficult to estimatethe effect of mask-wearing in public spacesbecause therewas limited public life as a resultof other NPIs. We discuss further limitationsin supplementary text section E.Although our work focused on estimating the

impact of NPIs on the reproduction number Rt,the ultimate goal of governments may be toreduce the incidence, prevalence, and excessmortality of COVID-19. For this, controllingRt is essential, but the contribution of NPIstoward these goals may also be mediated by

Brauner et al., Science 371, eabd9338 (2021) 19 February 2021 5 of 8

A

B C D E

Fig. 4. Median NPI effectiveness across the sensitivity analyses. (A) Median NPI effectiveness(reduction in Rt) when varying different components of the model or the data in 206 experimental conditions.Results are displayed as violin plots, using kernel density estimation to create the distributions. Insidethe violins, the box plots show median and interquartile range. The vertical lines mark 0, 17.5, and 35%(see text). (B to E) Categorized sensitivity analyses. (B) Sensitivity to model structure. Using only cases oronly deaths as observations (two experimental conditions; fig. S7); varying the model structure (threeconditions; fig. S8, left). (C) Sensitivity to data and preprocessing. Leaving out countries from the dataset(42 conditions; figs. S5 and S21); varying the threshold below which cases and deaths are masked (eightconditions; fig. S13); sensitivity to correcting for undocumented cases and to country-level differences in caseascertainment (two conditions; fig. S6). (D) Sensitivity to epidemiological parameters. Jointly varying themeans of the priors over the means of the generation interval, the infection-to-case-confirmation delay, andthe infection-to-death delay (125 conditions; fig. S10); varying the prior over R0 (four conditions; fig. S11);varying the prior over NPI effect parameters (three conditions; fig. S11); varying the prior over the degree towhich NPI effects vary across countries (three conditions; fig. S12). (E) Sensitivity to unobserved factorsinfluencing Rt. Excluding observed NPIs one at a time (eight conditions; fig. S9); controlling for additionalNPIs from a different dataset (six conditions; fig. S9).

RESEARCH | RESEARCH ARTICLEon A

ugust 19, 2021

http://science.sciencemag.org/

Dow

nloaded from

Page 7: Inferring the effectiveness of government While ... - Science · Brauner et al., Science 371, eabd9338 (2021) 19 February 2021 1of8 1Oxford Applied and Theoretical Machine Learning

other factors, such as their duration and timing(32), periodicity and adherence (33, 34), andsuccessful containment (35). While each ofthese factors addresses transmission withinindividual countries, it can be crucial to alsosynchronize NPIs between countries, giventhat cases can be imported (36).Many governments around the world seek

to keep Rt below 1 while minimizing thesocial and economic costs of their interven-tions. Our work offers insights into whichareas of public life are most in need of viruscontainment measures so that activities cancontinue as the pandemic develops; however,our estimates should not be taken as the finalword on NPI effectiveness.

Materials and methodsDataset

We analyzed the effects of NPIs (Table 1) in41 countries (37) (Fig. 1). We recorded NPIimplementations when the measures wereimplemented nationally or in most regionsof a country (affecting at least three-fourthsof the population). We recorded only manda-

tory restrictions, not recommendations. Sup-plementary text section G details how edgecases in the data collection were handled. Foreach country, the window of analysis startson 22 January and ends either after the firstlifting of an NPI or on 30May 2020, whichevercame first. The reason to end the analysis afterthe first major reopening (38) was to avoid adistribution shift. For example, when schoolsreopened, it was often with safety measures,such as smaller class sizes and distancing rules.It is therefore expected that contact patterns inschools will have been different before schoolclosure comparedwith after reopening.Model-ing this difference explicitly is left for futurework. Data on confirmed COVID-19 cases anddeaths were taken from the Johns HopkinsCSSE COVID-19 Dataset (13). The data usedin this study, including sources, are availableonline (14).

Data collection

We collected data on the start and end datesof NPI implementations, from the start of thepandemic until 30 May 2020. Before collect-

ing the data, we experimented with severalpublic NPI datasets, finding that they werenot complete enough for our modeling andcontained incorrect dates (39). By focusingon a smaller set of countries and NPIs thanthese datasets, we were able to enforce strongquality controls: We used independent doubleentry and manually compared our data withpublic datasets for cross-checking.First, two authors independently researched

each country and entered the NPI data intoseparate spreadsheets. The researchers manu-ally researched the dates using internetsearches: There was no automatic componentin the data-gathering process. The averagetime spent researching each country was1.5 hours per researcher. Next, the researchersindependently compared their entries againsttwo public datasets, the Epidemic ForecastingGlobal NPI (EFGNPI) Database (40) and theOxfordCOVID-19GovernmentResponseTracker(41), and, if there were conflicts, visited all pri-mary sources to resolve the conflicts. After that,each country andNPIwas again independentlyentered by one to three paid contractors, whowere provided with a detailed description ofthe NPIs and asked to include primary sourceswith their data. A researcher then resolved anyconflicts between this data and one (but notboth) of the spreadsheets. Finally, the two inde-pendent spreadsheets were combined and allconflicts resolved by a researcher. The finaldataset contains primary sources (govern-ment websites and/or media articles) foreach entry.

Data preprocessing

When the case count is small, a large fractionof casesmay be imported from other countriesand the testing regime may change rapidly.To prevent this from biasing our model, weneglected case numbers before a country hadreached 100 confirmed cases and fatality num-bers before a country had reached 10 deaths.We included these thresholds in our sensitivityanalysis (fig. S13).

Brief model description

In this section, we give a short summary of themodel (Fig. 5). The detailed model descriptionis given in supplementary text section A.Briefly, our model uses case and death datafrom each country to “backward” infer thenumber of new infections at each point intime, which is itself used to infer the reproduc-tion numbers. NPI effects are then estimatedby relating the daily reproduction numbers tothe active NPIs, across all days and countries.This relatively simple, data-driven approachallowed us to sidestep assumptions about con-tact patterns and intensity, infectiousness ofdifferent age groups, and so forth that aretypically required in modeling studies. Codeis available online (14).

Brauner et al., Science 371, eabd9338 (2021) 19 February 2021 6 of 8

Fig. 5. Model overview. Unshaded, white nodes are observed. From bottom to top: The mean effectparameter of NPI i is ai, and the country-specific effect parameter is ai,c. On each day t, a country’s dailyreproduction number Rt,c depends on the country’s basic reproduction number R0,c and the active NPIs.The active NPIs are encoded by xi,t,c, which is 1 if NPI i is active in country c at time t, and 0 otherwise. Rt,c istransformed into the daily growth rate gt,c using the generation interval parameters and subsequently is usedto compute the new infections NðCÞ

t;c and NðDÞt;c that will subsequently become confirmed cases and deaths,

respectively. Finally, the expected numbers of daily confirmed cases yðCÞt;c and deaths yðDÞt;c are computed using

discrete convolutions of Nð:Þt;c with the relevant delay distributions. Our model uses both case and death

data; it splits all nodes above the daily growth rate gt,c into separate branches for deaths and confirmedcases. We account for uncertainty in the generation interval, infection-to–case confirmation delay, and theinfection-to-death delay by placing priors over the parameters of these distributions.

RESEARCH | RESEARCH ARTICLEon A

ugust 19, 2021

http://science.sciencemag.org/

Dow

nloaded from

Page 8: Inferring the effectiveness of government While ... - Science · Brauner et al., Science 371, eabd9338 (2021) 19 February 2021 1of8 1Oxford Applied and Theoretical Machine Learning

Our model builds on the semimechanisticBayesian hierarchical model of Flaxman et al.(1), with several additions. First, we allow ourmodel to observe both case and death data.This increases the amount of data from whichwe can extract NPI effects, reduces distinctbiases in case and death reporting, and re-duces the bias from including only countrieswith many deaths. Second, since epidemiolog-ical parameters are only known with uncer-tainty, we place priors over them, followingrecent recommended practice (42). Third, aswe do not aim to infer the total number ofCOVID-19 infections, we can avoid assuminga specific infection fatality rate (IFR) or ascer-tainment rate (rate of testing). Fourth, weallow the effects of all NPIs to vary acrosscountries, reflecting differences in NPI im-plementation and adherence.We now describe the model by going

through Fig. 5 from bottom to top. The growthof the epidemic is determined by the time- andcountry-specific reproduction number Rt,c,which depends on (i) the (unobserved) basicreproduction number in country c, R0,c, and(ii) the active NPIs at time t. R0,c accounts forall time-invariant factors that affect transmis-sion in country c, such as differences in demo-graphics, population density, culture, and healthsystems (43).Following Flaxman et al. and others (1, 6, 8),

each NPI is assumed to independently affectRt,c as a multiplicative factor

Rt;c ¼ R0;c

YIi¼1

expð�ai;cxi;t;cÞ

where xi,t,c = 1 indicates that NPI i is active incountry c on day t (xi,t,c = 0 otherwise), I is thenumber of NPIs, and ai,c is the effect param-eter for NPI i in country c. The multiplicativeeffect encodes the plausible assumption thatNPIs have a smaller absolute effect whenRt,c isalready low.We assume that the effect of each NPI on

Rt,c is stable across time but can vary acrosscountries to some degree. Concretely, the ef-fect parameter of intervention i in country c isdefined as ai,c = ai + zi,c, where ai representsthe mean effect parameter, and zi;ceNð0; s2i Þ.The variance si

2 corresponds to the degree ofcross-country variation in the effectivenessof NPI i and is inferred from the data. Thispartial pooling of NPI effect parameters min-imizes bias from country-specific sourceswhile also reflecting that NPI effectivenessis likely different across countries. We definethe effectiveness of NPI i as the percentagereduction in Rt associated with NPI i acrosscountries. This effectiveness, displayed in Figs.2 to 4, is computed as 1 – exp(–(ai + zi)), whereagain zieNð0; s2i Þ and si

2 is drawn from itsposterior. We place an asymmetric Laplaceprior on ai that allows for both positive and

negative effects but places 80% of its proba-bility mass on positive effects, reflecting thatNPIs are more likely to reduce Rt,c than toincrease it.In the early phase of an epidemic, the num-

ber of new daily infections grows exponen-tially. During exponential growth, there is aone-to-one correspondence between the dailygrowth rate and Rt,c (44). The correspondencedepends on the generation interval (the timebetween successive infections in a chain oftransmission), which we assume to have agamma distribution. The prior on the meangeneration interval has a mean of 5.06 days,derived from a meta-analysis (45).We model the daily new infection count

separately for confirmed cases and deaths,representing those infections that are sub-sequently reported and those that are sub-sequently fatal. However, both infectionnumbers are assumed to grow at the samedaily rate in expectation, allowing the use ofboth data sources to estimate each ai. Theinfection numbers translate into reportedconfirmed cases and deaths after a delay. Thedelay is the sum of two independent distri-butions, assumed to be equal across countries:the incubation period and the delay fromonset of symptoms to confirmation. We putpriors over the means of both distributions,resulting in a prior over the mean infection-to-confirmation delay with amean of 10.92 days(45) (see supplementary text section A.3). Sim-ilarly, the infection-to-death delay is the sum ofthe incubation period and the delay from onsetof symptoms to death, and the prior over itsmean has amean of 21.8 days (45). Finally, as inrelated models (1, 6), both the reported casesand deaths follow a negative binomial outputdistribution with separate inferred dispersionparameters for cases and deaths.Using aMarkov chainMonte Carlo (MCMC)

sampling algorithm (46), this model infersposterior distributions of each NPI’s effec-tiveness while accounting for cross-countryvariations in effectiveness, reporting, and fatal-ity rates aswell as uncertainty in the generationinterval and delay distributions. To analyze theextent to which modeling assumptions affectthe results, our sensitivity analysis includedall epidemiological parameters, prior distri-butions, and many of the structural assump-tions introduced above. MCMC convergencestatistics are shown in fig. S19.

REFERENCES AND NOTES

1. S. Flaxman et al., Estimating the effects of non-pharmaceuticalinterventions on COVID-19 in Europe. Nature 584, 257–261(2020). doi: 10.1038/s41586-020-2405-7; pmid: 32512579

2. J. Dehning et al., Inferring change points in the spread ofCOVID-19 reveals the effectiveness of interventions.Science 369, eabb9789 (2020). doi: 10.1126/science.abb9789;pmid: 32414780

3. S. Hsiang et al., The effect of large-scale anti-contagion policieson the COVID-19 pandemic. Nature 584, 262–267 (2020).doi: 10.1038/s41586-020-2404-8; pmid: 32512578

4. S. Lai et al., Effect of non-pharmaceutical interventions tocontain COVID-19 in China. Nature 585, 410–413 (2020).doi: 10.1038/s41586-020-2293-x; pmid: 32365354

5. Y. Liu, C. Morgenstern, J. Kelly, R. Lowe, CMMIDCOVID-19 Working Group, M. Jit, The impact of non-pharmaceutical interventions on SARS-CoV-2transmission across 130 countries and territories. medRxiv2020.08.11.20172643v1 [Preprint]. 12 August 2020.https://doi.org/10.1101/2020.08.11.20172643.

6. N. Banholzer et al., Impact of non-pharmaceutical interventionson documented cases of COVID-19. medRxiv2020.04.16.20062141v3 [Preprint]. 28 April 2020.https://doi.org/10.1101/2020.04.16.20062141.

7. N. Islam et al., Physical distancing interventions and incidenceof coronavirus disease 2019: Natural experiment in 149countries. BMJ 370, m2743 (2020). doi: 10.1136/bmj.m2743;pmid: 32669358

8. X. Chen, Z. Qiu, Scenario analysis of non-pharmaceuticalinterventions on global COVID-19 transmissions. CovidEconomics 7, 46–67 (2020).

9. K. Soltesz et al., On the sensitivity of non-pharmaceuticalintervention models for SARS-CoV-2 spread estimation.medRxiv 2020.06.10.20127324 [Preprint]. 12 June 2020.https://doi.org/10.1101/2020.06.10.20127324.

10. M. Sharma et al., How robust are the estimated effects ofnonpharmaceutical interventions against COVID-19?arXiv:2007.13454 [stat.AP] (27 July 2020).

11. C. Cheng, J. Barceló, A. S. Hartnett, R. Kubinec,L. Messerschmidt, COVID-19 government responseevent dataset (CoronaNet v. 1.0). Nat. Hum. Behav. 4,756–768 (2020). doi: 10.1038/s41562-020-0909-7;pmid: 32576982

12. Oxford COVID-19 Government Response Tracker (OxCGRT)(2020); https://github.com/OxCGRT/covid-policy-tracker.

13. E. Dong, H. Du, L. Gardner, An interactive web-baseddashboard to track COVID-19 in real time. Lancet Infect. Dis.20, 533–534 (2020). doi: 10.1016/S1473-3099(20)30120-1;pmid: 32087114

14. J. M. Brauner et al., epidemics/COVIDNPIs: Inferring theeffectiveness of government interventions against COVID-19,Zenodo (2020); https://doi.org/10.5281/zenodo.4268449.

15. C. Winship, B. Western, Multicollinearity and modelmisspecification. Sociol. Sci. 3, 627–649 (2016). doi: 10.15195/v3.a27

16. EpidemicForecasting.org, Mitigation calculator (2020);http://epidemicforecasting.org/calc.

17. T. C. Jones et al., An analysis of SARS-CoV-2 viral load bypatient age. medRxiv 2020.06.08.20125484v1 [Preprint].9 June 2020. https://doi.org/10.1101/2020.06.08.20125484.

18. A. G. L’Huillier, G. Torriani, F. Pigny, L. Kaiser, I. Eckerle,Culture-competent SARS-CoV-2 in nasopharynx ofsymptomatic neonates, children, and adolescents. Emerg.Infect. Dis. 26, 2494–2497 (2020). doi: 10.3201/eid2610.202403; pmid: 32603290

19. The Independent Scientific Advisory Group for Emergencies(SAGE), The Independent SAGE Report 3, “When should aschool reopen? Final report” (2020); www.independentsage.org/wp-content/uploads/2020/06/Independent-Sage-Brief-Report-on-Schools.pdf.

20. Y. J. Park et al., Contact tracing during coronavirus diseaseoutbreak, South Korea, 2020. Emerg. Infect. Dis. 26, 2465–2468(2020). doi: 10.3201/eid2610.201315; pmid: 32673193

21. N. S. Mehta et al., SARS-CoV-2 (COVID-19): What do we knowabout children? A systematic review. Clin. Infect. Dis. 71,2469–2479 (2020). doi: 10.1093/cid/ciaa556;pmid: 32392337

22. P. Zimmermann, N. Curtis, Coronavirus infections in childrenincluding COVID-19: An overview of the epidemiology, clinicalfeatures, diagnosis, treatment and prevention options inchildren. Pediatr. Infect. Dis. J. 39, 355–368 (2020).doi: 10.1097/INF.0000000000002660; pmid: 32310621

23. Office for National Statistics, Coronavirus (COVID-19) InfectionSurvey, UK: 6 November 2020 (2020); www.ons.gov.uk/peoplepopulationandcommunity/healthandsocialcare/conditionsanddiseases/bulletins/coronaviruscovid19infectionsurveypilot/6november2020.

24. A. Aleta, Y. Moreno, Age differential analysis of COVID-19second wave in Europe reveals highest incidence amongstyoung adults. medRxiv 2020.11.11.20230177 [Preprint].13 November 2020. https://doi.org/10.1101/2020.11.11.20230177.

25. M. Levinson, M. Cevik, M. Lipsitch, Reopening primary schoolsduring the pandemic. N. Engl. J. Med. 383, 981–985 (2020).doi: 10.1056/NEJMms2024920; pmid: 32726550

Brauner et al., Science 371, eabd9338 (2021) 19 February 2021 7 of 8

RESEARCH | RESEARCH ARTICLEon A

ugust 19, 2021

http://science.sciencemag.org/

Dow

nloaded from

Page 9: Inferring the effectiveness of government While ... - Science · Brauner et al., Science 371, eabd9338 (2021) 19 February 2021 1of8 1Oxford Applied and Theoretical Machine Learning

26. J.Couzin-Frankel, G.Vogel, M.Weiland, School openings acrossglobe suggest ways to keep coronavirus at bay, despiteoutbreaks. Science 10.1126/science.abd7107 (2020).doi: 10.1126/science.abd7107

27. A. Fontanet et al., Cluster of COVID-19 in northern France:A retrospective closed cohort study. medRxiv2020.04.18.20071134 [Preprint]. 23 April 2020. https://doi.org/10.1101/2020.04.18.20071134.

28. C. Stein-Zamir et al., A large COVID-19 outbreak in a highschool 10 days after schools’ reopening, Israel, May 2020. EuroSurveill. 25, 2001352 (2020). doi: 10.2807/1560-7917.ES.2020.25.29.2001352; pmid: 32720636

29. K. Sun et al., Transmission heterogeneities, kinetics, andcontrollability of SARS-CoV-2. Science 10.1126/science.abe2424 (2020). doi: 10.1126/science.abe2424;pmid: 33234698

30. University and College Union, COVID-19 case dashboard(2020); www.ucu.org.uk/covid-dashboard.

31. T. Colbourn et al., Modelling the health and economic impactsof population-wide testing, contact tracing and isolation (PTTI)strategies for COVID-19 in the UK. SSRN 10.2139/ssrn.3627273(2020). doi: 10.2139/ssrn.3627273

32. K. Prem et al., The effect of control strategies to reduce socialmixing on outcomes of the COVID-19 epidemic in Wuhan,China: A modelling study. Lancet Public Health 5, e261–e270(2020). doi: 10.1016/S2468-2667(20)30073-6;pmid: 32220655

33. P. G. T. Walker et al., The impact of COVID-19 and strategiesfor mitigation and suppression in low- and middle-incomecountries. Science 369, 413–422 (2020). doi: 10.1126/science.abc0035; pmid: 32532802

34. N. G. Davies et al., Effects of non-pharmaceutical interventionson COVID-19 cases, deaths, and demand for hospital servicesin the UK: A modelling study. Lancet Public Health 5,e375–e385 (2020). doi: 10.1016/S2468-2667(20)30133-X;pmid: 32502389

35. X. Hao et al., Reconstruction of the full transmission dynamicsof COVID-19 in Wuhan. Nature 584, 420–424 (2020).doi: 10.1038/s41586-020-2554-8; pmid: 32674112

36. N. W. Ruktanonchai et al., Assessing the impact of coordinatedCOVID-19 exit strategies across Europe. Science 369,1465–1470 (2020). doi: 10.1126/science.abc5096;pmid: 32680881

37. The countries were selected for the availability of reliable NPI dataat the time when we started data collection and modeling(April 2020) and for their presence in at least one of the publicdatasets that we used to cross-validate our collected data. Weexcluded countries with fewer than 100 cases (or 10 deaths) by31 March, as our model neglects new cases and deaths belowthese thresholds. We also excluded a small number of countries ifthere were credible media reports casting doubt on thetrustworthiness of their reporting of cases and deaths. Finally, weexcluded very large countries such as China, the United States,and Canada for ease of data collection, as these would requiremore locally fine-grained data. Of the 41 included countries,33 are in Europe. As a result, the NPI effectiveness estimates may

be biased toward effects in Europe, and NPI effectiveness mayhave been different in other parts of the world.

38. The window of analysis extended until 2 days after the firstreopening for confirmed cases and 10 days after the first reopeningfor deaths. These durations correspond to the 5% quantiles of theinfection-to–case confirmation and infection-to-death distribu-tions, ensuring that <5% of the new infections on the reopeningday or later were observed in the window of analysis.

39. We evaluated the following datasets: the Oxford COVID-19Government Response Tracker (OxCGRT), the EpidemicForecasting Global NPI Database, and the ACAPS COVID-19GovernmentMeasures Dataset. Note that these datasets are undercontinuous development. Many of the mistakes found will alreadyhave been corrected. We know from our own experience that datacollection can be very challenging. We have the fullest respectfor the individuals behind these datasets. In this paper, we focus ona more limited set of countries and NPIs than these datasetscontain, allowing us to ensure higher data quality in this subset.Given our experience with public datasets and our data collection,we encourage fellow COVID-19 researchers to independently verifythe quality of public data they use, if feasible.

40. EpidemicForecasting.org, Epidemic forecasting global NPIdatabase (2020); http://epidemicforecasting.org/datasets.

41. T. Hale, S. Webster, A. Petherick, T. Phillips, B. Kira, OxfordCOVID-19 Government Response Tracker, Blavatnik School ofGovernment (2020); www.bsg.ox.ac.uk/research/research-projects/coronavirus-government-response-tracker.

42. S. Abbott et al., Estimating the time-varying reproductionnumber of SARS-CoV-2 using national and subnational casecounts. Wellcome Open Res. 5, 112 (2020). doi: 10.12688/wellcomeopenres.16006.2

43. S. Yadav, P. K. Yadav, Basic reproduction rate and case fatalityrate of COVID-19: Application of meta-analysis. medRxiv2020.05.13.20100750v1 [Preprint]. 16 May 2020. https://doi.org/10.1101/2020.05.13.20100750.

44. J. Wallinga, M. Lipsitch, How generation intervals shape therelationship between growth rates and reproductive numbers.Proc. Biol. Sci. 274, 599–604 (2007). doi: 10.1098/rspb.2006.3754; pmid: 17476782

45. E. S. Fonfria et al., Essential epidemiological parameters ofCOVID-19 for clinical and mathematical modeling purposes: Arapid review and meta-analysis. medRxiv2020.06.17.20133587v1 [Preprint]. 19 June 2020.https://doi.org/10.1101/2020.06.17.20133587.

46. M. D. Hoffman, A. Gelman, The No-U-Turn Sampler: Adaptivelysetting path lengths in Hamiltonian Monte Carlo. J. Mach.Learn. Res. 15, 1593–1623 (2014).

ACKNOWLEDGMENTS

We thank J. Lagerros for operational support and for introducingsome of the authors to each other. We thank M. Balatsko,M. Pukaj, and T. Witzany for developing the interactive website. Wethank T. Groemer, G. Krönke, and M. Herrmann for advice andmentorship. Funding: J.M.B. was supported by the EPSRC Centrefor Doctoral Training in Autonomous Intelligent Machines andSystems (EP/S024050/1) and by Cancer Research UK. S.M.’sfunding for graduate studies was from Oxford University and

DeepMind. M.S. was supported by the EPSRC Centre for DoctoralTraining in Autonomous Intelligent Machines and Systems(EP/S024050/1). G.L. was supported by the UKRI Centre forDoctoral Training in Interactive Artificial Intelligence (EP/S022937/1). V.M. contributed in his personal time while employed atDeepMind. L.C. acknowledges funding from the MRC Centre forGlobal Infectious Disease Analysis (MR/R015600/1), jointlyfunded by the U.K. Medical Research Council (MRC) and theU.K. Foreign, Commonwealth and Development Office (FCDO),under the MRC/FCDO Concordat agreement; is part of theEDCTP2 program supported by the European Union; andacknowledges funding by Community Jameel. Y.W.T. is alsoa principal research scientist at DeepMind. The paid contractorwork helping with the data collection, the development of theinteractive website, and the costs for cloud computing werefunded by the Berkeley Existential Risk Initiative. Authorcontributions: D.J., J.M.B., J.K., G.A., A.J.N., J.T.M., G.L., andV.M. designed and conducted the NPI data collection. S.M.,M.S., J.M.B., A.B.S., H.G., Y.W.T., Y.G., J.K., T.G., J.S., V.M.,M.A.H., and L.C. designed the model and modeling experiments.M.S., A.B.S., T.G., and J.S. performed and analyzed the modelingexperiments. J.M.B., S.M., M.S., J.K., and T.G. conceived ofthe research. S.M., M.S., J.M.B., L.C., J.K., and T.B. did theliterature search. J.M.B., S.M., M.S., G.L., L.C., T.B., and V.M.wrote the manuscript. All authors read and gave feedback onthe manuscript and approved the final manuscript. J.M.B.,S.M., and M.S. contributed equally. L.C., Y.G., and J.K.contributed equally to senior authorship. Competing interests:No conflicts of interests. L.C. has acted as a paid consultant toPfizer and the Foundation for Innovative New Diagnostics,outside of the submitted work. Y.G. has received a researchgrant (studentship) from GlaxoSmithKline, outside of thesubmitted work. J.K. has advised several governmental andnongovernmental entities about interventions against COVID-19.Data and materials availability: All data and code are availablein the paper or publicly online at (14). This work is licensedunder a Creative Commons Attribution 4.0 International (CC BY4.0) license, which permits unrestricted use, distribution, andreproduction in any medium, provided the original work isproperly cited. To view a copy of this license, visit https://creativecommons.org/licenses/by/4.0/. This license does notapply to figures/photos/artwork or other content included inthe article that is credited to a third party; obtain authorizationfrom the rights holder before using such material.

SUPPLEMENTARY MATERIALS

science.sciencemag.org/content/371/6531/eabd9338/suppl/DC1Supplementary TextFigs. S1 to S24Tables S1 to S8References (47–79)MDAR Reproducibility Checklist

21 July 2020; resubmitted 25 September 2020Accepted 8 December 2020Published online 15 December 202010.1126/science.abd9338

Brauner et al., Science 371, eabd9338 (2021) 19 February 2021 8 of 8

RESEARCH | RESEARCH ARTICLEon A

ugust 19, 2021

http://science.sciencemag.org/

Dow

nloaded from

Page 10: Inferring the effectiveness of government While ... - Science · Brauner et al., Science 371, eabd9338 (2021) 19 February 2021 1of8 1Oxford Applied and Theoretical Machine Learning

Inferring the effectiveness of government interventions against COVID-19

Besiroglu, Hong Ge, Meghan A. Hartwick, Yee Whye Teh, Leonid Chindelevitch, Yarin Gal and Jan KulveitStephenson, Gavin Leech, George Altman, Vladimir Mikulik, Alexander John Norman, Joshua Teperowski Monrad, Tamay Jan M. Brauner, Sören Mindermann, Mrinank Sharma, David Johnston, John Salvatier, Tomás Gavenciak, Anna B.

originally published online December 15, 2020DOI: 10.1126/science.abd9338 (6531), eabd9338.371Science 

, this issue p. eabd9338Sciencemodest effect in slowing transmission.businesses, and closing schools and universities were each more effective than stay-at-home orders, which were of curtailing transmission during the early pandemic. Limiting gatherings to fewer than 10 people, closing high-exposurecountries as input to a model to identify the individual nonpharmaceutical interventions that were the most effective at

amassed and curated data from 41et al.accumulated to discern the effectiveness of individual interventions. Brauner countries by imposing combinations of nonpharmaceutical interventions. Sufficient data on transmission have now

Early in 2020, severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) transmission was curbed in manyHow to hold down transmission

ARTICLE TOOLS http://science.sciencemag.org/content/371/6531/eabd9338

MATERIALSSUPPLEMENTARY http://science.sciencemag.org/content/suppl/2020/12/15/science.abd9338.DC1

CONTENTRELATED

http://stm.sciencemag.org/content/scitransmed/12/549/eabb9401.fullhttp://stm.sciencemag.org/content/scitransmed/12/573/eabe2555.fullhttp://stm.sciencemag.org/content/scitransmed/12/568/eabe0948.full

REFERENCES

http://science.sciencemag.org/content/371/6531/eabd9338#BIBLThis article cites 57 articles, 11 of which you can access for free

PERMISSIONS http://www.sciencemag.org/help/reprints-and-permissions

Terms of ServiceUse of this article is subject to the

is a registered trademark of AAAS.ScienceScience, 1200 New York Avenue NW, Washington, DC 20005. The title (print ISSN 0036-8075; online ISSN 1095-9203) is published by the American Association for the Advancement ofScience

Science. No claim to original U.S. Government WorksCopyright © 2020 The Authors, some rights reserved; exclusive licensee American Association for the Advancement of

on August 19, 2021

http://science.sciencem

ag.org/D

ownloaded from